Open Data and imagination
Two very interesting questions about open data and government from Bernard Jenkin MP in the context of threats to open data:
Q129 Chair: How much is your anxiety about a failure of imagination, rather than a failure of leadership?
Q112 Chair: Do you think there is a role for the UK Statistics Authority in stipulating how there should be a routine publication schedule of data sets, so that they come out like GDP data or unemployment data and then industry and other users have predictability, which they do not have at the moment?
But the biggest threats we’ve seen so far come not from too little imagination in Government, but from perverse incentives that push for more imagination in support of narrow interests.
Tom Steinberg was completely right that Treasury blocked a number of open data releases simply because they weren’t politically or financially convenient, even if they were a good idea under other policies. That’s what Treasury does. Other departments have different objectives, and seek to use open data as a tool to achieve them.
Why did HMRC recently run a consultation on opening up the list of valid VAT registration numbers to the credit reference agencies (only)? Because someone at a credit reference agency had a chat with an HMRC commissioner, and made an open data (but not too open data) argument. That took imagination. It’s not a good idea, but it’s not lacking in imagination.
Last year, the Department for Education wanted to open access to the National Pupil Database, which holds records on every child in the country, linked from pre-school to university. It ran a consultation with the primary aim of ensuring that even more information can be shared with even more people for purposes “not necessarily related to educational achievement”, including the media and companies selling “data based products and services”. That wasn’t a failure of imagination on the part of a DfE Special Advisor; that was an excess of it. It was creative thinking to serve a political goal, by staff who are political appointees employed to do just that.
How easy is it to get wrong? The NHS announces a programme of centralisation of every citizen’s medical record, and the Wellcome Trust tweets welcoming the “opening up” of data.
The Committee heard evidence that the agenda is pushed mostly by a few Ministers and staff who truly believe in it, and heard concern that such momentum can easily be lost. But it can equally be hijacked, and have unintended consequences.
The current head of Patients and Information at the NHS used to be Director of Transparency at the Cabinet Office, responsible for Open Data. It’s little wonder that the NHS is driving forwards with releasing more open data, but is doing so via a data grab of the medical records of every individual in the country, to produce that data.
To quote Tom Steinberg, speaking in 2010, “Just one bad error here, will be so much bigger than all the pro-open data stories”. I suspect building a single database of everyone’s health interactions so you can produce open data may not be the best strategy, but it’s more publicly palatable than the other places the data will go: “such organisations may include research bodies, information intermediaries, companies, charities, and others” and “think-tanks”.
So the question discussed in Q112, of how this should be resolved and what the process should be, needs to cover rejecting data requests as well as enabling them. There is currently no process for that and, as various recent conversations have proved, there should be. Current processes don’t work.
On a related point, there are many terms that mean different things. I’m not entirely sure that Stephan Shakespeare quite knew the context of the terms when he spoke about the NHS safe-havens. They’re not technology but physical spaces, which in some cases can simply be the area around a fax machine on a shelf:
Q120: Stephan Shakespeare: We have what is called safe-haven technology, which means you can make data available in a way that you cannot take it out of the box, if you like, and you can access it remotely without removing it from the database.
All true, but that’s not quite what the words mean in the context in which he was discussing them.