The following is my reaction to today’s announcement from the Speaker on the availability of bill XML in bulk from the Government Printing Office. It’s adapted from the email I sent to Nick Judd for his article on the data. The part about institutionalizing transparency was really Daniel Schuman’s idea — sorry I didn’t attribute that! [Update: Also see Alex Howard's article.]
What we’re seeing with the bills bulk data project is how the wave of culture change is moving through government. Over the last two years the House Republican leadership has embraced open government in many ways (my 112th Congress recap | the new House floor feed). With this bills XML project, we’re seeing more legislative support agencies getting involved in how the House does open government.
This isn’t a technical feat by any means, but it is a cultural feat. The House and GPO worked together to institutionalize a new way for the House to publish bulk data.
Because of the way Data.gov is managed in the executive branch, we’ve become accustomed to big announcements. The bills bulk data project and the other recent projects show that the House is taking a different approach, an incremental approach, to open government data: publish early and often, gather feedback, then go on to bigger projects. This is something open government advocates have been asking for.
As I mentioned, the tech side itself is not much. They took files that they and the Library of Congress already make available (in some sense already in bulk) and packaged them into ZIP files: 4 now, but that will probably grow to 16 by the end of the Congress. So there’s no new data here. The files involved in this project have the text of legislation but not bill status, which is what the bulk legislative data advocates have been asking for. But it’s on the road to that.
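For anyone who wants to see what is inside one of these ZIP files, here is a minimal Python sketch, not anything GPO provides. It assumes you have already downloaded an archive yourself and that each .xml entry in it is a well-formed document. The function name `summarize_bill_zip` and the file names in the demo are made up for illustration, and the demo builds a tiny stand-in archive in memory rather than using real bill files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_bill_zip(zip_source):
    """Count the XML files in a ZIP archive, grouped by root element name.

    zip_source may be a path to a downloaded ZIP file or a file-like object.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # Skip directories and anything that is not an XML file.
            if not name.lower().endswith(".xml"):
                continue
            with archive.open(name) as fh:
                root = ET.parse(fh).getroot()
            counts[root.tag] += 1
    return counts


# Demo with a made-up, in-memory archive (hypothetical file names and content).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as z:
    z.writestr("example-bill.xml", "<bill><legis-body/></bill>")
    z.writestr("example-resolution.xml", "<resolution/>")
demo.seek(0)
print(summarize_bill_zip(demo))
```

To run it against a real download, pass the local path of the ZIP file instead of the in-memory buffer. Note that this only inventories bill text; nothing in these files would tell you a bill's status.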
One crucial thing is missing from this: there is no feedback loop with the users of this data. The incremental approach can’t work unless the users of the data have a way to tell GPO what is and is not working. There is no public point of contact for these files, and I don’t even know a private point of contact at GPO.
But that doesn’t detract from the fact that this is a good step forward.