<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Joshua Tauberer's Blog</title>
	<atom:link href="http://razor.occams.info/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://razor.occams.info/blog</link>
	<description></description>
	<lastBuildDate>Fri, 10 May 2013 19:49:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.5</generator>
		<item>
		<title>New Open Data Memorandum almost defines open data, misses mark with open licenses</title>
		<link>http://razor.occams.info/blog/2013/05/09/new-open-data-memorandum-almost-defines-open-data-misses-mark-with-open-licenses/</link>
		<comments>http://razor.occams.info/blog/2013/05/09/new-open-data-memorandum-almost-defines-open-data-misses-mark-with-open-licenses/#comments</comments>
		<pubDate>Thu, 09 May 2013 17:12:38 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=604</guid>
		<description><![CDATA[TL;DR: The new E.O. and memorandum are good for transparency and lock in almost all of the generally accepted notions of open government data. But it misses the mark on the requirement of &#8220;open licenses.&#8221; With an executive order and a new Memorandum on Open Data Policy today, the focus on entrepreneurship remained at the forefront [...]]]></description>
			<content:encoded><![CDATA[<p><em>TL;DR: The new E.O. and memorandum are good for transparency and lock in almost all of the generally accepted notions of open government data. But it misses the mark on the requirement of &#8220;open licenses.&#8221;</em></p>
<p>With an <a href="http://www.whitehouse.gov/the-press-office/2013/05/09/executive-order-making-open-and-machine-readable-new-default-government-">executive order</a> and a new <a href="http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-13.pdf">Memorandum on Open Data Policy</a> today, the focus on entrepreneurship remained at the forefront of federal data policy. This focus began with last year&#8217;s Digital Government Strategy, and these days weather data and GPS signals are the examples of choice. That said, the policies set in the new memorandum are quite good for the classic use of this data (transparency, accountability, and civic education) even if &#8220;transparency&#8221; is only barely mentioned in passing.</p>
<p><em>Defining Open Data: How well does it do?</em></p>
<p>This new Open Data Memorandum presents the most detailed definition to date of “open data” by the federal government. It includes many of the principles that our community has reached consensus on, but it gets one severely wrong.</p>
<p>As <a href="http://razor.occams.info/blog/2009/12/09/open-government-directive-evaluation-on-principles/">I wrote many years ago</a>, the 2009 Open Government Directive itself already adopted some of the principles of open government data including: online, primary, timely, public input, and public review. It also added two principles of its own: being pro-active about data release and creating accountability by designating an official responsible for data quality.</p>
<p>Comparing to <a href="http://opengovdata.io/">my list of open government data principles in my book</a>, the new memorandum’s definition of open data covers:</p>
<ul>
<li>Principle 1: Information should be online (to quote the Memorandum: “retrieved, downloaded”)</li>
<li>Principle 2: Primary (the Memorandum even uses language from the 8 Principles; interestingly the memorandum places this under the heading of “Complete,” which was a different principle from the original 8 Principles).</li>
<li>Principle 3: Timely.</li>
<li>Principle 4: Accessible (the Memorandum repeats the language from the 8 Principles, “available to the widest range of users for the widest range of purposes” and the use of “multiple formats” where necessary, and for documentation says the data should be “described”).</li>
<li>Principles 5 and 10: Analyzable (“machine readable”).</li>
<li>Principle 6: Non-discriminatory</li>
<li>Principle 7: Non-proprietary (open) data formats</li>
<li>Principle 14: Public review (“A point of contact must be designated to assist with data use and to respond to complaints about adherence to these open data requirements.”)</li>
</ul>
<p>Its definition also states that open data has a presumption of openness. (Principles 2-7 and 14 are from the <a href="http://www.opengovdata.org/">8 Principles of Open Government Data</a>. Principle 1 is from the Sunlight Foundation.)</p>
<p>Elsewhere in the memorandum it addresses:</p>
<ul>
<li>Principle 13: Public input (“engage with customers” for prioritizing what data should be made available and how to make it available)</li>
<li>Principle 15: Interagency coordination (“interoperability”)</li>
</ul>
<p>It also asks agencies to create data catalogs to include datasets “that can be made publicly available but have not yet been released” at agency.gov/data URLs. And it says agencies must consider the needs of open data at all stages of the information collection lifecycle. In other words, data should be collected in such a way as to promote public dissemination of open data later on.</p>
<p>The Memorandum misses the principle that data should be license-free. That is a core principle, and omitting it is a grave mistake. It also misses the peripheral principles of permanence, the use of safe file formats, and practices of provenance and trust (e.g. digital signatures). (These last two are <a href="http://www.acm.org/public-policy/open-government">ACM principles</a>.)</p>
<p><em>&#8220;Open licenses&#8221; presume access is closed by default!</em></p>
<p>Rather than requiring open data to be license-free, which was a core part of the <a href="http://www.opengovdata.org/">8 Principles of Open Government Data</a>, it instead promotes the use of “open licenses.” This is a subtle but important distinction. Licenses presume data rights. Open licenses, including open source licenses and Creative Commons licenses, create limited privileges in a world where the default is closed. These licenses create possibilities of use that do not exist in the absence of the license because copyright law, or other law, creates an initial state of closedness.</p>
<p>Most open licenses only grant some privileges but not others, and some privileges come along with new requirements. The GPL and Creative Commons Attribution License, for instance, rely on copyright law so that restrictions on data use intended by the open license (GPL’s virality clause, or the restriction that users must attribute the work to the author) are enforceable in court.</p>
<p>Federal government data is not typically subject to copyright law, and in this case a license is not needed for the data to be open. Thus the application of a license suggests a change from the open-by-default state of this data to a closed-by-default state where a license is required to open it up. While the memorandum requires “an open license that places no restrictions on their [the dataset's] use,” the term “open license” is typically understood to presume a default closed state. This policy opens the door (so to speak) to agencies applying licenses (i.e. new contractual agreements) to data that serve only to restrict use.</p>
<p>Federal government data not subject to copyright cannot be free if a license is applied. The license-free principle of the original 8 Principles says open government data cannot be limited in this way.</p>
<p>When data may be subject to copyright protection (copyright law is murky and there are many gray areas), or when copyright law definitely applies (such as to documents produced originally by federal government contractors), then a public domain dedication such as the <a href="http://creativecommons.org/publicdomain/zero/1.0/">Creative Commons CC0</a> statement or the Open Data Commons Public Domain Dedication and License (PDDL) (both of which combine a waiver and a license) is appropriate. A public domain dedication differs from an open license in that it disclaims copyright and other protections, whereas, again, an open license implies that such a limitation on use is already present. The CC0 statement was successfully used by the Council of the District of Columbia to disclaim copyright over data files containing the DC Code.</p>
<p><em>What&#8217;s the definition used for?</em></p>
<p>While the definition of open data is otherwise quite strong, the definition is used just once in the whole memorandum. The memorandum does not mandate that government data be open data under its definition, at least as far as I could see. The only use of the open data definition is in its request for agencies to create roles for staff to ensure data released to the public are open. That is, staff should promote open data, but open data itself is not required.</p>
<p>Although the definition itself is not used much, there are independent provisions that repeat some of the same principles. Agencies must use &#8220;machine-readable and open formats,&#8221; existing standards, and metadata. And information collection should be done in a way to support information dissemination: &#8220;[A]gencies must design new information collection and creation efforts so that the information collected or created supports downstream interoperability between information systems and dissemination of information to the public.&#8221;</p>
<p>It also requires the use of open licenses:</p>
<blockquote><p>&#8220;Agencies must apply open licenses, in consultation with the best practices found in Project Open Data, to information as it is collected or created so that if data are made public there are no restrictions on copying, publishing, distributing, transmitting, adapting, or otherwise using the information for non-commercial or for commercial purposes.&#8221;</p></blockquote>
<p>As I mentioned, federal-government-created data needs no license to be open, although the memorandum implies that all agency data should have an open license. (That&#8217;s either legally impossible or it means something unusual.) For other data, it appears that the memorandum intends to create a public-domain-like state. But it is qualified, for contracts may only use &#8220;existing clauses&#8221; (i.e. standard contract terms already approved by OMB) to implement terms of open licensing. Looking over those terms, I don&#8217;t see the necessary legal framework to do it. And a nearby footnote confusingly says that a data user who modifies the data &#8220;is responsible for&#8221; describing the change. Does that mean an &#8220;open license&#8221; can require users to describe modifications? The qualifications make it very difficult to know what an acceptable implementation of open licensing looks like.</p>
<p><em>Conclusion</em></p>
<p>While the goals of the Memorandum in defining open data and using open licenses are laudable, the implementation does not meet the 8 Principles&#8217; requirements of open government data, at least under the usual understanding of “open license,” and the use of the definition to promote open data is very limited.</p>
<p>PS. As Derek Willis <a href="https://twitter.com/derekwillis/status/332496392054513665">points out</a> over Twitter, the &#8220;mosaic effect&#8221; paragraphs in the memorandum are also somewhat concerning. The mosaic effect is hard to quantify and therefore difficult to limit, and this creates a big hole for keeping government data out of public reach.</p>
<p>UPDATE 5/10/2013 #1:</p>
<p>Rufus Pollock points out that the Open Data Commons Public Domain Dedication and License (PDDL) is similar to CC0 and would also be appropriate. I agree.</p>
<p>Eric Mill notes that for data already in the public domain, the Creative Commons Public Domain Mark, which is basically an icon/badge, would be appropriate. Agencies should definitely mark public domain data as such.</p>
<p>UPDATE 5/10/2013 #2:</p>
<p>I added a few paragraphs to the section now called &#8220;<em>What&#8217;s the definition used for?&#8221;.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2013/05/09/new-open-data-memorandum-almost-defines-open-data-misses-mark-with-open-licenses/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DC opens its &#8220;code&#8221;, embracing principles of open laws</title>
		<link>http://razor.occams.info/blog/2013/04/04/dc-opens-its-code-embracing-principles-of-open-laws/</link>
		<comments>http://razor.occams.info/blog/2013/04/04/dc-opens-its-code-embracing-principles-of-open-laws/#comments</comments>
		<pubDate>Fri, 05 Apr 2013 01:55:12 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=590</guid>
		<description><![CDATA[This morning DC&#8217;s legal code went online as open data. I&#8217;ve worked with government before on open data, but never have I worked with a government body that moved so deftly through the technical, policy, and legal issues as the DC Council&#8217;s Office of the General Counsel. So, before anything else, thanks to the general [...]]]></description>
			<content:encoded><![CDATA[<p>This morning DC&#8217;s legal code <a href="http://dccouncil.us/UnofficialDCCode">went online as open data</a>. I&#8217;ve worked with government before on open data, but never have I worked with a government body that moved so deftly through the technical, policy, and legal issues as the DC Council&#8217;s Office of the General Counsel. So, before anything else, thanks to the general counsel V. David Zvenyach and his staff for their time and expertise on this.</p>
<p>The TL;DR version goes like this:</p>
<p><a href="http://macwright.org/">Tom MacWright</a> wanted to build his own version of the DC Code website. The DC Council couldn&#8217;t share its electronic copy of the Code because it contained intellectual property owned by West. This became a small and very geeky controversy (spurred by Carl Malamud). But Zvenyach &#8212; the general counsel &#8212; recognized the value of making the law open and did it. He removed the West IP from their electronic copy of the Code (I helped), posted the file on the Council&#8217;s website, and even included a <a href="http://creativecommons.org/publicdomain/zero/1.0/">CC0</a> public domain dedication.</p>
<p>The last bit all happened within a matter of days, and it was one of the easiest open data success stories I&#8217;ve been a part of. Tom recapped the events <a href="http://macwright.org/2013/04/04/the-open-code.html">here</a> and began <a href="https://github.com/openlawdc">hacking the code</a> immediately. He held a hackathon on April 14, which he wrote about <a href="http://macwright.org/2013/04/16/dc-code-hackathon.html">here</a> (and Eric Mill wrote about <a href="http://sunlightfoundation.com/blog/2013/04/15/what-happens-when-you-open-the-dc-code/">here</a>).</p>
<p>DC is setting an example for other jurisdictions. In terms of the <a href="https://law.resource.org/index.law.gov.html">10 Principles of Law.Gov</a>, DC&#8217;s bulk law download &#8212; achieved within only a few days of work &#8212; satisfies principles of no-charge to access (1), no copyright or terms of use (2), data in bulk (3), and, to some extent, machine processability (8).</p>
<p>Here&#8217;s the longer version:</p>
<p>This all began a few months ago when DC-based civic hacker <a href="http://macwright.org/">Tom MacWright</a> took an interest in making local law more accessible. Intending to import the DC Code into Waldo Jaquith&#8217;s <a href="http://www.statedecoded.com/">State Decoded</a> project, he ran into a small problem: he couldn&#8217;t get a complete copy of the law. Intellectual property issues prevented the DC Council from simply emailing over their copy of the Code.</p>
<p>Many states, like the District, contract out the codification and code-publishing work to a third party like West (owned by the Canadian-owned Thomson Reuters) or Lexis (owned by the Amsterdam-based Reed Elsevier). DC had previously contracted out to West, and last year switched to Lexis. Neither likes to share. DC&#8217;s official website to read the Code &#8212; which has been run by West &#8212; is free to the public, but copying any part of the Code off of that website might violate West&#8217;s copyright or terms of service, or both. Sharing the law might have been illegal.</p>
<p>In the case here in DC, the DC Council had Word documents containing the Code, given to them by their contractor West, but the documents contained West&#8217;s logo. The DC Council could not share the documents with West&#8217;s logo intact. And it wasn&#8217;t easy to take those logos out (more on that later). Informally speaking, West owned the DC Code.</p>
<p>I had met Zvenyach, the general counsel, before. He is very technologically savvy and has been trying to modernize the office he took over only a few years ago. We had even talked about holding a hackathon to help him do it. (As a DC resident, I&#8217;m also interested in DC law.)  But his office, like all of government, is bound by limited resources and much work to do. When Tom brought the issue onto Zvenyach&#8217;s radar, I don&#8217;t believe there was any point at which Zvenyach didn&#8217;t want to make the files available. It was, as far as I&#8217;ve observed, merely a matter of time and resources.</p>
<div>Tom wrote more about the intellectual property issues <a href="http://macwright.org/2013/02/13/the-code-compiled.html">here</a> and <a href="http://macwright.org/2013/02/14/the-law-is-public-domain.html">here</a>. Coincidentally, on Monday Ed Walters of Fast Case gave <a href="http://reinventlawchannel.com/ed-walters-who-owns-the-law/">a great talk</a> on the issue of who owns the law at Reinvent the Law &#8212; I highly recommend watching it. He&#8217;s also <a href="http://blog.law.cornell.edu/voxpop/2011/07/15/tear-down-this-paywall/">written extensively about it</a>.</div>
<p>Tom asked Carl Malamud to get involved. Carl has <a href="https://law.resource.org/index.html">been working on this issue</a> in other states, like in Oregon, where the State of Oregon itself <a href="https://public.resource.org/oregon.gov/index.html">claimed copyright</a> over their laws. Carl bought (for quite some money) a physical copy of the DC Code, digitized it, and <a href="https://twitter.com/waldojaquith/status/317000772007116801">mailed thumb drives</a> in the shape of famous presidents containing the digitized code to various important people. This was a spin on a tactic that Carl began in the 1990s when he opened the SEC&#8217;s corporate filings data: get the data online, pressure the government to put the data online themselves, and then help the government take over that responsibility.</p>
<p>The media and bloggers caught on, beginning I think with <a href="http://boingboing.net/2013/03/27/municipal-codes-of-dc-free-fo.html">Cory Doctorow</a> on March 27, followed by <a href="http://dcist.com/2013/03/sitting_in_front_of_me.php">DCist</a> on March 28, <a href="http://www.washingtontimes.com/news/2013/mar/31/ignorance-of-dcs-copyrighted-laws-can-be-costly/">The Washington Times</a> on March 31, <a href="https://freedom-to-tinker.com/blog/sjs/the-district-of-columbia-claims-copyright-on-the-law/">Steve Schultze</a> on April 1, and <a href="http://thinkprogress.org/justice/2013/04/03/1808201/dc-code-malamud-copyright/">Think Progress</a> on April 3. The files themselves went up on April 4, so little more than a week from the first media blog post about it, and the decision to put the files up with a CC0 license was made in any case some days earlier. It really did not take much pressure at all. (Tom also wrote a post on <a href="http://greatergreaterwashington.org/post/18132/dcs-laws-arent-yours/">Greater Greater Washington</a> on March 19.)</p>
<p>Carl had noticed early on that the <a href="https://twitter.com/carlmalamud/status/311159259746410497">DC Council asserted copyright over the Code</a>. Some of the media reports focused on that. As Zvenyach explained in The Washington Times article, the rationale was to protect DC from West, by making sure West could not claim copyright over the same Code, not to limit access to the law. Whether or not <a href="https://blogs.law.harvard.edu/infolaw/2008/04/16/can-states-copyright-their-statutes/">state codes can be copyrighted</a> was mostly beside the point, and the focus on this issue turned out to be a red herring. It was resolved quickly with the choice of the Creative Commons <a href="http://creativecommons.org/publicdomain/zero/1.0/">CC0</a>, a public domain dedication.</p>
<p>I went in to Zvenyach&#8217;s office on April 3 to help them take West&#8217;s logo out of the Word documents. There was one document per title of the Code, or about 50 documents, many in the 50-megabyte size range. The West logo was in the header, but the header was specified independently for each section of the code, so in reality there were thousands of logos to take out. We also took out a DC copyright line from the documents, which was also repeated in each section.  It took about 4 hours for Microsoft Word to process all of the files, and 1 hour for us to figure out how to do it so &#8220;quickly.&#8221;</p>
<p>When I left Zvenyach&#8217;s office that evening, Zvenyach pointed out the presidential thumb drive still sitting on his desk that he received from Carl &#8212; unfortunately I forget if it was a little George Washington or a little Abraham Lincoln. I have a feeling that thumb drive will be around for a while.</p>
<p>Now, there is a bigger issue here. There&#8217;s no plan for updating the public files. DC&#8217;s contract with Lexis going forward doesn&#8217;t require Lexis to provide DC with an electronic copy of the code. Perhaps after this they&#8217;ll refuse to do so. But we&#8217;ll tackle this another time.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2013/04/04/dc-opens-its-code-embracing-principles-of-open-laws/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Public Comment to the House Appropriations Legislative Branch Subcommittee for FY2014</title>
		<link>http://razor.occams.info/blog/2013/03/18/public-comment-to-the-house-appropriations-legislative-branch-subcommittee-for-fy2014/</link>
		<comments>http://razor.occams.info/blog/2013/03/18/public-comment-to-the-house-appropriations-legislative-branch-subcommittee-for-fy2014/#comments</comments>
		<pubDate>Mon, 18 Mar 2013 18:14:00 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=586</guid>
		<description><![CDATA[I will be submitting the following public comment to the House Committee on Appropriations Subcommittee on the Legislative Branch regarding Public Access to Legislative Information. &#8212;&#8211; I write to urge the subcommittee to expand funding for legislative transparency. I am the president of Civic Impulse LLC, which operates the free legislative tracking service GovTrack.us. Our [...]]]></description>
			<content:encoded><![CDATA[<p><em>I will be submitting the following public comment to the House Committee on Appropriations Subcommittee on the Legislative Branch regarding Public Access to Legislative Information.</em></p>
<p>&#8212;&#8211;</p>
<p>I write to urge the subcommittee to expand funding for legislative transparency.</p>
<p>I am the president of <a href="http://www.civicimpulse.com">Civic Impulse LLC</a>, which operates the free legislative tracking service <a href="http://www.govtrack.us">GovTrack.us</a>. Our website has become an authoritative source for legislative information:</p>
<ul>
<li>More citizens turn to GovTrack.us for information about the status of legislation than the Library of Congress (LOC)’s <a href="http://thomas.loc.gov">THOMAS</a> and <a href="http://beta.congress.gov">Congress.gov</a> websites. [See <a href="http://www.compete.com/us/">compete.com</a>.]</li>
<li>Hundreds of House and Senate staff use GovTrack.us each day.</li>
<li>More than 70 congressmen use GovTrack services to display congressional district maps and their voting record on their official website.</li>
</ul>
<p>Why is this? GovTrack.us has become the de facto authoritative source for legislative information because the Congress does not publish enough “bulk legislative data.” In 2004 we stepped in to fill the vacuum created by the lack of information coming from the Congress. It is long past due for the House to correct this problem.</p>
<p>When the Committee <a href="http://www.govtrack.us/blog/2012/06/07/rep-crenshaw-backs-down-loses-control-over-bulk-data-issue/">released a draft report</a> last year indicating it intended to have legislative branch agencies publish <em>less</em> bulk data, The Washington Post picked up on the story and wrote:</p>
<blockquote><p>“At Congress’s ’90s-vintage archive site, there’s no way to compare bills side by side. No tool to measure the success rate of a bill’s sponsor. And there’s certainly no way to leave a comment. Congress makes it hard for outside sites to do any of this, either, by refusing to give out bulk data on its bills in a user-friendly form.” (“<a href="http://www.washingtonpost.com/politics/congressional-data-may-soon-be-easier-to-use-online/2012/06/08/gJQAdikBNV_story_1.html">Congressional data may soon be easier to use online</a>,” The Washington Post, June 8, 2012.)</p></blockquote>
<p>Soon after, the Speaker and Majority Leader formed the “<a href="http://sunlightfoundation.com/blog/2012/06/06/major-transparency-milestone-in-bulk-access-statement/">Bulk Data Task Force.</a>” Since the formation of the task force, new bulk data projects have been completed at the Government Printing Office (GPO) including <a href="http://razor.occams.info/blog/2013/01/10/on-the-new-bulk-bill-xml-from-gpo/">bulk bill text</a> and at the House Clerk (<a href="http://docs.house.gov">committee schedules</a> and documents and <a href="http://clerk.house.gov/floorsummary/floor-download.aspx">bulk floor action data</a>).</p>
<p>“Bulk data” is a core component of any government information dissemination program. The House Clerk publishes roll call vote results as bulk XML data. In 2009, the Government Printing Office began offering bulk data for bill text, the Federal Register, and other publications. The Office of Law Revision Counsel publishes the United States Code in multiple bulk data formats. Bulk data can be produced at a fraction of the cost of other information dissemination methods, such as colorful websites.</p>
<p>Yet much information about the Congress remains out of public view. There is no public bulk data for the status of legislation (the LOC “BSS” database), amendments, or committee votes. I believe that eventually all official artifacts of the legislative process should be available online, free, in real time, and as structured bulk data. [See <a href="http://razor.occams.info/pubdocs/2012-08-24_bulk_data_recs.pdf">Recommendations to the Bulk Data Task Force</a>.]</p>
<p>And, sadly, proposals for cost-reduction threaten the public’s access to the law itself. A 2013 congressionally-funded report by the National Academy of Public Administration (NAPA) called for the Congress to consider <a href="http://freegovinfo.info/node/3862">charging the public fees to read the law online</a> at GPO’s website. NAPA’s report is severely out of touch. There is no dispute that it is a moral imperative for Congress to fund programs that provide broad access to the law and other parts of the public record.</p>
<p>GovTrack.us is a demonstration that bulk data creates broad public access and that bulk data is also the most cost-effective way to create access. Since 2004, GovTrack.us has reached tens of millions of individuals at a cost of less than $1 million.</p>
<p>The Committee can advance broad public access to legislative information by providing adequate funding for:</p>
<ul>
<li>Publishing the LOC legislative status (“BSS”) database as bulk data. [See <a href="http://razor.occams.info/pubdocs/2012-08-24_bulk_data_recs.pdf">Recommendations to the Bulk Data Task Force</a>.]</li>
<li>Enhancing GPO’s highly successful <a href="http://www.gpo.gov/fdsys/">FDSys</a> system.</li>
<li>Creating bulk data program officers at GPO, LOC, and under the House Clerk.</li>
<li>Evaluating the cost and impact of legislative transparency by an organization that believes in the public’s right to primary legal documents (i.e. not NAPA).</li>
</ul>
<p>Thank you for the opportunity to submit comments on legislative branch appropriations for FY 2014.</p>
<p>Joshua Tauberer</p>
<p>President, Civic Impulse LLC</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2013/03/18/public-comment-to-the-house-appropriations-legislative-branch-subcommittee-for-fy2014/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Open Data Day 2013 Hackathon Recap</title>
		<link>http://razor.occams.info/blog/2013/03/02/open-data-day-2013-hackathon-recap/</link>
		<comments>http://razor.occams.info/blog/2013/03/02/open-data-day-2013-hackathon-recap/#comments</comments>
		<pubDate>Sat, 02 Mar 2013 12:58:39 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=569</guid>
		<description><![CDATA[Last weekend in perhaps as many as 100 cities around the world open data enthusiasts held hackathons. Here in DC we too were celebrating February 23 as International Open Data Day. And it was, dare I say, a great success. Over 150 developers, data scientists, social entrepreneurs, government employees, and other open data enthusiasts participated [...]]]></description>
			<content:encoded><![CDATA[<p>Last weekend in perhaps as many as <a href="http://blog.okfn.org/2013/02/28/wrapping-up-open-data-day-2013-2/">100 cities around the world</a> open data enthusiasts held hackathons. Here in DC we too were celebrating February 23 as <a href="http://www.opendataday.org">International Open Data Day</a>. And it was, dare I say, a great success.</p>
<p>Over <strong>150</strong> developers, data scientists, social entrepreneurs, government employees, and other open data enthusiasts participated in our event, first at a kickoff Friday night at Google&#8217;s DC headquarters and then at the Saturday session at The World Bank. Participants worked on local DC issues, global open source mapping, world poverty, and open government. Here are some quick links:</p>
<blockquote><p><strong>Videos: </strong><a href="http://www.youtube.com/watch?v=e_LcBQuaM1s">One</a> | <a href="http://www.youtube.com/watch?v=Jm2QxXNLSaQ">Two</a> &#8211; <strong>Photos: </strong><a href="http://www.flickr.com/photos/katmandoo/sets/72157632877567408/">One</a> | <a href="http://www.flickr.com/photos/87925482@N08/sets/72157632889737965/">Two</a></p>
<p><a href="http://konklone.com/post/open-data-day-dc-2013">Eric&#8217;s Recap</a> | <a href="http://blogs.worldbank.org/opendata/open-data-community-hacks-dc-and-global-issues">Sam&#8217;s Recap</a> | <a href="http://opendatadaydc.tumblr.com/">Tumblr</a> | <a href="http://storify.com/worldbank/opendataday">Storified Tweets</a></p>
<p><strong>Press coverage is listed at the end.</strong></p></blockquote>
<p>Our approach to the hackathon was a little different than many others. Our goals were to strengthen the open data community, to foster connections between people and between projects, and to emphasize problem statements over prototypes and solutions. There was no beer or pizza at our hackathon, no competitions, and no pressure to produce outputs. Participants came motivated and stayed focused without needing to be treated like brogrammers. This created a positive, welcoming, and highly productive environment.</p>
<p>In the morning Eric Mill (Sunlight Foundation/<a href="https://twitter.com/konklone">@konklone</a>) ran a several-hours-long <strong>tutorial on open data</strong> for about 40 participants. Some were new to coding. Others were project managers (inside and outside of government) who wanted to learn more about what open data is all about from the ground up. Eric walked the participants through exploring APIs through the web browser and using command-line tools to process CSV files &#8212; a very concrete way to explain the benefits of adding structure to data.</p>
<p>Several projects focused on <strong>local DC issues</strong>: mapping <a href="http://bit.ly/13dCJhb">zoning restrictions</a> (<a href="http://bureauphile.wordpress.com/2013/02/24/open-data-day-versus-legal-codes/">more</a>), graphing public and charter <a href="http://i.imgur.com/5qxNdhg.jpg">school enrollment</a> (and <a href="http://imgur.com/SnnApCa">other education data</a>), mapping <a href="http://www.flickr.com/photos/gwirth/sets/72157632841459992/">trees</a> by species, and building a <a href="https://groups.google.com/group/districtcommons/subscribe">database of social service providers</a>.</p>
<p>A large team of map hackers worked on <strong>mapping Kathmandu</strong> in Open Street Map to aid disaster response, and with their collaborators around the world <a href="http://mapbox.com/blog/mapping-kathmandu-stats/">mapped over 7,000 building footprints</a>.</p>
<p><strong>Global poverty and international development</strong> was the focus of several other projects, from building APIs for international development project <a href="https://mcc.demo.socrata.com/dashboard/countries">performance data</a> to <a href="http://datakind.org/2013/02/datadive-fight-poverty-corruption-world-bank/">measuring poverty in real time</a> using Twitter.</p>
<p>The <strong>open government</strong> projects worked on adding <a href="http://namespaces.cato.org/catoxml/">semantic information</a> to legislative documents, comparing legislative documents for <a href="http://stephanis.info/tag/opendataday/">similarity</a>, <a href="https://github.com/dvogel/pacer-recap-citations">extracting legal citations</a>, cataloging our <a href="http://api.demofcracymap.org/#get-involved">government representatives</a> at the local level, and <a href="http://github.com/OpenDataDevOps/minus">building &#8220;devops&#8221; tools</a> for rapid deployment of VMs that might be useful in government or for open data researchers.</p>
<p>And there were other projects that don&#8217;t fit into any of those categories, like building Python tools for creating <strong>education curricula</strong>.</p>
<p>The event was organized by me (Josh Tauberer/GovTrack/<a href="https://twitter.com/JoshData">@JoshData</a>), Eric Mill (Sunlight Foundation/<a href="https://twitter.com/konklone">@konklone</a>), Katherine Townsend (USAID/<a href="https://twitter.com/DiploKat">@DiploKat</a>), Dmitry Kachaev (Presidential Innovation Fellow/Millennium Challenge Corporation/<a href="https://twitter.com/kachok">@kachok</a>), Sam Lee (The World Bank/<a href="https://twitter.com/OpenNotion">@OpenNotion</a>), and Julia Bezgacheva (<a href="https://twitter.com/ulkins">@ulkins</a>/The World Bank).</p>
<p>Thanks to The World Bank especially, and to Google, the participants that helped out with registration in the morning, and to everyone who came!</p>
<p>This was DC&#8217;s second open data day. Our first was on Dec. 3, 2011 and was co-hosted by POPVOX (Josh Tauberer) and Wikimedia DC (Katie Filbert). See what we did on the post-event recap at <a href="https://www.popvox.com/features/opendataday2011">https://www.popvox.com/features/opendataday2011</a>. Participants then worked on improving access to U.S. law, scanning federal spending for anomalies following Benford’s Law, understanding farm subsidy grants, building local transit apps, and keeping Congress accountable. Only about half of the participants were programmers, but everyone found a way to be involved.</p>
<p>It was also DC&#8217;s second international development data day. The last one was held on December 9, 2012 in the lead-up to the Development DataJam hosted by the White House’s Office of Science &amp; Technology. Those events primarily served as ideation jams to bring together issue area experts and data experts to develop new ideas and partner for new solutions. Experts were sought out to inform the discussions, but anyone with an interest in open data in development was welcomed and participated.</p>
<p><em>Press coverage</em></p>
<p>DCist: <a href="http://dcist.com/2013/03/open_data.php">Hack D.C.: Hackers Put Open Data to Use to Help Improve Local Government</a></p>
<p>The Atlantic Cities: <a href="http://www.theatlanticcities.com/neighborhoods/2013/03/there-link-between-walkability-and-local-school-performance/4992/">Is There a Link Between Walkability and Local School Performance?</a></p>
<p>Greater Greater Washington: <a href="http://greatergreaterwashington.org/post/18052/how-school-tiers-match-up-with-walk-score/">How school tiers match up with Walk Score</a></p>
<p>Greater Greater Education: <a href="http://greatergreatereducation.org/post/17992/community-of-civic-hackers-for-education-takes-shape/">Community of civic hackers for education takes shape</a></p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2013/03/02/open-data-day-2013-hackathon-recap/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Would the real hacktivist please stand up?</title>
		<link>http://razor.occams.info/blog/2013/01/18/would-the-real-hacktivist-please-stand-up/</link>
		<comments>http://razor.occams.info/blog/2013/01/18/would-the-real-hacktivist-please-stand-up/#comments</comments>
		<pubDate>Fri, 18 Jan 2013 23:23:54 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=551</guid>
		<description><![CDATA[Professor Peter Ludlow wrote of &#8220;lexical warfare&#8221; over the term &#8220;hacktivist&#8221; in a recent New York Times blog post. Unfortunately the war that Ludlow observed has been over for at least 10-20 years now, and what might have once been a reasonable analysis of the meaning of the word is today simply wrong. Ludlow&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>Professor Peter Ludlow wrote of &#8220;lexical warfare&#8221; over the term &#8220;hacktivist&#8221; in <a href="http://opinionator.blogs.nytimes.com/2013/01/13/what-is-a-hacktivist/">a recent New York Times blog post</a>. Unfortunately the war that Ludlow observed has been over for at least 10-20 years now, and what might have once been a reasonable analysis of the meaning of the word is today simply wrong.</p>
<p><strong>Ludlow&#8217;s Position</strong></p>
<p>Ludlow depicts the war as a tug-of-war between two ends of the spectrum. On the one end is what we generally call cyber crime, the sort of &#8220;hacking&#8221; portrayed in movies. The other end, in Ludlow&#8217;s description, is a &#8220;less sinister&#8221; and more generic activity. An example he gave is putting wool sweaters on trees, whatever that is. Ludlow also indicates that he believes this form of hacktivism has no &#8220;positive affect.&#8221; Ludlow&#8217;s analysis is fundamentally incorrect. There is no spectrum on which a war is occurring. And the other sort of hacktivism most certainly has a positive affect.</p>
<p><strong>The Meanings of Hack (n.)</strong></p>
<p>&#8220;Hack&#8221; has <span style="text-decoration: line-through;">two</span> at least three distinct meanings as a noun. It&#8217;s a homograph. Just like &#8220;mouse&#8221; and &#8220;keyboard&#8221; are (think rodents and pianos). A lot of jargon is like this. And &#8220;gay&#8221;. &#8220;Gay pride&#8221; is not an attempt to tug the definition of &#8220;gay&#8221; away from &#8220;happiness&#8221;. Maybe decades ago it was. It isn&#8217;t today. &#8220;Hack&#8221; is the same way. One meaning is more or less the same as cyber crime &#8212; that much Ludlow got right.</p>
<p>Another meaning is the sense of hack in &#8220;party hack&#8221; or &#8220;hack journalist.&#8221; (A hack journalist is someone who takes the side of whoever their employer is at the time.) There is no &#8220;hacktivist&#8221; in this sense, but this meaning demonstrates the plausibility of the argument that I&#8217;m making: that &#8220;hack&#8221; isn&#8217;t the object of lexical warfare but instead has multiple unrelated meanings. (Thanks to Neville Ryant for reminding me of this meaning.)</p>
<p><strong>The Good Sort of Hacking</strong></p>
<p>The last meaning of hack is hard to pin down, and I can&#8217;t claim to define it, but it&#8217;s roughly the perverting of something&#8217;s original purpose to solve a new problem. Rube Goldberg machines are hacks. The use of the lunar lander to bring the Apollo 13 crew home was a hack. Putting folded-up newspapers under table legs to stop tables from shaking is a hack. Hacks are often creative uses of technology. Hacks are usually applauded. They&#8217;re positive, creative, even artistic.</p>
<p>In my neck of the woods, &#8220;civic hacking&#8221; is a term for creative, often technological approaches to solving civic problems like how to get more people to register to vote or making beautiful city maps. It has nothing to do with crime. Sometimes it has nothing to do with computers. It&#8217;s about solving real world problems.</p>
<p>Hacking is by no means some sort of jargon specific to the tech-nerd culture either. There&#8217;s a website devoted to hacking IKEA products called <a href="http://www.ikeahackers.net/">IKEA Hackers</a>. Its creator defined hacking too:</p>
<blockquote><p>IkeaHackers.net is a site about modifications on and repurposing of Ikea products. Hacks, as we call it here, may be as simple as adding an embellishment, some others may require power tools and lots of ingenuity.</p></blockquote>
<p>An example is <a href="http://www.ikeahackers.net/2013/02/cuddle-elephant-to-costume.html">turning a pillow into a child&#8217;s Halloween costume</a>. (Cute, and of course not criminal!) From IKEA furniture to computer systems, there&#8217;s a shared hacker culture around repurposing, creativity, and solving problems.</p>
<p>If you&#8217;re a journalist writing about hacking or hacktivism, take a moment to think about which type of hacking you mean.</p>
<p><strong>Is It Warfare?</strong></p>
<p>So let&#8217;s compare now: cyber crime and solving problems. This is not a natural spectrum. Not that there can&#8217;t be overlap. That&#8217;s how historically the words are related (to the best of my knowledge), like if you look back in the 1980s when the term was first coming into mainstream use. There&#8217;s a reason the two meanings shared a single word: Using technology for unintended reasons is often illegal. But it&#8217;s not because it&#8217;s hacking (in the positive sense of hacking) but because technology can do so much that it&#8217;s easy to run up against the boundary of the law. God forbid you use a copy machine (or iPod?) to copy something without permission! Stuff like that.</p>
<p>Is there a case of lexical warfare here? Ludlow defines what he means:</p>
<blockquote><p>“Lexical Warfare” is a phrase that I like to use for battles over how a term is to be understood.</p></blockquote>
<p>If lexical warfare is a battle over the single meaning of a term, that is not the case here. Civic hackers don&#8217;t particularly care that &#8220;hack&#8221; is used to refer to cyber crime. We lost that battle <span style="text-decoration: line-through;">before I was born</span> a long, long time ago. And cyber criminals don&#8217;t care about what civic hackers are up to, so far as I have seen.</p>
<p>There&#8217;s more evidence from how the verbs are used. You can &#8220;hack a server&#8221; (i.e. break in) and &#8220;hack the weather&#8221; (solve weather-related problems), but while one &#8220;hacks into&#8221; systems, one &#8220;hacks on&#8221; problems. &#8220;Hacking into voter registration&#8221; and &#8220;hacking on voter registration&#8221; mean different things. The choice of preposition (&#8220;into&#8221;/&#8220;on&#8221;) depends on which type of hack you mean, and it is evidence that the two meanings are distinct. (As a verb, by the way, &#8220;hack&#8221; has even more meanings, some totally unrelated to any of the meanings of the noun so far. Related to the problem solving meaning, some use &#8220;hack&#8221; to mean the same as to do computer programming.)</p>
<div>&#8220;Hack&#8221; is a case of peaceful coexistence. Problems only arise when reporters confuse the two groups. They misunderstand how the word is being used. Journalists, not only should you be clear to yourselves about which &#8220;hack&#8221; you mean, but also be clear in your writing. Us hackers &#8212; the civic hackers and others like us &#8212; don&#8217;t want to be indicted for other people&#8217;s crimes. &#8220;Criminal hacking&#8221; and &#8220;problem-solving hacking&#8221; might be a good way to be clear in writing.</div>
<p><strong>Derivations</strong></p>
<p>The words &#8220;hacker,&#8221; &#8220;hacktivist,&#8221; and &#8220;hacktivism&#8221; all share the same ambiguity that derives from the meanings of &#8220;hack.&#8221;</p>
<p>A &#8220;criminal hacktivist&#8221; is roughly someone who does &#8220;criminal hacking,&#8221; like denial of service attacks, for political purposes.</p>
<p>A &#8220;problem-solving hacktivist&#8221; is roughly someone who builds websites to motivate the public toward a public policy goal. The IT guys at nonprofits are problem-solving hacktivists (among many other groups of people).</p>
<p>In one of the articles Ludlow cites, <a href="http://www.infosecurity-magazine.com/blog/2012/6/7/hacktivism-shades-of-gray-/559.aspx">the one in Infosecurity Magazine</a>, hacktivism is said to be defined by Wikipedia as &#8220;the use of legal and/or illegal digital tools in pursuit of political ends.&#8221; This conflates the two meanings into one. This definition incorrectly includes anyone who emails their representatives in government, for instance. Such an action is not hacktivism because it is neither criminal hacking nor a creative or technological solution to a problem.</p>
<p>A &#8220;hackathon&#8221; &#8212; a hacking marathon &#8212; for the problem-solving type is when a bunch of optimistic people gather in a room and try to solve some problems.  Often with computer code. Often open-source and for the public good. Not always.</p>
<p><em>If you don&#8217;t know me, I&#8217;m a civic hacker and I&#8217;ve got a degree in linguistics. The title of this post of course refers to the famous Eminem song.</em></p>
<p><em>On 1/19/2013 I updated the post to include a third meaning of hack, &#8220;party hack.&#8221; Thanks Neville. On 1/20/2013 I added the examples &#8220;hacking into&#8221; and &#8220;hacking on&#8221; and discredited the Wikipedia definition. On 2/28/2013 I added the section on IKEA Hackers.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2013/01/18/would-the-real-hacktivist-please-stand-up/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>On the new bulk bill XML from GPO</title>
		<link>http://razor.occams.info/blog/2013/01/10/on-the-new-bulk-bill-xml-from-gpo/</link>
		<comments>http://razor.occams.info/blog/2013/01/10/on-the-new-bulk-bill-xml-from-gpo/#comments</comments>
		<pubDate>Thu, 10 Jan 2013 18:55:23 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=545</guid>
		<description><![CDATA[The following is my reaction to today&#8217;s announcement from the Speaker on the availability of bill XML in bulk from the Government Printing Office. It&#8217;s adapted from the email I sent to Nick Judd for his article on the data. The part about institutionalizing transparency was really Daniel Schuman&#8217;s idea &#8212; sorry I didn&#8217;t attribute [...]]]></description>
			<content:encoded><![CDATA[<p>The following is my reaction to today&#8217;s <a href="http://www.speaker.gov/press-release/speaker-boehner-leader-cantor-praise-gpo-providing-bulk-access-house-bills-xml">announcement from the Speaker</a> on the availability of bill XML in bulk from the Government Printing Office. It&#8217;s adapted from the email I sent to Nick Judd for <a href="http://techpresident.com/news/23355/house-republicans-release-more-data-catnip-developers">his article</a> on the data. The part about institutionalizing transparency was really Daniel Schuman&#8217;s idea &#8212; sorry I didn&#8217;t attribute that! [Update: Also see <a href="http://oreillyradar.tumblr.com/post/40186606375/open-data-of-u-s-house-legislation-now-available-in">Alex Howard's article</a>.]</p>
<p>What we&#8217;re seeing with the bills bulk data project is how the wave of culture change is moving through government. Over the last two years the House Republican leadership has embraced open government in many ways (<a href="http://razor.occams.info/blog/2013/01/04/transparency-in-the-112th-house/">my 112th Congress recap</a> | the new <a href="http://www.speaker.gov/general/opengov-xml-house-floor-summaries-now-available-bulk">House floor feed</a>). With this bills XML project, we&#8217;re seeing more legislative support agencies being involved in how the House does open government.</p>
<p>This isn&#8217;t a technical feat by any means, but it is a cultural feat. The House and GPO worked together to institutionalize a new way for the House to publish bulk data.</p>
<p>Because of the way Data.gov is managed in the executive branch, we&#8217;ve become accustomed to big announcements. The bills bulk data project and the other recent projects show that the House is taking a different approach, an incremental approach, to open government data: publish early and often, gather feedback, then go on to bigger projects. This is something open government advocates have been asking for.</p>
<p>As I mentioned, the tech side itself is not much. They took files they and the Library of Congress already make available (and in some sense already in bulk) and zipped them up into as many as 16 ZIP files. (4 files now, but that will probably grow to 16 by the end of the Congress.) So there&#8217;s no new data here, and thus not the data that the bulk legislative data advocates have been asking for. But it&#8217;s on the road to that. The files involved in this project have the text of legislation but not bill status, which is what the bulk data advocates have been asking for.</p>
<p>There is one crucial thing missing from this, and that&#8217;s that there is no feedback loop with the users of this data. The incremental approach can&#8217;t work unless the users of the data have a way to tell GPO what is and is not working. There is no public point of contact for these files, and I don&#8217;t even know a private point of contact at GPO.</p>
<p>But that doesn&#8217;t detract from the fact that this is a good step forward.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2013/01/10/on-the-new-bulk-bill-xml-from-gpo/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Transparency in the 112th House</title>
		<link>http://razor.occams.info/blog/2013/01/04/transparency-in-the-112th-house/</link>
		<comments>http://razor.occams.info/blog/2013/01/04/transparency-in-the-112th-house/#comments</comments>
		<pubDate>Fri, 04 Jan 2013 21:21:15 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=539</guid>
		<description><![CDATA[The House Republican Leadership over the past two years really surprised me. When the open gov tech community coalesced at the start of the 110th Congress in 2007 Democrats had just regained control of Congress after a series of ethics scandals in 2006 brought the Republican Party&#8217;s commitment to ethics into question. But despite Speaker [...]]]></description>
			<content:encoded><![CDATA[<p>The House Republican Leadership over the past two years really surprised me.</p>
<p>When the open gov tech community coalesced at the start of the 110th Congress in 2007 Democrats had just regained control of Congress after a series of ethics scandals in 2006 brought the Republican Party&#8217;s commitment to ethics into question. But despite Speaker Pelosi&#8217;s call for transparency at the start of the Democrats&#8217; control, honestly very little happened over the following four years (the launch of HouseLive.gov and the availability of disbursements PDFs come to mind).</p>
<p>In fact, when calls for transparency persisted in the House &#8212; that is, Republicans asking Democrats for more transparency &#8212; we would often chalk that up to transparency being used by the minority party as a delay tactic.</p>
<p>But when the Republicans took over in 2011, they kept at it. With mixed success, of course. Some promises, like 72-hour delays before votes, were not taken even remotely seriously. But that shouldn&#8217;t detract from what they got right:</p>
<ul>
<li>They began a moratorium on earmarks, which was somewhat successful.</li>
<li>They launched <a href="http://docs.house.gov">Docs.House.gov</a>, which gave the public a heads-up about what would be happening on the floor up to a week in advance. Prior to Docs.House.Gov, (UPDATED) <span style="text-decoration: line-through;">there was essentially no advanced notice whatsoever</span> there was no structured data about the House calendar. (Thanks to Eric Mill for correcting my apparent exaggeration.)</li>
<li>They <a href="https://www.facebook.com/notes/facebook-washington-dc/in-hack-we-trust-our-first-congressional-facebook-hackathon/10150440114624455">held a &#8220;hackathon&#8221;</a> in December 2011, during which transparency and technology activists in the public had a chance to talk with House staff and get to understand the complexities of the House better.</li>
<li>They held a <a href="http://cha.house.gov/about/contact-us/legislative-data-conference">legislative data and transparency conference</a> in February 2012, the first conference of its kind.</li>
<li>They promised legislative data, and <a href="http://www.govtrack.us/blog/2012/06/07/rep-crenshaw-backs-down-loses-control-over-bulk-data-issue/">after public outcry</a> they formed a task force to consider it. (On the downside, we had to have an outcry.)</li>
<li>They centralized committee video webcasting and archiving infrastructure, leading to much more of committee proceedings being available over the web.</li>
<li>At the very end of the 112th Congress they <a href="http://cha.house.gov/sites/republicans.cha.house.gov/files/documents/House%20Committee%20Documents%20Letter%20-%2012%2020%2012%20%282%29.pdf">made any committee documents sent to GPO available electronically by default</a> (update: link posted)</li>
<li>The Clerk&#8217;s <a href="http://clerk.house.gov/">official list of members</a> got a new column of bioguide IDs.</li>
<li>They began the creation of data standards for committees which led to significant updates on Docs.House.Gov on the first day of the 113th Congress.</li>
<li>(UPDATE) They passed the <a href="http://www.govtrack.us/congress/bills/112/hr2146">DATA Act</a>.</li>
</ul>
<div>That said, all I ever wanted was bulk data on the status of legislation and I haven&#8217;t gotten that. Maybe this year?</div>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2013/01/04/transparency-in-the-112th-house/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>User Experience at “Tunnel Creek”: What we can all learn from The New York Times&#8217;s Snow Fall piece</title>
		<link>http://razor.occams.info/blog/2012/12/23/user-experience-at-%e2%80%9ctunnel-creek%e2%80%9d-what-we-can-all-learn-from-the-new-york-timess-snow-fall-piece/</link>
		<comments>http://razor.occams.info/blog/2012/12/23/user-experience-at-%e2%80%9ctunnel-creek%e2%80%9d-what-we-can-all-learn-from-the-new-york-timess-snow-fall-piece/#comments</comments>
		<pubDate>Sun, 23 Dec 2012 18:12:41 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Code]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=534</guid>
		<description><![CDATA[Just a few paragraphs into The New York Times&#8217;s six-part Snow Fall series, I was captivated equally by the story and by the innovative magazine style in which the story was presented. So I began taking notes about the user experience of reading Snow Fall, knowing there would be a lot to learn for other [...]]]></description>
			<content:encoded><![CDATA[<p>Just a few paragraphs into The New York Times&#8217;s six-part <a href="http://www.nytimes.com/projects/2012/snow-fall/#/?part=tunnel-creek">Snow Fall</a> series, I was captivated equally by the story and by the innovative magazine style in which the story was presented. So I began taking notes about the user experience of reading Snow Fall, knowing there would be a lot to learn for other user interface projects.</p>
<p><strong><span id="more-534"></span></strong></p>
<h3>The Sidebars: Style and Media Types</h3>
<p>The sidebars &#8212; the extra media besides the text of the story &#8212; were the focus of most of the innovation in the presentation of this story. I say “sidebar” loosely, since some of the sidebars were more like between-bars or beneath-bars. Here&#8217;s more on that:</p>
<p>The sidebars were <strong>limited to a few distinct styles and media types</strong>, which created a sense of continuity throughout the piece. Sidebars were positioned in just five ways: full-wide interrupting the text, inside the text column floated right, the same but floated left, inside the right-column, and roughly 3/4ths-wide cutting into the text column and extending to the right edge of the window. (During the ski map feature in part three, all bets were off on sidebar style.)</p>
<p>There were <strong>seven media types</strong>: full-wide videos, video mini-features (in the text column, or 3/4ths wide), photos (full wide montage and 3/4ths wide singles and pairs), slideshows (3/4ths wide), mug shots (inside the right column), auto-play mini-videos (inside the right column), and audio-only clips (inside the text column but floated left). Styles were highly consistent, although there was some variation in the placement of photo captions. <strong>Each media type brings its own short learning curve</strong>: how to start a movie, what does a slideshow do, why doesn&#8217;t this movie have a restart button, etc. If anything, I might have dropped the two media types that had the least value as distinct media types: slideshows and the auto-play mini-videos, both of which could have been done in the format of video mini-features.</p>
<p>The types of media besides the body text were <strong>introduced gradually</strong>, both within parts and across parts. This again helped with a sense of continuity, or at least limited the amount of surprise, confusion, and sense of visual clutter you feel when you encounter a new type of media. In the first part of the series, five types of sidebar media are used but half of the types appear for the first time only in the second half of the page. Media types mostly alternate between a new type and a type the reader has seen before. The second part in the series introduces the full-wide photo montage toward the middle and the auto-play mini-video near the end (more on that later), but otherwise sticks to the same types of media found in the first part. Part five added audio-only clips.</p>
<p>Most <strong>full-wide sidebars actually extended to the width of the browser</strong>, beyond the main 1024px wide body area that other sidebars are confined to, if your browser happens to be open wider. This near-full-screen effect adds to the <strong>impact of the splash videos</strong> and photos especially. At first I thought the fixed-positioned ski map in part three was a right-column sidebar (more on that below), but when I maximized my browser it turned out to have more in common with full-wide sidebars. It fills the browser width just like full-wide sidebars, and has a fixed-positioned effect like the intro title. But unlike the others, the now 1/4-wide text column, which appears beside it at low browser window widths, actually runs over the sidebar at large browser window widths. The text has a solid white background (i.e. it&#8217;s not a transparent overlay) and it <strong>appears not so unlike the ski paths that carve through the mountain</strong> on the map beneath it.</p>
<p>The <strong>mug shot</strong> sidebars are notably low-key. The photos are tiny (55&#215;55) and are next to usually just two lines of text, the person&#8217;s age and occupation. But on hover, the whole thing becomes a link to a slideshow for more images of that person. What surprised me most was that the mug shots in the first part of the series were not placed next to the first introduction of a person in the story. Instead they were <strong>grouped together</strong> toward the end of the page. In later parts, mug shots were more often next to first mentions.</p>
<p>Sidebar videos and mug shots (but not other types of sidebar content) all have <strong>an anchor in the text</strong>. For mug shots the anchor is the person&#8217;s name. For videos, a few words or a sentence near the video were the anchor. The anchor text had a grey background and a video or slideshow icon at its left side. This suggests to the reader this is a good stopping point to try the video, or if not there then maybe at the end of the paragraph. Hovering over the highlighted text causes the play button on the video to get bigger, letting you know what is going to happen. Clicking the text starts the video.</p>
<p>Full-wide videos and what I called the auto-play mini-videos started automatically, which I found annoying. (In the second part of the series the mini-video was an animation of how some avalanche protection gear worked.) But while the full-wide videos had start-over buttons, the mini-videos were an attempt at simplicity. The mini-video in part two had no border and a white background and no play control, so it blended well into the background, but the “way” to play it again if you missed it the first time was to guess that you could scroll up and then scroll back down to it.</p>
<h3>The Ski Map Sidebar and the Avalanche Video</h3>
<p>Then there is the fixed-positioned <strong>ski map in part three</strong>, part sidebar, part interactive graphic. It slides up as a 3/4ths-wide sidebar (or so I thought) but remains fixed in place for the remainder of part three. It forces the text to its left to stay at 1/4th-width (more on that above). <strong>The map evolves as you scroll down</strong>, with different ski paths highlighted every few paragraphs according to what is happening in the story. Mug shots appear on top of the map as you scroll, to indicate that the ski paths currently highlighted on the map were the paths those individuals took. But oddly, while the map was fixed, the mug shots layered above it continued to scroll with the body text. It was an interesting (but perhaps not entirely successful) effect. While the ski map was up on the right, video mini-features appeared in a smaller size wholly inside the small text column.</p>
<p>I don&#8217;t have much to say about most of the videos, but the <strong>Tunnel Creek Avalanche: In Real Time</strong> video deserves a few notes. This video is similar in design to the 3D fly-over of the mountain from part one. It&#8217;s a computer-generated view of the mountainside, sort of photo-realistic, and, from what I read, constructed from scratch using elevation data and satellite imagery. The video follows the path of the avalanche, which is represented by a sort of highlight over the image.</p>
<p><strong>It&#8217;s also a multimedia diagram.</strong> Contour lines cover the surface of the mountainside: thick lines every 2,000 ft of elevation (some but not all labeled), thin lines every 250 ft. And, incredibly, a <strong>synthesized drum beat</strong> counts off the feet, a light tap every 125 ft, a strong tap on the 2,000 ft lines, giving a multi-modal and actually emotional experience of the change in speed of the avalanche. The audio track also has the sound of wind. White dots highlight the locations of the skiers before the avalanche starts (and, <strong>tastefully</strong>, not how they were carried away by it).</p>
<h3>Focus and Color</h3>
<p>There&#8217;s a splash screen at the start of each part in the series. And it&#8217;s huge. Depending on your window size, there <strong>may not be any body text visible</strong> at all before you scroll down, which I usually think is a design problem but didn&#8217;t mind here. At some window sizes, the page would <strong>scroll down gently automatically</strong> to reveal the first two lines of body text. The splash screen is a beautiful movie of snow and clouds at the start of four parts in the series (including of course the first part), and images with some text or CSS animation in the second and third parts.</p>
<p>On the main landing page for the series, part one, navigation to other parts in the series at the top <strong>doesn&#8217;t appear</strong> until you scroll past the splash movie. (And once it appears it never goes away.) It keeps the focus on the series title and movie.</p>
<p>As you scroll past the splash movie and it slides away (getting covered up from below), the title and byline begin to <strong>fade out</strong> &#8212; as if to force you to look away toward the body text.</p>
<p>As you scroll into the Tunnel Creek 3D fly-over video in part one, and similarly the Tunnel Creek Avalanche video in part four, in the area above the video the <strong>background color fades to match</strong> the background color at the top of the video, so that by the time you are scrolled to the video the page feels coherent as a single presentation and not as body content plus a sidebar.</p>
<p>Sidebar videos inside the text column <strong>start greyed-out with no text or play controls</strong>. Once a video reaches a certain height on the page, it fades in to full color, and the title of the sidebar and the play button fade in with it.</p>
<p>The bottom navigation to move on to the previous or next part in the series only appears when you have scrolled to almost the very bottom, which is similar to how regular NYT articles nag you to read more when you get to the bottom. But it did not feel like a nag here.</p>
<p>As you scroll past the splash movie and some other full-wide videos, the media stays fixed while the content beneath it slides up to cover it. For the Tunnel Creek video I found this confusing, because the video scrolls up as you go into it but does not scroll away.</p>
<h3>Layout, Structure, and Font</h3>
<p>By and large, the body area was confined to 1024px, centered horizontally at larger window sizes, similar to the NYT site as a whole. But as I noted, some full-width sidebars extended horizontally to the size of the browser window (and preserved aspect ratio, so got taller too).</p>
<p>There are two levels of structure to the piece: the six parts in the series, with navigation across the top, and &lt;h3&gt;&#8217;s throughout the body text.</p>
<p>The body text size is 15px (22px/147% line height), which is the same as other NYT articles. The width of the column is 626px, which is larger than the 600px in typical NYT stories. I changed the font on my blog to match!</p>
<p><strong>In the first few screenfuls of the body, there is only plain text.</strong> No links. No ads. Nothing in the right column. The only bit of fancy is the enlarged first letter of the article.</p>
<p>At the end, the story offers itself to be viewed as an ebook or to be watched in documentary form. And it concludes with half a screenful of credits.</p>
<h3>Advertisements</h3>
<p>Much of the success of the layout in this series comes from the limited advertising. There is one advertisement per section, I think. For comparison, there were about 10-15 sidebars per section. That isn&#8217;t necessarily a sustainable practice for a newspaper.</p>
<p>The first advertisement is 3/4ths of the way down the first part in the series, and similarly positioned in some of the other parts. Advertisements all had the same distinct style that could not be confused with sidebar media: full-wide, very clearly labeled &#8220;Advertisement,&#8221; and separated from the editorial content with a near-black background (#333333) and large padding.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2012/12/23/user-experience-at-%e2%80%9ctunnel-creek%e2%80%9d-what-we-can-all-learn-from-the-new-york-timess-snow-fall-piece/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Comments on the draft Declaration on Parliamentary Openness, Sec. 44: Facilitating Two-Way Communication</title>
		<link>http://razor.occams.info/blog/2012/11/14/comments-on-the-draft-declaration-on-parliamentary-openness-sec-44-facilitating-two-way-communication/</link>
		<comments>http://razor.occams.info/blog/2012/11/14/comments-on-the-draft-declaration-on-parliamentary-openness-sec-44-facilitating-two-way-communication/#comments</comments>
		<pubDate>Wed, 14 Nov 2012 18:22:17 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=516</guid>
		<description><![CDATA[The following is a comment on the draft Declaration on Parliamentary Openness, Sec. 44: Facilitating Two-Way Communication, which begins: &#8220;Parliament shall endeavor to use interactive technology tools to foster the ability of citizens to provide input on legislation and parliamentary activity, and to communicate with members or parliamentary staff.&#8221; Parliaments, like our Congress here, never [...]]]></description>
			<content:encoded><![CDATA[<p><em>The following is a comment on the draft <a href="http://publicmarkup.org/bill/opening-parliament-declaration/5/">Declaration on Parliamentary Openness</a>, Sec. 44: Facilitating Two-Way Communication, which begins: &#8220;Parliament shall endeavor to use interactive technology tools to foster the ability of citizens to provide input on legislation and parliamentary activity, and to communicate with members or parliamentary staff.&#8221;<br />
</em></p>
<p>Parliaments, like our Congress here, never seem to listen to citizens as much as we the citizens would like. One of the largest problems in public trust is the perception that Congress only listens to moneyed lobbyists. But a recommendation to <em>listen more</em> mostly misses the reasons for the lack of sufficient communication, and so is not likely to be an effective remedy.</p>
<p>There are two very different types of input the public can provide to parliaments. One is sentiment, the other is expertise. When a member of parliament votes against a popular position in his or her district, it is a problem of sentiment. When a member of parliament introduces a flawed bill, it is a problem of expertise.</p>
<p>In my experience here, Members of Congress are eagerly interested in knowing the sentiment in their district. That&#8217;s because they don&#8217;t want to be voted out next election. Congressional offices here each have staffs dedicated solely to processing incoming mail. So what&#8217;s the problem? Well, representatives here have constituencies of <a href="http://factfinder2.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_10_SF1_GCTPH1.US04PR&amp;prodType=table">up to</a> 1 million individuals (that&#8217;s Nevada&#8217;s 3rd congressional district; the average size of a district is 710,000) and senators up to 37 million (California). And a lot of those individuals write their representative, which means Congress gets an enormous amount of mail. They couldn&#8217;t possibly read it all.</p>
<p>Asking members of parliaments to engage in more forms of communication isn&#8217;t helpful. They&#8217;re already reading as much as their staff budgets allow. A better ask could be: allocate more funding to processing mail. But even better is: develop innovative ways to handle communication more <em>efficiently</em> based on whatever particular factors are at work in your parliament. Efficiently means processing mail faster, for instance by making sure citizens submit messages that are shorter, clearer, aggregated, or otherwise easier to process in an automated way. This was a significant component of our approach at <a href="http://www.popvox.com">POPVOX</a>, a platform for constituent communication that I co-founded several years ago. Yes, that means in some cases boiling messages down to Support or Oppose and having <em>less actual reading</em> of constituent messages by humans (more by computers).</p>
<p>Of the dozen forms of constituent communication recommended by the Declaration, the vast majority make constituent communication <em>less efficient,</em> and therefore less effective. Email and Facebook, for instance, are significantly less efficient than an online poll.</p>
<p>That said, there <em>are</em> times when members of parliament might find it convenient to ignore the sentiment they are hearing. In those cases, it&#8217;s important for the constituent to have the option to make his or her sentiment public (but typically anonymously). You might want to have your parliament handle the openness part, or you might want to ask your parliament to  support the technical infrastructure so a third party can handle the openness part (as in our case at POPVOX).</p>
<p>The second type of communication &#8212; expertise &#8212; is entirely different. Expertise needs to be communicated in a different channel from sentiment. The reason is that while everyone thinks their own expertise is important, it&#8217;s just not true. The reason lobbyists have so much influence is a combination of several factors related to expertise, some of which are: they are experts on a particular subject, they are experts on the current law and the  legislative process and thus can communicate effectively with legislative staff, and they are willing to do whatever additional research the legislator needs to make a decision. Citizens don&#8217;t have all three. If they did, they&#8217;d be lobbyists.</p>
<p>Better communication of citizen expertise is a conundrum.  While we know what efficient communication of sentiment looks like (e.g. online polls, petitions, or what we did at POPVOX), there are not yet any proven methods of constituent communication of expertise. Some examples to draw from, however, are the <a href="http://opengovfoundation.org/madison-in-the-news/">MADISON</a> project (collaborative editing of bills run out of the office of a congressional committee here) and especially <a href="http://peertopatent.org/">Peer to Patent</a>.</p>
<p>To be clear, I want to emphasize why it is a conundrum. Obviously we want the expertise available to legislators to be correct, comprehensive, unbiased, et cetera. But those aren&#8217;t the only factors legislators have to consider. The information has to be readily available, it has to be from a source they trust, and it has to be in language they understand. The communication of expertise is therefore <em>hurt</em> by widening the base of people who can participate if there is no mechanism for determining who can be trusted, who communicates well, and so on.</p>
<p>My suggestion for the Declaration would be not to recommend simply greater engagement, but instead to recommend implementing a two-track system of taking constituent input. One track would be optimized for sentiment: it would encourage concise, easy-to-process messaging with publicly displayed aggregate totals and an opt-in to make messages public. The other track would be optimized for expertise and would focus on establishing trust with citizens who are willing to put in the time to navigate public policy and the legislative process.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2012/11/14/comments-on-the-draft-declaration-on-parliamentary-openness-sec-44-facilitating-two-way-communication/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The mythical open data divide</title>
		<link>http://razor.occams.info/blog/2012/05/14/the-mythical-open-data-divide/</link>
		<comments>http://razor.occams.info/blog/2012/05/14/the-mythical-open-data-divide/#comments</comments>
		<pubDate>Mon, 14 May 2012 20:42:53 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=511</guid>
		<description><![CDATA[There have been some interesting negative reactions to open data lately. I&#8217;m all for skepticism, but skepticism should be backed up with facts. For instance, when Michael Gurstein talks about the digital divide he refers to concrete examples of this coming true &#8212; although I disagree with some of his analysis (more on that in my [...]]]></description>
			<content:encoded><![CDATA[<p>There have been some interesting negative reactions to open data lately. I&#8217;m all for skepticism, but skepticism should be backed up with facts. For instance, when Michael Gurstein talks about the digital divide he refers to concrete examples of this coming true &#8212; although I disagree with some of his analysis (more on that <a href="http://opengovdata.io/2012-02/page/6-1/unintended-consequences-and-the-limits-transparency">in my book</a>). <a href="http://blogs.gartner.com/andrea_dimaio/2012/05/14/open-data-and-the-new-divide/">Andrea Di Maio&#8217;s critique</a> of open government data, on the other hand, is too vague to be instructive.</p>
<p>I readily admit that &#8220;open government&#8221; and &#8220;open data&#8221; are vague already. Justin Grimes, Harlan Yu, and I held a panel at Transparency Camp a few weeks back about how vague the terms are and how that can lead to trouble when we aren&#8217;t clear about what we mean. Di Maio runs right into this problem, placing the burden of a successful open data movement on &#8220;mythical &#8216;application developers&#8217;&#8221; (sorry, I don&#8217;t exist?). Transparency is only one of a dozen or more reasons why open government data is a good thing, and these reasons cannot all be judged by the same rubric.</p>
<p>Focusing on open data for transparency, Di Maio argues that &#8220;[t]he more the data, . . . the more specialized are the skills and resources required to process that data,&#8221; or in other words that open data can actually exacerbate a digital divide rather than close it. I&#8217;ll be one of the first to say that open data doesn&#8217;t necessarily mean better government (see the link to my book above), but the statement of Di Maio&#8217;s that I quoted can easily be seen to be simply wrong.</p>
<p>One has to evaluate open gov data with everything else held equal. In other words, if there is open government data in the hypothetical, the comparison to make is to another world where the same government processes exist &#8212; they&#8217;re just not published in a machine-processable format. Now, you tell me, which world requires more skill to understand the data? Clearly the second, because every skill you need in the open-data-enabled world you need in the open-data-denied world, but you <em>also</em> need some other skills just to get the information in the second world.</p>
<p>There are certainly unintended consequences of open data. In my book I discuss two cases where legislative data affected the behaviors of the legislators in a way that&#8217;s probably not good. But let&#8217;s stay on fact and, for that matter, on logic.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2012/05/14/the-mythical-open-data-divide/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
