<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Joshua Tauberer's Blog &#187; Civic Hacking</title>
	<atom:link href="http://razor.occams.info/blog/category/civichacking/feed/" rel="self" type="application/rss+xml" />
	<link>http://razor.occams.info/blog</link>
	<description></description>
	<lastBuildDate>Mon, 18 Apr 2011 12:16:21 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.5</generator>
		<item>
		<title>The status of policy implementations of Open Government Data</title>
		<link>http://razor.occams.info/blog/2011/04/18/the-status-of-policy-implementations-of-open-government-data/</link>
		<comments>http://razor.occams.info/blog/2011/04/18/the-status-of-policy-implementations-of-open-government-data/#comments</comments>
		<pubDate>Mon, 18 Apr 2011 12:08:21 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>
		<category><![CDATA[Open House/Senate Projects]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=500</guid>
		<description><![CDATA[On the Open House Project mail list, Gregory Slater asked: Once again, where are we in simple mere machine readable, standardized format, truly searchable data Things are getting better all the time. It&#8217;s not fast. But there is slow and steady progress. 2009 was a big year. We saw the Senate start publishing votes in XML, the launch [...]]]></description>
			<content:encoded><![CDATA[<p>On the Open House Project mail list, Gregory Slater asked:</p>
<blockquote><p>Once again, where are we in simple mere machine readable, standardized format, truly searchable data</p></blockquote>
<p>Things are getting better all the time. It&#8217;s not fast. But there is slow and steady progress.</p>
<p>2009 was a big year. We saw the <a href="http://www.cjr.org/campaign_desk/senate_goes_xml.php">Senate start publishing votes in XML</a>, the launch of data.gov and the Open Government Directive, and the GPO released <a href="http://www.washingtonpost.com/wp-dyn/content/article/2009/10/04/AR2009100402533.html?hpid=sec-politics">XML for the Federal Register</a> and <a href="http://www.gpo.gov/pdfs/news-media/PrepRe_072709.pdf">Code of Federal Regulations</a>. The House also began publishing its <a href="http://disbursements.house.gov/">spending data</a> electronically. There were also <a href="http://www.opengovdata.org/home/legislation">open standards laws passed in Vancouver and Portland</a>.</p>
<p>2010 was a big year for posturing. We saw introduced in Congress H.R. 4983: Transparency in Government Act of 2010 (Quigley), H.R. 6289: To  direct the Librarian of Congress to make available to the public the bulk legislative&#8230; (Foster), and H.R. 4858: <a href="http://sunlightfoundation.com/policy/poia/">The Public Online Information Act of 2010</a>. The Congressional Transparency Caucus was created (Quigley/Issa). An open data law was passed in <a href="http://www.opengovdata.org/home/legislation">San Francisco, and bills were introduced in New York City and New York State</a>.</p>
<p>This year, an open data bill was <a href="http://www.opengovdata.org/home/legislation">introduced in New Hampshire</a> (HB 310-FN), and POIA was reintroduced (Israel/Tester). I just noticed that last month GPO added a bulk data download for <a href="http://www.gpo.gov/fdsys/browse/collectiontab.action">Public Papers of the Presidents of the United States (2009)</a>. The <a href="http://sunlightfoundation.com/blog/2011/01/05/new-transparency-in-the-new-house-rules/">new House rules package</a> addresses public access to committee records and data formats, though I am not aware of whether the rules have had any practical consequences.</p>
<p>We owe Sunlight a lot of credit for pushing many of these things forward.</p>
<p>Besides all of this, there have been a number of &#8220;contests&#8221; lately (<a href="http://challenge.gov/">http://challenge.gov/</a>), some offering prizes for using government data. Clay Johnson posted two new ones to the sunlightlabs mail list this week. The only thing I&#8217;ll say here about the strategy of the open government movement is that we haven&#8217;t taken the challenges seriously, and I think it&#8217;s a missed opportunity to show why open data matters. But it&#8217;s just one of many things to do.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2011/04/18/the-status-of-policy-implementations-of-open-government-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Part II: We all know money is a corrupting force, right?</title>
		<link>http://razor.occams.info/blog/2010/12/10/part-ii-we-all-know-money-is-a-corrupting-force-right/</link>
		<comments>http://razor.occams.info/blog/2010/12/10/part-ii-we-all-know-money-is-a-corrupting-force-right/#comments</comments>
		<pubDate>Sat, 11 Dec 2010 03:07:36 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=467</guid>
		<description><![CDATA[(My last post began a long discussion on the OHP list. Here&#8217;s some of my follow-up.) I don&#8217;t believe any of my transparency colleagues believe that there is a pervasive systematic problem with Hill staff &#038; Members making literally corrupt decisions on a regular basis. However, I&#8217;m sure everyone here recognizes systemic selection biases (who [...]]]></description>
			<content:encoded><![CDATA[<p>(My last post began a long discussion on the OHP list. Here&#8217;s some of my follow-up.)</p>
<p>I don&#8217;t believe any of my transparency colleagues believe that there is a pervasive systematic problem with Hill staff &#038; Members making literally corrupt decisions on a regular basis. However, I&#8217;m sure everyone here recognizes systemic selection biases (who can afford to get elected) and incentives (the revolving door) that are worthy of study.</p>
<p>The net effect of these biases and incentives is not clear, but there is certainly an effect.  We know there have been a few bad apples and it is the public&#8217;s duty to be on the lookout for more. And we know that the biases and incentives affect policy results i.e. through who is elected, how committees are assigned, what lobbyists/advocates have access to Congress&#8217;s ears, and maybe in some more pernicious ways.</p>
<p>Now whether the net effect is in some sense good, neutral, or bad is something we&#8217;re disagreeing on. Tom is basically defending neutral, while most others would say bad.</p>
<p>Compared to what, and how would you know?</p>
<p>The problem with this discussion is that we can&#8217;t make up a hypothetical less-money-obsessed world that we would all agree on. Take away money and some other aspect of the human condition is going to take its place. And even if we could imagine such a world, how would you measure whether that world was better off?</p>
<p>I&#8217;ll try to tie this back into the point I initially made:</p>
<p>Discovering bad apples through investigative, data-driven reporting is great for the country. It is actionable information. But while reporting on mere &#8220;correlation&#8221; establishes a *possible* bias or incentive, it neither indicates an actual effect on policy nor suggests any action that we could take that we could be reasonably sure would in fact improve policymaking.</p>
<p>Of course, correlations can be the beginning of an investigative project. Paul Blumenthal&#8217;s recent post &#8220;<a href="http://blog.sunlightfoundation.com/2010/12/09/incoming-finance-committee-chairman-relies-on-finance-campaign-contributions/">Incoming finance committee chairman relies on finance campaign contributions</a>&#8221; raises a lot of concern over correlations. But its relevance is backed up by other observations and makes a good case that, at the very least, the media should be keeping a close eye on a Member of Congress who is being tempted by some very strong incentives. Hopefully he&#8217;ll resist the temptations. </p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2010/12/10/part-ii-we-all-know-money-is-a-corrupting-force-right/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>We all know money is a corrupting force, right?</title>
		<link>http://razor.occams.info/blog/2010/12/08/money/</link>
		<comments>http://razor.occams.info/blog/2010/12/08/money/#comments</comments>
		<pubDate>Wed, 08 Dec 2010 04:57:56 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=463</guid>
		<description><![CDATA[In some circles it&#8217;s taken for granted that money corrupts. But how much for granted should we be taking it? For instance, last month Lisa Rosenberg wrote for Sunlight that without additional election spending disclosure we are headed toward the &#8220;corruption of our democracy by secret campaign spending.&#8221; And MAPLight.org describes its mission as being [...]]]></description>
			<content:encoded><![CDATA[<p>In some circles it&#8217;s taken for granted that money corrupts. But how much for granted should we be taking it?</p>
<p>For instance, last month Lisa Rosenberg <a href="http://blog.sunlightfoundation.com/2010/11/18/sunlight-urges-congres-to-pass-streamlined-disclose-act-during-lame-duck/">wrote</a> for Sunlight that without additional election spending disclosure we are headed toward the &#8220;corruption of our democracy by secret campaign spending.&#8221; And MAPLight.org <a href="http://maplight.org/about">describes its mission</a> as being a watchdog for when &#8220;[e]lected officials collect large sums of money to run their campaigns, and they often pay back campaign contributors with special access and favorable laws.&#8221;</p>
<p>I certainly don&#8217;t doubt that money can corrupt, especially systemically. My favorite open secret is that Members of Congress are <a href="http://www.jstor.org/pss/3219894">assigned to committee</a> in part by how well they have fund-raised for the party (which to me sounds like a simple bribe).</p>
<p>But where I get worried is when an organization&#8217;s reporting arm gets caught up in reporting only on one side, making the body of evidence appear to support that corruption is wide-spread when in fact it is the exception rather than the rule, let alone a systemic problem.</p>
<p>What prompted me to write this was actually a blog post over at the Center for Responsive Politics that exemplifies exactly the type of reporting that is often missing. Megan Wilson <a href="http://www.opensecrets.org/news/2010/12/gm-cut-big-checks-to-lawmakers.html">writes on OpenSecrets</a> today that &#8220;General Motors&#8217; Political Committee Cut Big Checks to Lawmakers Who Voted Against Company&#8217;s Bailout.&#8221; Wilson calls it &#8220;ironic&#8221; that GM&#8217;s PAC seemed to be promoting candidates against its own interests. Well, not exclusively but at least two-to-one (&#8220;$63,500 to [congressmen] who voted against federal assistance for the company. That&#8217;s more than one-third of the overall amount GM gave to all House candidates this election cycle.&#8221;).</p>
<p>Ironic is one way to look at it. But more interesting to me is that of all times you might think we would see some easily understood evidence of a corrupting influence of money, lo and behold we see clearly that that&#8217;s not the case.</p>
<p>There is hope for our system after all.</p>
<p>I&#8217;m going to take a little shot at a post Wilson wrote in September, &#8220;<a href="http://www.opensecrets.org/news/2010/09/media-professionals-and-journalists-donate.html">Journalists, Media Professionals Donating Frequently to Federal Political Candidates this Election Cycle</a>&#8221;. She wrote, &#8220;235 people &#8230; identified themselves on government documents as journalists, or as working for news organizations, who together have donated more than $469,900 to federal political candidates, committees and parties during the 2010 election cycle &#8230; with the median amount donated coming in at $500.&#8221;</p>
<p>As she noted, many of the donations came from those employed by &#8220;lighter fare&#8221; such as ESPN, or were employed in a non-reporting (i.e. business) role. She provided a spreadsheet of the numbers she used. When I looked over it, to me it appeared as if around half of the contributions were from individuals whose job description clearly indicated there was no conflict of interest: science writers, radio talk show hosts, etc.</p>
<p>If you&#8217;ll give me the benefit of the doubt here, or even if you don&#8217;t, we&#8217;re talking about roughly 100-200 people nation-wide who were journalists who <i>might</i> have made a conflict-of-interest mistake. That sounds pretty good to me! Where&#8217;s the reporting on the other 90,000 journalists that abstained from contributing to a candidate? Is there any substantial impact on reporting or on policymaking that resulted from any of these so-thought bad contributions? I doubt it. (If any of the contributions were in any sense nefarious, they are from people who do more political damage by what they report, rather than by who they give money to.)</p>
<p>So, those are my thoughts tonight.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2010/12/08/money/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The way to fight money in politics is to empower citizens to support candidates who don&#8217;t have so much money, not to limit spending</title>
		<link>http://razor.occams.info/blog/2010/07/28/citizens-united/</link>
		<comments>http://razor.occams.info/blog/2010/07/28/citizens-united/#comments</comments>
		<pubDate>Wed, 28 Jul 2010 14:22:18 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=451</guid>
		<description><![CDATA[I actually took a sigh of relief yesterday when the DISCLOSE Act failed to be passed in the Senate. This was one of many proposals to partially reverse the Supreme Court decision earlier this year in Citizens United v. FEC which opened the floodgates for direct corporate spending on electioneering. And, apparently corporations are already [...]]]></description>
			<content:encoded><![CDATA[<p>I actually breathed a sigh of relief yesterday when the <a href="http://www.govtrack.us/congress/bill.xpd?bill=h111-5175">DISCLOSE Act</a> <a href="http://www.govtrack.us/congress/bill.xpd?bill=s111-3628">failed to be passed in the Senate</a>. This was one of <a href="http://blog.sunlightfoundation.com/2010/01/27/legislation-intended-to-respond-to-citizens-united/">many proposals</a> to partially reverse the Supreme Court decision earlier this year in <a href="http://blog.sunlightfoundation.com/2010/01/21/how-the-citizens-united-case-affects-money-politics-and-transparency-as-we-know-it/">Citizens United v. FEC</a> which opened the floodgates for direct corporate spending on electioneering. And, apparently <a href="http://www.opensecrets.org/news/2010/07/republicans-thwart-new-campaign-fin.html">corporations are already taking advantage</a> of the decision. The bill would require corporate spending on ads to come along with disclosure of who funded the ad.</p>
<p>I&#8217;m wary of limitations on political speech, especially when the limitations are uneven. As the <a href="http://www.aclu.org/free-speech/disclose-act-passed-house-today-compromises-free-speech">ACLU wrote</a>:</p>
<blockquote><p>The [bill] includes an amendment obligating many advocacy organizations that wish to speak out on candidates and, in certain situations, political issues, to release the identities of many of their donors, while allowing a few large mainstream organizations to preserve the privacy of their donors. The amendment exempts organizations that have over 500,000 members, are over ten years old, have a presence in all 50 states and whose revenue from corporations and unions is less than 15 percent. By exempting larger mainstream organizations from certain disclosure requirements, the bill inequitably suppresses only the speech of smaller, more controversial organizations and compromises the anonymity of small donors.</p></blockquote>
<p>The discussion of the <em>Citizens United</em> decision and the DISCLOSE Act in the blogosphere has taken it for granted that money is bad for politics. Or, perhaps we should say monetary <em>inequality</em> is bad for politics. (And if that&#8217;s the problem, exactly what does DISCLOSE do to rectify that? I guess it makes big spenders think twice about spending money on politics because they will have to put their name behind it. But what if they are proud to put their name behind a candidate? What if it even becomes good publicity to put your name behind a candidate?) In any case, monetary inequality is only part of the picture. What&#8217;s missing from the discussion is looking at <em>how</em> the money buys influence.</p>
<p>Interlude.</p>
<p>Back in the 1970s, when IBM dominated the world of computers with their high-cost mainframes, entrepreneurs &#8212; later to start Apple &#8212; uprooted the link between money and access to computational power by reinventing the computer in creating the <em>personal computer</em> (democratizing computational power!). In the 1990s, the big institutional newspapers and the newly dominating cable news channels ruled the media scene, but in the early 2000s bloggers started to uproot the power of the &#8220;MSM&#8221; by decentralizing public discussion of news (and in some impressive cases even taking on the role of investigative journalists). In the late 2000s with the emergence of the <a href="http://www.opengovdata.org">Open Government Data</a> movement world-wide, we&#8217;re seeing an attempt to empower citizens by providing them with access to government information that only interests with well-paid lobbyists could access before.</p>
<p>End interlude.</p>
<p><em>We&#8217;re not powerless when it comes to the link between money and influence.</em> This leads us to an alternative approach to decoupling monetary inequality from political inequality. Rather than essentially reducing monetary inequality by force of law (by discouraging corporate spending), we should be looking at uprooting the link between money and influence.</p>
<p>Money buys influence through advertising, because the only way citizens learn about candidates is through advertising. What if we could undermine that pathway? What if there were other ways for citizens to learn about candidates where there was no pay-to-play? Entrepreneurs and civic hackers have done this type of thing before. Perhaps it&#8217;s time to focus our skills on this problem.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2010/07/28/citizens-united/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Printable Congressional District Maps: Behind The Scenes</title>
		<link>http://razor.occams.info/blog/2010/02/26/printable-congressional-district-maps-behind-the-scenes/</link>
		<comments>http://razor.occams.info/blog/2010/02/26/printable-congressional-district-maps-behind-the-scenes/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 13:55:25 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>
		<category><![CDATA[Code]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=441</guid>
		<description><![CDATA[Today I&#8217;m releasing print-quality maps of congressional districts, with street-level detail and county border lines. This has been one of the most sought-after resources based on emails I&#8217;ve received over the last some four years and I don&#8217;t think you can find this anywhere else. (At least not comprehensively for the whole nation. Local state [...]]]></description>
			<content:encoded><![CDATA[<p>Today I&#8217;m releasing <a href="http://www.govtrack.us/congress/printablemaps.xpd">print-quality maps of congressional districts</a>, with street-level detail and county border lines. Judging by the emails I&#8217;ve received over the last four years or so, this has been one of the most sought-after resources, and I don&#8217;t think you can find this anywhere else. (At least not comprehensively for the whole nation. Local state clerk&#8217;s offices may have them. NationalAtlas.gov <a href="http://www.google.com/url?sa=t&amp;source=web&amp;ct=res&amp;cd=1&amp;ved=0CAYQFjAA&amp;url=http%3A%2F%2Fwww.nationalatlas.gov%2Fprintable%2Fcongress.html&amp;rct=j&amp;q=print+congressional+district+map&amp;ei=YNiGS6CvJITO8QbnibGYDw&amp;usg=AFQjCNFKBdAaXwTWi3ddjLyCR91YSOttAg&amp;sig2=LiYGqUl9jL_0-BaB6FMhZA">has maps</a> but not with very much detail.)</p>
<p>This was a solid 2-day project with less than 300 lines of code and it&#8217;s something that only recently became this easy to do. I used Amazon Web Services (AWS), Census TIGER/Line cartographic data in an <a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2367">AWS public data snapshot</a>, <a href="http://www.openstreetmap.org/">OpenStreetMap</a> for the street detail in an <a href="http://mapbox.com/data/osm-planet">AWS snapshot prepared by MapBox.com</a>, <a href="http://www.mapnik.org">Mapnik</a> to render the maps (pre-installed on an <a href="http://mapbox.com/tools">AWS machine image prepared by MapBox</a>), and the Python modules osgeo (for OGR) and PIL.</p>
<p>Here&#8217;s what  I did. This took a lot of trial and error, but in the end the steps were relatively simple.</p>
<p>Setting up the EC2 instance and the OpenStreetMap (OSM) planet data:</p>
<ul>
<li>Start up a new Amazon EC2 Linux instance using the AWS machine image (AMI) prepared by MapBox linked above.</li>
<li>Create Amazon Elastic Block Storage (EBS) volumes for the two data sets (OpenStreetMap and Census TIGER/Line) in the same availability zone as the EC2 instance. If you do it in the AWS console, you&#8217;ll just need to search for the snapshots by ID or name (see the links above).</li>
<li>Attach the two volumes to the running EC2 instance as /dev/sdf (OSM) and /dev/sdg (TIGER).</li>
<li>Log into the EC2 instance with SSH.</li>
<li>Mount the two volumes: mkdir /mnt/osm; mount -t ext3 /dev/sdf /mnt/osm; mkdir /mnt/tiger; mount -t ext3 /dev/sdg /mnt/tiger</li>
<li>Following the MapBox <a href="http://mapbox.com/osm-planet/using-osm-planet-ebs">instructions</a>, attach the OSM data to Postgres, change the Postgres configuration to remove password protection, and restart Postgres.</li>
</ul>
<p>To set up Mapnik, I followed the OpenStreetMap <a href="http://wiki.openstreetmap.org/wiki/Mapnik">wiki</a>, which shows how to reuse their map styling. Most of the steps can be skipped because the data has already been set up in Postgres by MapBox. The steps that remained were:</p>
<ul>
<li>Getting the OSM Mapnik files from their SVN repository.</li>
<li>Downloading some extraneous boundary information.</li>
<li>Creating a new style definition that controls how map features are rendered based on the OSM defaults.</li>
<li>Editing the defaults a) so it actually works, and b) so it looks good at high DPI for printing (increasing font sizes, removing some icons). This took a lot of trial and error since I didn&#8217;t understand what was going on and regenerating a map takes some time.</li>
</ul>
<p>The last step was to write a Python script that invokes Mapnik for each congressional district and generates a high-resolution map image.</p>
<ul>
<li>The Census&#8217;s TIGER/Line cartographic data has a Shapefile-format file for each state containing the congressional districts in the state. The osgeo/OGR Python module can load the file and tell you the latitude/longitude bounds of the congressional district (among other things).</li>
<li>Then the Mapnik Python bindings are used to create a new map with the given size, loading in the OSM street data.</li>
<li>Additional layers are added from the TIGER/Line data for place names (CDPs and county subdivisions if you&#8217;re familiar with Census data), county names and borders, state borders (and shading of other states), and the boundaries of the congressional district itself and shading of other congressional districts.</li>
<li>After rendering the map, which takes ~30 seconds, I used the Python Imaging Library module to add header and footer text with a nice translucent effect.</li>
</ul>
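<p>A minimal sketch of the sizing arithmetic behind &#8220;create a new map with the given size,&#8221; and not the code from the actual script: given a district&#8217;s latitude/longitude bounds and a target print width, it picks pixel dimensions that keep the district&#8217;s shape. It assumes the map is drawn in spherical Mercator (the projection the standard OSM styling uses); the function names, the 8-inch width, and the 300 DPI default are illustrative assumptions.</p>

```python
import math


def mercator_y(lat_deg):
    """Spherical Mercator y coordinate (unit sphere) for a latitude in degrees."""
    lat = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + lat / 2))


def map_pixel_size(west, south, east, north, width_inches=8.0, dpi=300):
    """Return (width_px, height_px) for a map covering the given lon/lat bounds.

    The width is fixed by the print size; the height follows from the
    aspect ratio of the bounding box once projected to Mercator.
    All parameter names and defaults here are hypothetical.
    """
    width_px = int(round(width_inches * dpi))
    dx = math.radians(east - west)                 # projected width
    dy = mercator_y(north) - mercator_y(south)     # projected height
    height_px = int(round(width_px * dy / dx))
    return width_px, height_px


# Example: a one-degree box at mid latitudes comes out taller than wide,
# because Mercator stretches north-south distances away from the equator.
print(map_pixel_size(-78.0, 38.0, -77.0, 39.0))
```

<p>The resulting width and height would be what gets handed to the renderer when the map object is created; rendering the same bounds at a different DPI only changes these two numbers.</p>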
<p>Generating the maps at three resolutions for all of the congressional districts (except districts at-large) took several hours. I let it run overnight. They&#8217;re stored on Amazon S3 (the s3cmd tool is really useful for that).</p>
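<p>As a back-of-the-envelope check on that running time, here is a small sketch. It assumes 435 House districts, 7 of them at-large and skipped, and that every render takes about 30 seconds regardless of resolution. That last part is an assumption: lower resolutions probably render faster, so the real total could well be shorter.</p>

```python
def batch_render_estimate(total_districts=435, at_large=7,
                          resolutions=3, seconds_per_map=30):
    """Return (number_of_renders, estimated_hours) for the whole batch.

    The defaults are assumptions for illustration, not measured values.
    """
    jobs = (total_districts - at_large) * resolutions
    hours = jobs * seconds_per_map / 3600.0
    return jobs, hours


jobs, hours = batch_render_estimate()
print(f"{jobs} renders, roughly {hours:.1f} hours")
```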
<p>There&#8217;s still room for a lot of improvement. After playing with the style instructions I got too much local road detail that in some places just ruins the whole map at low resolution. And in many places the county names aren&#8217;t showing up. Maybe because there&#8217;s too much detail. It&#8217;ll take some more trial and error to fix.</p>
<p>The source code (which includes all of the preparation steps in detail) is posted <a href="http://razor.occams.info/code/repo/?/viz/districtmaps/printablemaps.py">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2010/02/26/printable-congressional-district-maps-behind-the-scenes/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Who&#8217;s been visiting our Assistant Deputy CTO for Open Government?</title>
		<link>http://razor.occams.info/blog/2010/02/02/whos-been-visiting-our-assistant-deputy-cto-for-open-government/</link>
		<comments>http://razor.occams.info/blog/2010/02/02/whos-been-visiting-our-assistant-deputy-cto-for-open-government/#comments</comments>
		<pubDate>Tue, 02 Feb 2010 23:35:12 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=436</guid>
		<description><![CDATA[The White House began publishing its visitor logs &#8212; with sensitive information removed. Honestly, I don&#8217;t really get what the big hubbub is over this information. First, are corrupting influences on the administration really going to stop corrupting because of this? And, a corollary, who exactly is in a position to be reading over the [...]]]></description>
			<content:encoded><![CDATA[<p>The White House began publishing its visitor logs &#8212; with sensitive information removed.</p>
<p>Honestly, I don&#8217;t really get what the big hubbub is over <a href="http://www.whitehouse.gov/briefing-room/disclosures/visitor-records">this information</a>. First, are corrupting influences on the administration really going to stop corrupting because of this? And, a corollary, who exactly is in a position to be reading over the records to make sure nothing bad is going on? Who are these visitors anyway?</p>
<p>But I always enjoy playing with data all the same. To do it with a little levity, I thought I would profile Robynn Sturm&#8217;s visitors. I met Robynn recently and certainly got the feeling that of all people to hold the title of Assistant Deputy Chief Technology Officer for Open Government for the United States of America, she seemed like a good person to hold the job.</p>
<p>Anyway, in September-October 2009, she had 35 visits. I only have the names and can&#8217;t be sure of who they are, but I&#8217;ll do my best to give Google search results that might be reasonable. Text comes from the pages I&#8217;ve linked to.</p>
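<p>For anyone who wants to repeat this kind of lookup, here is a minimal sketch of pulling one staffer&#8217;s visitors out of a visitor-log CSV. The column names (NAMELAST, NAMEFIRST, visitee_namelast, visitee_namefirst) are an assumption about the file layout, and the sample rows are made-up placeholders, not real records.</p>

```python
import csv
import io
from collections import Counter

# Placeholder data in the assumed layout; the visitor names are invented.
SAMPLE = """NAMELAST,NAMEFIRST,APPT_START_DATE,visitee_namelast,visitee_namefirst
DOE,JANE,9/15/2009,Sturm,Robynn
ROE,RICHARD,9/22/2009,Sturm,Robynn
DOE,JANE,10/2/2009,Sturm,Robynn
POE,PAT,10/5/2009,Other,Person
"""


def visitors_of(rows, last, first):
    """Count visits per (last, first) visitor name for one visitee."""
    counts = Counter()
    for row in rows:
        same_last = row["visitee_namelast"].strip().lower() == last.lower()
        same_first = row["visitee_namefirst"].strip().lower() == first.lower()
        if same_last and same_first:
            visitor = (row["NAMELAST"].strip(), row["NAMEFIRST"].strip())
            counts[visitor] += 1
    return counts


rows = csv.DictReader(io.StringIO(SAMPLE))
counts = visitors_of(rows, "Sturm", "Robynn")
for (last, first), n in sorted(counts.items()):
    print(f"{first} {last}: {n} visit(s)")
```

<p>To run it on a real download, replace the in-memory sample with an open file handle. Matching on names alone has the same limitation as the profiles below: a common name could be more than one person.</p>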
<p>Ellen Alberding and Gretchen Sims are the president and education program manager, respectively, of the <a href="http://www.joycefdn.org/">Joyce Foundation</a>, which supports efforts to protect the natural environment of the Great Lakes, to reduce poverty and violence in the region, and to ensure that its people have access to good schools, decent jobs, and a diverse and thriving culture. Sims <a href="http://fundrace.huffingtonpost.com/neighbors.php?type=name&amp;lname=Sims&amp;fname=Gretchen">donated</a> to the Democrats in the last two presidential elections. (I wouldn&#8217;t have mentioned it except that that&#8217;s how I found out where she worked.)</p>
<p>Ethan Batraski (@<a href="http://twitter.com/ethanjb">ethanjb</a>): startup co-founder, mathematician, machine learning researcher, techanista, sharing thoughts on product management, startups, venture funding &amp; semantic web</p>
<p><a href="http://www.politico.com/news/stories/1109/29002.html">Marc Berejka</a> worked in senior government affairs roles at Microsoft, including eight years as a lobbyist for the high-tech giant. Says Politico: &#8220;Opponents of the Obama administration’s position on patent reform say that David Kappos and Marc Berejka, who recently took top jobs in the Commerce Department, are wielding too much influence over a policy that stands to benefit both of their former companies.&#8221;</p>
<p>Lawrence Brandt, a co-editor of <a href="http://www.springerlink.com/content/978-0-387-71610-7">Digital Government</a>, is a program manager within NSF.</p>
<p>Gerard Fiala is the staff director in the Senate HELP committee&#8217;s subcommittee on employment and workplace safety.</p>
<p><a href="http://www.seenajon.com/SeenaJon.com/Home.html">Seena Jon Ghaznavi</a> (<a href="https://twitter.com/SJgood">@sjgood</a>) is a young actor in the movie Death of a President.</p>
<p>Michael Harding &#8211; This name is too popular.</p>
<p>Greg Horowitt and Victor Hwang are co-founders and Managing Directors of <a href="http://www.t2vc.com/team.htm">T2 Venture Capital</a>,  	a venture fund focused on breakthrough technology spinning out of government  	and academia. Horowitt is also Director and Co-Founder of the <a href="http://globalconnect.ucsd.edu/about/bio-ghorowitt.cfm">Global CONNECT</a> program based at the University of California, San Diego, and is a key thought leader in the field of ‘innovation systems’, and their relevant applications for sustainable regional economic development through technology commercialization.</p>
<p>Ester Lee might be the Ester Lee that <a href="http://www.adweek.com/aw/content_display/news/e3if6074d641b5e57bfc13b87ca876a0d41">works for AT&amp;T</a>. But maybe not.</p>
<p>Joseph Mancio &#8211; another somewhat popular name.</p>
<p><a href="http://almostlegally.com/">Dominic Mauro</a> (<a href="http://twitter.com/mynameisdom">@mynameisdom</a>) is a TA for <a href="http://james.grimmelmann.net/courses/internet/">Internet Law</a> at NY Law School (which, for reference, is the school that Beth Noveck comes from; Beth is Robynn&#8217;s boss).</p>
<p>Sara Mirsky is co-president of the American Constitution Society&#8217;s <a href="http://">NYLS chapter</a>. (See the note above about NYLS.)</p>
<p>Courtney Patterson (<a href="http://twitter.com/cnpatterson">@cnpatterson</a>) is an obsessive-compulsive law student in NYC. (We probably know at which school.)</p>
<p>Gina Wells is another common name.</p>
<p>Phillip Wickham is President and CEO of the <a href="http://www.kauffmanfellows.org/s/267/bio2.aspx?pgid=1135">Kauffman Fellows Program</a> at the Center for Venture Education in Palo Alto, CA. The mission of the Kauffman Fellows Program is to develop the next generation of leaders in venture capital.</p>
<p>John Bell is way too common a name.</p>
<p>Pamela Frugoli works for the <a href="http://www.outreach.psu.edu/programs/etli-2009/panelists.htm">Department of Labor</a>.</p>
<p>Daniel Gomez might be a <a href="http://www.kirkland.com/sitecontent.cfm?contentID=220&amp;itemID=9599">lawyer</a>.</p>
<p>Melissa Sperry is too common a name.</p>
<p>Meredith Stewart is another popular name.</p>
<p>Haley van Dyck was a part of the Obama 08 campaign team, according to her <a href="http://www.linkedin.com/pub/marianne-manilov/4/b53/a8b">step-mom&#8217;s LinkedIn page</a> (her step-mom, btw, is proud of her).</p>
<p>Jing Vivatrat is either an <a href="http://jsv-i.com/content/view/14/29/">M&amp;A businesswoman</a> or an <a href="http://www.c-spanvideo.org/person/9265006">FCC director</a>, or both.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2010/02/02/whos-been-visiting-our-assistant-deputy-cto-for-open-government/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>GovTrack Insider</title>
		<link>http://razor.occams.info/blog/2010/01/24/govtrack-insider/</link>
		<comments>http://razor.occams.info/blog/2010/01/24/govtrack-insider/#comments</comments>
		<pubDate>Sun, 24 Jan 2010 23:43:37 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=434</guid>
		<description><![CDATA[Last week I launched GovTrack Insider, a spin-off of GovTrack.us that complements GovTrack with original and syndicated reporting on the U.S. Congress. I&#8217;m sending reporters to congressional committee markups to get as early a peek into the public component of the legislative process as we can. After five years of running GovTrack.us, I saw [...]]]></description>
			<content:encoded><![CDATA[<p>Last week I launched <a href="http://www.govtrackinsider.com">GovTrack Insider</a>, a spin-off of <a href="http://www.govtrack.us">GovTrack.us</a> that complements GovTrack with original and syndicated reporting on the U.S. Congress. I&#8217;m sending reporters to congressional committee markups to get as early a peek into the public component of the legislative process as we can.</p>
<p>After five years of running GovTrack.us, I saw a need to move beyond data. As we&#8217;ve seen over at the <a href="http://www.opencongress.org">OpenCongress.org</a> blog, reporting is an important part of following the legislative process, staying ahead of the game, and understanding the ramifications of procedural events. GovTrackInsider.com puts a focus on grass-roots reporting of the legislative process.</p>
<p>GovTrack Insider has hired a small team of paid freelance reporters to cover congressional committee meetings. This is the earliest point in the legislative process that we can observe directly, and it&#8217;s going to help us get an inside look like we&#8217;ve never had before. We&#8217;ll also be syndicating coverage from OpenCongress and elsewhere. In all of the articles we run, the focus will be on legislation and policy. We&#8217;ll be leaving politics and gossip to the old media.</p>
<p>Insider is an open-access online newspaper, but one with some unusual pages. For articles on many topics you&#8217;ll find on the right side a topic dashboard. There you can connect with other readers in a variety of ways, such as a Q&amp;A system, link submission, and a community notebook (a wiki).</p>
<p>In time we&#8217;ll be integrating Insider more with GovTrack.us so that if you&#8217;re tracking a bill over on GovTrack you&#8217;ll get a notice whenever it&#8217;s mentioned in an article in GovTrack Insider.</p>
<p>I hope you&#8217;ll join us on this experiment. Happy reading.</p>
<p>(GovTrack Insider was built with the help of Josh Sulkin (of FlyOnTime.us fame).)</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2010/01/24/govtrack-insider/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GAH09 Philadelphia Hackathon: New Jersey Gang Survey Viewer</title>
		<link>http://razor.occams.info/blog/2009/12/14/gah09/</link>
		<comments>http://razor.occams.info/blog/2009/12/14/gah09/#comments</comments>
		<pubDate>Tue, 15 Dec 2009 01:40:20 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=427</guid>
		<description><![CDATA[The New Jersey Gang Survey Viewer is a visualization tool for the New Jersey State Police Street Gang Survey 2007. The site was developed by five volunteers over this past weekend, with the help of New Jersey State Police analysts. Our goal was to elevate public knowledge about street gang presence in New Jersey, USA, [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://njgangsurvey.civicimpulse.com/">New Jersey Gang Survey Viewer</a> is a visualization tool for the New Jersey State Police Street Gang Survey 2007. The site was developed by five volunteers over this past weekend, with the help of New Jersey State Police analysts. Our goal was to elevate public knowledge about street gang presence in New Jersey, USA, based on the NJSP&#8217;s 2007 survey of the 560+ municipalities in the state. The NJSP analysts approached me shortly before our <a href="http://www.sunlightlabs.com/hackathon09/">Great American Hackathon</a> meet-up in Philadelphia was to occur, and our group eagerly took on this project as a great way to work with a government data provider on an app that we think will raise public awareness in an important area.</p>
<p style="text-align: center;"><a href="http://razor.occams.info/blogimages/njgangsurvey1.png"><img class="aligncenter" title="NJ Gang Survey Viewer Screenshot" src="http://razor.occams.info/blogimages/njgangsurvey1.png" alt="" width="400" height="311" /></a><span id="more-427"></span></p>
<p>NJSP analyst Dean Baratta, who along with his boss joined us in Philadelphia on the first day of our work, <a href="http://iago18335.wordpress.com/2009/12/14/more-visibility-of-gangs-in-new-jersey/">has this to say today</a>:</p>
<blockquote><p>The amount of work these guys did in two days is really incredible given the shape of the underlying data.  It was fine for researchers and academics but not suitable at all for the general public. . . . Not long ago, I mentioned on another site that the democratization of information and the ability of people to collaborate who (in the pre-internet days) would not even known of each others existence was one of the global issues which makes me feel optimistic.  This was the type of work that I was thinking of when I wrote that. Check out the work these guys did and then start asking your local and state law enforcement agencies why they don’t make information like this available to the public.</p></blockquote>
<p>This is a work in progress. We accomplished all we could in two days. This included a large amount of work learning about the survey, normalizing the survey data, linking it to geospatial data, and learning Django and GIS web technology so we could create a mapping website rapidly. The site displays gang statistics for the municipalities in New Jersey and maps of the locations of the major gangs present in the state.</p>
<p>The NJ Gang Survey Viewer was created by volunteer civic hackers Nick Cazoneri (<a href="http://www.dvrpc.org/">Delaware Valley Regional Planning Commission</a>), Robert Cheetham (<a href="http://avencia.com/">Avencia</a>), Don Coleman (<a href="http://chariotsolutions.com/">Chariot Solutions</a>), David Middlecamp (<a href="http://avencia.com/">Avencia</a>), and Joshua Tauberer (<a href="http://www.civicimpulse.com/">Civic Impulse</a>), and from the <a href="http://www.njsp.org/">New Jersey State Police</a> Dean Baratta, intelligence analyst, and Peter Lynch.  We were hosted for the weekend in the <a href="http://avencia.com/">Avencia</a> office in downtown Philadelphia. Here we are (besides me behind the camera and David who came a bit later):</p>
<p style="text-align: center;"><a href="http://razor.occams.info/blogimages/njgangsurvey2.png"><img class="aligncenter" title="The NJ Gang Survey Viewer Team" src="http://razor.occams.info/blogimages/njgangsurvey2.png" alt="" width="400" height="300" /></a></p>
<p><a href="http://razor.occams.info/blogimages/njgangsurvey3.png"><img class="aligncenter" title="The NJ Gang Survey Viewer Team In Action" src="http://razor.occams.info/blogimages/njgangsurvey3.png" alt="" width="400" height="300" /></a></p>
<p>We mostly didn&#8217;t know each other before the weekend, but we all got along great. Our weekend work was the Philadelphia version of the <a href="http://www.sunlightlabs.com/hackathon09/">Great American Hackathon 2009</a>, a network of similar civic technology meet-ups held throughout the nation and loosely organized by <a href="http://www.sunlightfoundation.com/">Sunlight Foundation</a>, a DC-based government transparency nonprofit foundation.</p>
<p>David, one of our team members, says:</p>
<blockquote><p>I enjoyed getting to know the group and working on something that normally doesn&#8217;t get the technical attention that it should.  Learning a new platform and contributing to the greater good, a fun weekend.  :)</p></blockquote>
<p>If you are interested in working with New Jersey gang and crime data, you can contact me. The NJSP would like to continue this effort.</p>
<p>This gang survey data is relevant to a wide range of New Jersey residents and visitors to the state. The current perception of ‘gang threat’ is frequently one that is primarily urban and particularly violent. This has implications for both government and society at large. The belief that gangs are someone else’s problem — and someone else’s tax burden — could potentially reduce public support for anti-gang initiatives that go beyond an initial impulse to “lock ’em all up.” Gangs are reported present in dozens of rural and suburban municipalities throughout the state. Almost seven out of every ten New Jerseyans live in a municipality where gangs can be found. Clearly, gangs cannot be considered an exclusively urban phenomenon in any part of New Jersey.</p>
<p>Our work this weekend was a great experience and is proof that technologists are willing and able to work with government agencies on a volunteer basis to develop civic applications that the agencies don&#8217;t have the mandate or funding for themselves.</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2009/12/14/gah09/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Open Government Directive Evaluation on Principles</title>
		<link>http://razor.occams.info/blog/2009/12/09/open-government-directive-evaluation-on-principles/</link>
		<comments>http://razor.occams.info/blog/2009/12/09/open-government-directive-evaluation-on-principles/#comments</comments>
		<pubDate>Wed, 09 Dec 2009 12:44:03 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=425</guid>
		<description><![CDATA[Yesterday the White House&#8217;s chief technologists unveiled the Open Government Directive (OGD). The OGD mainly covers two aspects of government transparency: using technology as a tool for data sharing and public participation in agency decision-making. We&#8217;ve seen the start of a culture shift this year in the executive branch, in parallel with actual progress throughout [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday the White House&#8217;s chief technologists unveiled the <a href="http://www.whitehouse.gov/open">Open Government Directive</a> (OGD). The OGD mainly covers two aspects of government transparency: using technology as a tool for data sharing and public participation in agency decision-making. We&#8217;ve seen the start of a culture shift this year in the executive branch, in parallel with actual progress throughout the country, and now the OGD outlines and codifies a vision for the next four months and, well, beyond.</p>
<p>Last week I <a href="http://razor.occams.info/blog/2009/12/01/congressional-disbursements/">reviewed</a> the House&#8217;s Statement of Disbursement electronic document along the dimensions of <a href="http://razor.occams.info/pubdocs/opendataciviccapital.html">open government data</a> (which unfortunately has the same abbreviation). The OGD talks about how agencies should go about the process of opening data. Here is a review of what the OGD says, organized by open data principle.</p>
<p>I&#8217;ll put the conclusion up front: The OGD addresses nearly all of the open government data principles that have been put forward, and even adds two of its own: being pro-active about data release and creating accountability by designating an official responsible for data quality (more on these at the end). So from this perspective, the OGD is pretty spot-on. It is very strong in public input, public review, and interagency coordination, which are normally the weakest spots of government data (but, on the other hand, this isn&#8217;t data, this is a goal, so the proof will be in the pudding). It could have been stronger in the areas of machine processability &#038; promoting analysis, and explaining what is appropriate for data licensing (ideally, none).</p>
<p>Here are the details:</p>
<p>Information is not meaningfully public if it is not available <strong>on the Internet for free</strong>.</p>
<p>The OGD says &#8220;each agency shall take prompt steps to expand access to information by making it available online in open formats.&#8221; The OGD itself doesn&#8217;t say free, but executive branch policy already requires that public information not be sold to the public at more than the marginal cost of distribution &#8212; which is about as good as one might expect. So we&#8217;ll count this principle as asserted by the OGD.</p>
<p>   &#8212; GOOD</p>
<p><strong>Data Should Be Primary</strong>. Primary data is data as collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.</p>
<p>In the OGD&#8217;s appendix where it outlines further details for agencies, it says agencies should release data &#8220;as granular as possible&#8221;.</p>
<p>   &#8212; GOOD</p>
<p><strong>Timely</strong>.</p>
<p>The OGD says, &#8220;Timely publication of information is an essential component of transparency. Delays should not be viewed as an inevitable and insurmountable consequence of high demand.&#8221;</p>
<p>   &#8212; GOOD</p>
<p><strong>Accessible</strong>. Data are available to the widest range of users for the widest range of purposes, which means using an open standard and providing a bulk download and documentation.</p>
<p><strong>Machine processable</strong>: Data are reasonably structured to allow automated processing.</p>
<p>The OGD specifically defines &#8220;open format&#8221; &#8212; which is the subject of the directive &#8212; as something that is platform independent and machine readable. Now, here the OGD slips a little because it redefines &#8220;open&#8221; but actually leaves out open standards. I don&#8217;t think that was intentional, so we&#8217;ll give the OGD credit for mentioning open standards even though it didn&#8217;t exactly. It mentions &#8220;downloadable&#8221; but not in bulk, and there is no mention of documentation in the OGD. We can&#8217;t tell what the OGD meant by &#8220;machine readable&#8221; &#8212; I now think of this term as a sloppy form of &#8220;machine processable&#8221;. It would have helped if the OGD specifically noted that the point is to support analysis and reuse of the data.</p>
<p>I used to use &#8220;machine readable&#8221; until someone corrected me that really any format can be read by a machine. The question is what the machine can do with it: to what degree can the data be meaningfully processed by a machine? So now I use machine-processable.</p>
<p>&#8211; WEAK</p>
<p><strong>Non-discriminatory</strong>: Data are available to anyone, with no requirement of registration.<br />
<strong>Non-proprietary</strong>: Data are available in a format over which no entity has exclusive control.<br />
<strong>License-free</strong>. Dissemination of the data is not limited by intellectual property law or other terms.</p>
<p>The OGD says data must be &#8220;made available to the public without restrictions that would impede the re-use of that information.&#8221; Here we could have really benefited from some simple but concrete guidance.</p>
<p>   &#8212; WEAK</p>
<p><strong>Promote analysis</strong>: Data published by the government should be in formats and approaches that promote analysis and reuse of that data.</p>
<p>There is a sense in which this is implicit in the OGD, but maybe it is the goggles through which I read it. The OGD fails to say explicitly that analysis is the whole point of open government data.</p>
<p>   &#8212; FAIL</p>
<p><strong>Public input</strong>: The public is in the best position to determine what information technologies will be best suited for the applications the public intends to create for itself.</p>
<p><strong>Public review</strong>: There should be a means for the public to interact with the data publisher during and after the data has been made. The public may have questions or may find errors. The process of creating the data should also be transparent.</p>
<p>These principles are perhaps the least commonly addressed, and yet they are among the most prominent aspects of the OGD. The OGD requires agencies to allow the public to give feedback on data quality, data prioritization, and other aspects of the agency&#8217;s OGD plan. In fact, the OGD says, &#8220;Each agency shall respond to public input received on its Open Government Webpage on a regular basis.&#8221;</p>
<p>In addition, the OGD will form a working group (described next) that will discuss &#8220;ideas to promote participation and collaboration, including how to &#8230; take advantage of the expertise and insight of people both inside and outside the Federal Government, and form high-impact collaborations with researchers, the private sector, and civil society.&#8221;</p>
<p>In the appendix where it outlines further goals for agencies, the OGD says, &#8220;Your agency should also identify key audiences for its information and their needs, and endeavor to publish high-value information for each of those audiences in the most accessible forms and formats.&#8221;</p>
<p>   &#8212; EXCELLENT</p>
<p><strong>Interagency coordination</strong>: Interoperability makes data more valuable by making it easier to derive new uses from combinations of data. To the extent two data sets refer to the same kinds of things, the creators of the data sets should strive to make them interoperable.</p>
<p>The OGD will establish a working group led by the Deputy Director for Management at OMB, the Federal Chief Information Officer, and the Federal Chief Technology Officer to provide a forum to share best practices for data collection, aggregation, validation, and dissemination throughout the government, to coordinate implementations of federal spending transparency, and to provide a forum for sharing best practices for participation.</p>
<p>   &#8212; EXCELLENT</p>
<p><strong>Provenance and trust:</strong> Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.</p>
<p><strong>Permanent Web Address</strong>: The file should have a stable location.</p>
<p><strong>Safe file formats</strong>: Government bodies publishing data online should always seek to publish using data formats that do not include executable content.</p>
<p><strong>Globally Unique Identifiers</strong>: This concept, important on the world wide web, is that any document, resource, data record, or entity mentioned in a database, or some might say every paragraph in a document, should have a unique identification that others can use to point to or cite it elsewhere.</p>
<p><strong>Linked Open Data</strong>: This is a method for publishing databases in a standard format for interconnectivity with other databases without the expense of wide agreement on unified inter-agency or global data standards.</p>
<p>These get into some of the more precise details of data format. I might have liked to see provenance &#038; trust addressed, but I am not sure whether I would really expect these principles to be included in a high-level 120-day plan, at least not at this point. So their absence is not something I hold against the OGD. Still:</p>
<p>   &#8212; FAIL</p>
<p><strong>Other Notes</strong></p>
<p>The OGD talks about being <strong>proactive</strong> with data release.</p>
<p>The OGD also adds <strong>accountability</strong>: &#8220;Each agency &#8230; shall designate a high-level senior official to be accountable for the quality and objectivity of, and internal controls over, the Federal spending information publicly disseminated.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2009/12/09/open-government-directive-evaluation-on-principles/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Congressional Disbursements Data Release Evaluation</title>
		<link>http://razor.occams.info/blog/2009/12/01/congressional-disbursements/</link>
		<comments>http://razor.occams.info/blog/2009/12/01/congressional-disbursements/#comments</comments>
		<pubDate>Tue, 01 Dec 2009 16:45:52 +0000</pubDate>
		<dc:creator>Joshua Tauberer</dc:creator>
				<category><![CDATA[Civic Hacking]]></category>

		<guid isPermaLink="false">http://razor.occams.info/blog/?p=419</guid>
		<description><![CDATA[This week the House of Representatives began publishing disbursements online. Disbursements include how much congressmen and their staffs are paid, what kinds of expenses they have, and who they are paying for these services. This is a really great case study in how to do transparency. There are a lot of wins here and several [...]]]></description>
			<content:encoded><![CDATA[<p>This week the House of Representatives began publishing <a href="http://disbursements.house.gov">disbursements online</a>. Disbursements include how much congressmen and their staffs are paid, what kinds of expenses they have, and who they are paying for these services. This is a really great case study in how to do transparency. There are a lot of wins here and several points to learn from.</p>
<p>The best thing I see so far is the documentation provided on disbursements.house.gov. There is a nice explanation of the reporting process, a FAQ, and a glossary. There is also a table of transaction codes found in the document, and these all are crucial for anyone reading or analyzing the information. This is one of the best examples of documentation I&#8217;ve seen for government data of this kind.</p>
<p>Here&#8217;s an evaluation of the disclosure based on standards I&#8217;ve drawn from others and outlined <a href="http://razor.occams.info/pubdocs/opendataciviccapital.html">here</a>.</p>
<p>In summary, many of the goals are met. The important ones not met are machine processability, public input, and public review. Machine processability is a very important one and the fact that this goal was not met seriously undermines much of the reason for publishing the information in the first place.</p>
<p>Information is not meaningfully public if it is not available <strong>on the Internet for free</strong>.</p>
<p>   &#8212; GOAL ACHIEVED.</p>
<p><strong>Data Should Be Primary</strong>. Primary data is data as collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.</p>
<p>We can evaluate this because the documentation actually describes how the House Clerk receives the information. It talks about some degree of aggregation taking place, such as a $10,000 travel record not necessarily being for one trip. But by and large we&#8217;re seeing the level of detail that I think is expected.</p>
<p>&#8211; GOAL ACHIEVED</p>
<p><strong>Timely</strong>.</p>
<p>The SOD is published quarterly, and hopefully this will include the online/electronic version going forward. I don&#8217;t think we can expect much more than that here. Ten years from now perhaps I would like to see real-time expense reporting, but not today.</p>
<p>&#8211; TO BE SEEN (if the electronic version is published as often as the print version in the future)</p>
<p><strong>Accessible</strong>. Data are available to the widest range of users for the widest range of purposes.</p>
<p>This goal from the &#8220;8 Principles&#8221; refers to:</p>
<p>Data Format: PDF is an open standard, and therefore a good choice among the data formats for the purpose of publishing a document. (See more below.)</p>
<p>Bulk Data: The 3,400-page document is provided in a single PDF file, rather than hundreds or thousands of separate downloads. This satisfies the goal of bulk data.</p>
<p>Documentation: Documentation is excellent. There is an explanation of the reporting process, a FAQ, a table of codes, and a glossary.</p>
<p>&#8211; GOAL ACHIEVED</p>
<p><strong>Machine processable</strong>: Data are reasonably structured to allow automated processing.</p>
<p>This is the first goal which is not addressed at all by the data release. While PDF is good for documents, it is bad for tabular information. It does not support sorting, transforming, or other analysis, and it only marginally supports search. A spreadsheet format of any sort would be useful here, some formats better than others.</p>
<p>Considering the size of this data set, without the help of computers to process this information it is far less useful than it could be. To be given a barely-searchable 3,000-page file is only a small step up from being mailed several reams of paper.</p>
<p>Furthermore, there is no indication on the disbursements website that this will be considered in the future.</p>
<p>&#8211; GOAL NOT MET</p>
<p><strong>Non-discriminatory</strong>: Data are available to anyone, with no requirement of registration.<br />
<strong>Non-proprietary</strong>: Data are available in a format over which no entity has exclusive control.<br />
<strong>License-free</strong>. Dissemination of the data is not limited by intellectual property law or other terms.</p>
<p>&#8211; GOALS ACHIEVED</p>
<p><strong>Promote analysis</strong>: Data published by the government should be in formats and approaches that promote analysis and reuse of that data.</p>
<p>This goal (and the next two) comes from the Association for Computing Machinery&#8217;s Recommendation on Open Government and is similar to the Machine Processable goal above. So, see above.</p>
<p>&#8211; GOAL NOT MET</p>
<p><strong>Safe file formats</strong>: Government bodies publishing data online should always seek to publish using data formats that do not include executable content.</p>
<p>PDF is, relatively speaking, a safe file format.</p>
<p>&#8211; GOAL ACHIEVED</p>
<p><strong>Provenance and trust:</strong> Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.</p>
<p>According to the disbursements website, the files are digitally signed. I haven&#8217;t verified that the signature process was done correctly.</p>
<p>&#8211; GOAL ACHIEVED</p>
<p><strong>Public input</strong>: The public is in the best position to determine what information technologies will be best suited for the applications the public intends to create for itself.</p>
<p>I am sure members of our community have been in touch with the Clerk&#8217;s office. However, there was no public discussion on how these files ought to have been made available, and therefore I am not going to count this goal as having been met.</p>
<p>&#8211; GOAL NOT MET</p>
<p><strong>Public review</strong>: There should be a means for the public to interact with the data publisher during and after the data has been made. The public may have questions or may find errors. The process of creating the data should also be transparent.</p>
<p>This is a goal rarely given any attention. The documentation gives significant insight into this process. But there is no contact person for this data set that is made known to the public.</p>
<p>&#8211; GOAL NOT MET</p>
<p><strong>Interagency coordination</strong>: Interoperability makes data more valuable by making it easier to derive new uses from combinations of data. To the extent two data sets refer to the same kinds of things, the creators of the data sets should strive to make them interoperable.</p>
<p>There is a potential to link the names of Members of Congress to their ID numbers provided in, say, the XML voting records.</p>
<p>&#8211; GOAL NOT MET</p>
<p><strong>Permanent Web Address</strong>: The file should have a stable location.</p>
<p>&#8211; GOAL ACHIEVED (provided it is kept there)</p>
<p><strong>Globally Unique Identifiers</strong>: This concept, important on the world wide web, is that any document, resource, data record, or entity mentioned in a database, or some might say every paragraph in a document, should have a unique identification that others can use to point to or cite it elsewhere.</p>
<p>&#8211; GOAL NOT MET</p>
<p><strong>Linked Open Data</strong>: This is a method for publishing databases in a standard format for interconnectivity with other databases without the expense of wide agreement on unified inter-agency or global data standards.</p>
<p>&#8211; GOAL NOT MET</p>
]]></content:encoded>
			<wfw:commentRss>http://razor.occams.info/blog/2009/12/01/congressional-disbursements/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

