Archive for the ‘Civic Hacking’ Category

Posts related to my site GovTrack.us, which tracks the U.S. Congress, and related issues in the world of civics, technology, and open government.

Printable Congressional District Maps: Behind The Scenes

Friday, February 26th, 2010

Today I’m releasing print-quality maps of congressional districts, with street-level detail and county border lines. Judging by the emails I’ve received over the last four years or so, this has been one of the most sought-after resources, and I don’t think you can find it anywhere else. (At least not comprehensively for the whole nation. Local state clerk’s offices may have them. NationalAtlas.gov has maps but not with very much detail.)

This was a solid 2-day project with less than 300 lines of code and it’s something that only recently became this easy to do. I used Amazon Web Services (AWS), Census TIGER/Line cartographic data in an AWS public data snapshot, OpenStreetMap for the street detail in an AWS snapshot prepared by MapBox.com, Mapnik to render the maps (pre-installed on an AWS machine image prepared by MapBox), and the Python modules osgeo (for OGR) and PIL.

Here’s what I did. This took a lot of trial and error, but in the end the steps were relatively simple.

Setting up the EC2 instance and the OpenStreetMap (OSM) planet data:

  • Start up a new Amazon EC2 Linux instance using the AWS machine image (AMI) prepared by MapBox linked above.
  • Create Amazon Elastic Block Storage (EBS) volumes for the two data sets (OpenStreetMap and Census TIGER/Line) in the same availability zone as the EC2 instance. If you do it in the AWS console, you’ll just need to search for the snapshots by ID or name (see the links above).
  • Attach the two volumes to the running EC2 instance as /dev/sdf (OSM) and /dev/sdg (TIGER).
  • Log into the EC2 instance with SSH.
  • Mount the two volumes: mkdir /mnt/osm; mount -t ext3 /dev/sdf /mnt/osm; mkdir /mnt/tiger; mount -t ext3 /dev/sdg /mnt/tiger
  • Following the MapBox instructions, attach the OSM data to Postgres, change the Postgres configuration to remove password protection, and restart Postgres.

To set up Mapnik, I followed the OpenStreetMap wiki which shows how to reuse their map styling. Most of the steps can be skipped because the data has already been set up in Postgres by MapBox. That involved:

  • Getting the OSM Mapnik files from their SVN repository.
  • Downloading some extraneous boundary information.
  • Creating a new style definition, based on the OSM defaults, that controls how map features are rendered.
  • Editing the defaults a) so it actually works, and b) so it looks good at high DPI for printing (increasing font sizes, removing some icons). This took a lot of trial and error since I didn’t understand what was going on and regenerating a map takes some time.

The last step was to write a Python script that invokes Mapnik for each congressional district and generates a high-resolution map image.

  • The Census’s TIGER/Line cartographic data has a Shapefile-format file for each state containing the congressional districts in the state. The osgeo/OGR Python module can load the file and tell you the latitude/longitude bounds of the congressional district (among other things).
  • Then the Mapnik Python bindings are used to create a new map with the given size, loading in the OSM street data.
  • Additional layers are added from the TIGER/Line data for place names (CDPs and county subdivisions if you’re familiar with Census data), county names and borders, state borders (and shading of other states), and the boundaries of the congressional district itself and shading of other congressional districts.
  • After rendering the map, which takes ~30 seconds, I used the Python Imaging Library module to add header and footer text with a nice translucent effect.
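
As a rough illustration of the first two steps, here is a minimal sketch of getting each district’s bounds out of a Shapefile with OGR and turning them into a pixel size for the map. This is not the posted source code. The function names, the padding fraction, and the paper size and DPI values are all assumptions made for the example, and the aspect-ratio correction is a simple cosine-of-latitude approximation rather than a proper projection into the map’s coordinate system.

```python
import math


def district_envelopes(shapefile_path):
    """Yield (attributes, (min_lon, min_lat, max_lon, max_lat)) for each feature."""
    from osgeo import ogr  # imported lazily; needs the GDAL/OGR Python bindings

    datasource = ogr.Open(shapefile_path)
    if datasource is None:
        raise IOError("could not open %s" % shapefile_path)
    layer = datasource.GetLayer(0)
    for feature in layer:
        # OGR reports an envelope as (min_x, max_x, min_y, max_y).
        min_x, max_x, min_y, max_y = feature.GetGeometryRef().GetEnvelope()
        yield feature.items(), (min_x, min_y, max_x, max_y)


def pad_bounds(bounds, fraction=0.05):
    """Grow the bounds on every side so the district does not touch the map edge."""
    min_lon, min_lat, max_lon, max_lat = bounds
    dx = (max_lon - min_lon) * fraction
    dy = (max_lat - min_lat) * fraction
    return (min_lon - dx, min_lat - dy, max_lon + dx, max_lat + dy)


def image_size(bounds, long_side_inches, dpi):
    """Return (width, height) in pixels, with the longer side long_side_inches at dpi."""
    min_lon, min_lat, max_lon, max_lat = bounds
    # A degree of longitude covers less ground away from the equator, so scale
    # the width by cos(latitude). This is an approximation, not a projection.
    mid_lat = math.radians((min_lat + max_lat) / 2.0)
    width_units = (max_lon - min_lon) * math.cos(mid_lat)
    height_units = max_lat - min_lat
    long_px = int(round(long_side_inches * dpi))
    if width_units >= height_units:
        return long_px, max(1, int(round(long_px * height_units / width_units)))
    return max(1, int(round(long_px * width_units / height_units))), long_px
```

The padded bounds and pixel size would then be handed to the renderer, once per district and once per output resolution.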

Generating the maps at three resolutions for all of the congressional districts (except districts at-large) took several hours. I let it run overnight. They’re stored on Amazon S3 (the s3cmd tool is really useful for that).

There’s still room for a lot of improvement. After playing with the style instructions I got too much local road detail that in some places just ruins the whole map at low resolution. And in many places the county names aren’t showing up. Maybe because there’s too much detail. It’ll take some more trial and error to fix.

The source code (which includes all of the preparation steps in detail) is posted here.

Who’s been visiting our Assistant Deputy CTO for Open Government?

Tuesday, February 2nd, 2010

The White House began publishing its visitor logs — with sensitive information removed.

Honestly, I don’t really get what the big hubbub is over this information. First, are corrupting influences on the administration really going to stop corrupting because of this? And, a corollary, who exactly is in a position to be reading over the records to make sure nothing bad is going on? Who are these visitors anyway?

But I always enjoy playing with data all the same. To do it with a little levity, I thought I would profile Robynn Sturm’s visitors. I met Robynn recently and certainly got the feeling that of all people to hold the title of Assistant Deputy Chief Technology Officer for Open Government for the United States of America, she seemed like a good person to hold the job.

Anyway, in September-October 2009, she had 35 visits. I only have the names and can’t be sure of who they are, but I’ll do my best to give Google search results that might be reasonable. Text comes from the pages I’ve linked to.

Ellen Alberding and Gretchen Sims are the president and education program manager, respectively, of the Joyce Foundation, which supports efforts to protect the natural environment of the Great Lakes, to reduce poverty and violence in the region, and to ensure that its people have access to good schools, decent jobs, and a diverse and thriving culture. Sims donated to the Democrats in the last two presidential elections. (I wouldn’t have mentioned it except that that’s how I found out where she worked.)

Ethan Batraski (@ethanjb): startup co-founder, mathematician, machine learning researcher, techanista, sharing thoughts on product management, startups, venture funding & semantic web

Marc Berejka worked in senior government affairs roles at Microsoft, including eight years as a lobbyist for the high-tech giant. Says Politico: “Opponents of the Obama administration’s position on patent reform say that David Kappos and Marc Berejka, who recently took top jobs in the Commerce Department, are wielding too much influence over a policy that stands to benefit both of their former companies.”

Lawrence Brandt, a co-editor of Digital Government, is a program manager within NSF.

Gerard Fiala is the staff director in the Senate HELP committee’s subcommittee on employment and workplace safety.

Seena Jon Ghaznavi (@sjgood) is a young actor in the movie Death of a President.

Michael Harding – This name is too popular.

Greg Horowitt and Victor Hwang are co-founders and Managing Directors of T2 Venture Capital, a venture fund focused on breakthrough technology spinning out of government and academia. Horowitt is also Director and Co-Founder of the Global CONNECT program based at the University of California, San Diego, and is a key thought leader in the field of ‘innovation systems’, and their relevant applications for sustainable regional economic development through technology commercialization.

Ester Lee might be the Ester Lee that works for AT&T. But maybe not.

Joseph Mancio – another somewhat popular name.

Dominic Mauro (@mynameisdom) is a TA for Internet Law at NY Law School (which, for reference, is the school that Beth Noveck comes from; Beth is Robynn’s boss).

Sara Mirsky is the American Constitution Society’s NYLS chapter’s co-president. (See the note above about NYLS.)

Courtney Patterson (@cnpatterson) is an obsessive-compulsive law student in NYC. (We probably know at which school.)

Gina Wells is another common name.

Phillip Wickham is President and CEO of the Kauffman Fellows Program at the Center for Venture Education in Palo Alto, CA. The mission of the Kauffman Fellows Program is to develop the next generation of leaders in venture capital.

John Bell is way too common a name.

Pamela Frugoli works for the Department of Labor.

Daniel Gomez might be a lawyer.

Melissa Sperry is too common a name.

Meredith Stewart is another popular name.

Haley van Dyck was a part of the Obama 08 campaign team, according to her step-mom’s LinkedIn page, who, btw, is proud of her.

Jing Vivatrat is either an M&A businesswoman or an FCC director, or both.

GovTrack Insider

Sunday, January 24th, 2010

Last week I launched GovTrack Insider, a spin-off of GovTrack.us that complements GovTrack with original and syndicated reporting of the U.S. Congress. I’m sending reporters to congressional committee markups to get as early a peek into the public component of the legislative process as we can.

After five years of running GovTrack.us, I saw a need to move beyond data. As we’ve seen over at the OpenCongress.org blog, reporting is an important part of following the legislative process, staying ahead of the game, and understanding the ramifications of procedural events. GovTrackInsider.com puts a focus on grass-roots reporting of the legislative process.

GovTrack Insider has hired a small team of paid freelance reporters to cover congressional committee meetings. This is the earliest point in the legislative process that we can observe directly, and it’s going to help us get an inside look like we’ve never had before. We’ll also be syndicating coverage from OpenCongress and elsewhere. In all of the articles we run, the focus will be on legislation and policy. We’ll be leaving politics and gossip to the old media.

Insider is an open-access online newspaper, but one with some unusual pages. For articles on many topics you’ll find on the right side a topic dashboard. There you can connect with other readers in a variety of ways, such as a Q&A system, link submission, and a community notebook (a wiki).

In time we’ll be integrating Insider more with GovTrack.us so that if you’re tracking a bill over on GovTrack you’ll get a notice whenever it’s mentioned in an article in GovTrack Insider.

I hope you’ll join us on this experiment. Happy reading.

(GovTrack Insider was built with the help of Josh Sulkin (of FlyOnTime.us fame).)

GAH09 Philadelphia Hackathon: New Jersey Gang Survey Viewer

Monday, December 14th, 2009

The New Jersey Gang Survey Viewer is a visualization tool for the New Jersey State Police Street Gang Survey 2007. The site was developed by five volunteers over this past weekend, with the help of New Jersey State Police analysts. Our goal was to elevate public knowledge about street gang presence in New Jersey, USA, based on the NJSP’s 2007 survey of the 560+ municipalities in the state. The NJSP analysts approached me shortly before our Great American Hackathon meet-up in Philadelphia was to occur, and our group eagerly saw this project as a great way to work with a government data provider on an app that we think will elevate public knowledge in an important area.


Open Government Directive Evaluation on Principles

Wednesday, December 9th, 2009

Yesterday the White House’s chief technologists unveiled the Open Government Directive (OGD). The OGD mainly covers two aspects of government transparency: using technology as a tool for data sharing and public participation in agency decision-making. We’ve seen the start of a culture shift this year in the executive branch, in parallel with actual progress throughout the country, and now the OGD outlines and codifies a vision for the next four months and, well, beyond.

Last week I reviewed the House’s Statement of Disbursement electronic document along the dimensions of open government data (which unfortunately has the same abbreviation). The OGD talks about how agencies should go about the process of opening data. Here is a review of what the OGD says, organized by open data principle.

I’ll put the conclusion up front: The OGD addresses nearly all of the open government data principles that have been put forward, and even adds two of its own: being pro-active about data release and creating accountability by designating an official responsible for data quality (more on these at the end). So from this perspective, the OGD is pretty spot-on. It is very strong in public input, public review, and interagency coordination, which are normally the weakest spots of government data (but, on the other hand, this isn’t data, this is a goal, so the proof will be in the pudding). It could have been stronger in the areas of machine processability & promoting analysis, and explaining what is appropriate for data licensing (ideally, none).

Here are the details:

Information is not meaningfully public if it is not available on the Internet for free.

The OGD says “each agency shall take prompt steps to expand access to information by making it available online in open formats.” The OGD itself doesn’t say free, but executive branch policy already requires that public information not be sold to the public at more than the marginal cost of distribution — which is about as good as one might expect. So we’ll count this principle as asserted by the OGD.

— GOOD

Data Should Be Primary. Primary data is data as collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.

In the OGD’s appendix where it outlines further details for agencies, it says agencies should release data “as granular as possible”.

— GOOD

Timely.

The OGD says, “Timely publication of information is an essential component of transparency. Delays should not be viewed as an inevitable and insurmountable consequence of high demand.”

— GOOD

Accessible. Data are available to the widest range of users for the widest range of purposes, meaning use an open standard, with a bulk download, and with documentation.

Machine processable: Data are reasonably structured to allow automated processing.

The OGD specifically defines “open format”, which is the subject of the directive, as something that is platform independent and machine readable. Now, here the OGD slips a little because it redefines “open” but actually leaves out open standards. I don’t think that was intentional, so we’ll give the OGD credit for mentioning open standards even though it didn’t exactly. It mentions “downloadable” but not in bulk, and there is no mention of documentation in the OGD. We can’t tell what the OGD meant by “machine readable”; I now think of this term as a sloppy form of “machine processable”. It would have helped if the OGD specifically noted that the point is to support analysis and reuse of the data.

I used to use “machine readable” until someone corrected me that really any format can be read by a machine. The question is what the machine can do with it: to what degree can the data be meaningfully processed by a machine? So now I use machine-processable.

– WEAK

Non-discriminatory: Data are available to anyone, with no requirement of registration.
Non-proprietary: Data are available in a format over which no entity has exclusive control.
License-free. Dissemination of the data is not limited by intellectual property law or other terms.

The OGD says data must be “made available to the public without restrictions that would impede the re-use of that information.” Here we could have really benefited from some simple but concrete guidance.

— WEAK

Promote analysis: Data published by the government should be in formats and approaches that promote analysis and reuse of that data.

There is a sense in which this is implicit in the OGD, but maybe it is the goggles through which I read it. The OGD fails to say explicitly that analysis is the whole point of open government data.

— FAIL

Public input: The public is in the best position to determine what information technologies will be best suited for the applications the public intends to create for itself.

Public review: There should be a means for the public to interact with the data publisher during and after the data has been made. The public may have questions or may find errors. The process of creating the data should also be transparent.

These principles are perhaps the least commonly addressed, and yet they are among the most prominent aspects of the OGD. The OGD requires agencies to allow the public to give feedback on data quality, data prioritization, and other aspects of the agency’s OGD plan. In fact, the OGD says, “Each agency shall respond to public input received on its Open Government Webpage on a regular basis.”

In addition, the OGD will form a working group (described next) that will discuss “ideas to promote participation and collaboration, including how to … take advantage of the expertise and insight of people both inside and outside the Federal Government, and form high-impact collaborations with researchers, the private sector, and civil society.”

In the appendix where it outlines further goals for agencies, the OGD says, “Your agency should also identify key audiences for its information and their needs, and endeavor to publish high-value information for each of those audiences in the most accessible forms and formats.”

— EXCELLENT

Interagency coordination: Interoperability makes data more valuable by making it easier to derive new uses from combinations of data. To the extent two data sets refer to the same kinds of things, the creators of the data sets should strive to make them interoperable.

The OGD will establish a working group led by the Deputy Director for Management at OMB, the Federal Chief Information Officer, and the Federal Chief Technology Officer to provide a forum to share best practices for data collection, aggregation, validation, and dissemination throughout the government, to coordinate implementations of federal spending transparency, and to provide a forum for sharing best practices for participation.

— EXCELLENT

Provenance and trust: Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.

Permanent Web Address: The file should have a stable location.

Safe file formats: Government bodies publishing data online should always seek to publish using data formats that do not include executable content.

Globally Unique Identifiers: This concept, important on the world wide web, is that any document, resource, data record, or entity mentioned in a database, or some might say every paragraph in a document, should have a unique identification that others can use to point to or cite it elsewhere.

Linked Open Data: This is a method for publishing databases in a standard format for interconnectivity with other databases without the expense of wide agreement on unified inter-agency or global data standards.

These get into some of the more precise details of data format. I might have liked to see provenance & trust addressed, but I am not sure whether I would really expect these principles to be included in a high level 120-day plan, at least not at this point. So their absence is not something I hold against the OGD. Still:

— FAIL

Other Notes

The OGD talks about being proactive with data release.

The OGD also adds accountability: “Each agency … shall designate a high-level senior official to be accountable for the quality and objectivity of, and internal controls over, the Federal spending information publicly disseminated.”

Congressional Disbursements Data Release Evaluation

Tuesday, December 1st, 2009

This week the House of Representatives began publishing disbursements online. Disbursements include how much congressmen and their staffs are paid, what kinds of expenses they have, and who they are paying for these services. This is a really great case study in how to do transparency. There are a lot of wins here and several points to learn from.

The best thing I see so far is the documentation provided on disbursements.house.gov. There is a nice explanation of the reporting process, a FAQ, and a glossary. There is also a table of transaction codes found in the document, and these all are crucial for anyone reading or analyzing the information. This is one of the best examples of documentation I’ve seen for government data of this kind.

Here’s an evaluation of the disclosure based on standards I’ve drawn from others and outlined here.

In summary, many of the goals are met. The important ones not met are machine processability, public input, and public review. Machine processability is a very important one and the fact that this goal was not met seriously undermines much of the reason for publishing the information in the first place.

Information is not meaningfully public if it is not available on the Internet for free.

– GOAL ACHIEVED

Data Should Be Primary. Primary data is data as collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.

We can evaluate this because the documentation actually describes how the House Clerk receives the information. It talks about some degree of aggregation taking place, such as a $10,000 travel record not necessarily being for one trip. But by and large we’re seeing the level of detail that I think is expected.

– GOAL ACHIEVED

Timely.

The SOD is published quarterly, and hopefully this will include the online/electronic version going forward. I don’t think we can expect much more than that right now. Ten years from now perhaps I would like to see real-time expense reporting, but not today.

– TO BE SEEN (if the electronic version is published as often as the print version in the future)

Accessible. Data are available to the widest range of users for the widest range of purposes.

This goal from the “8 Principles” refers to:

Data Format: PDF is an open standard, and therefore a good choice among the data formats for the purpose of publishing a document. (See more below.)

Bulk Data: The 3,400-page document is provided in a single PDF file, rather than hundreds or thousands of separate downloads. This satisfies the goal of bulk data.

Documentation: Documentation is excellent. There is an explanation of the reporting process, a FAQ, a table of codes, and a glossary.

– GOAL ACHIEVED

Machine processable: Data are reasonably structured to allow automated processing.

This is the first goal which is not addressed at all by the data release. While PDF is good for documents, it is bad for tabular information. It does not support sorting, transforming, or other analysis, and it only marginally supports search. A spreadsheet format of any sort would be useful here, some formats better than others.

Considering the size of this data set, without the help of computers to process this information it is far less useful than it could be. To be given a barely-searchable 3,000-page file is only a small step up from being mailed several reams of paper.

Furthermore, there is no indication on the disbursements website that this will be considered in the future.
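
To make that concrete, here is a minimal sketch of what even a plain CSV release would allow. The rows, column names, and office names below are made up for illustration and are not taken from the Statement of Disbursements; the helper name totals_by is likewise just for this example.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Made-up sample rows for illustration only; not real disbursement records.
SAMPLE_CSV = """office,category,payee,amount
Office A,Travel,Example Airline,1200.50
Office B,Supplies,Example Stationers,310.00
Office A,Supplies,Example Stationers,89.50
Office B,Travel,Example Airline,2400.00
"""


def totals_by(rows, key):
    """Sum the amount column per value of `key`, largest total first."""
    totals = defaultdict(Decimal)
    for row in rows:
        totals[row[key]] += Decimal(row["amount"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_office = totals_by(rows, "office")      # spending per office, sorted
by_category = totals_by(rows, "category")  # spending per category, sorted
```

A dozen lines like these are enough to sort and aggregate the whole data set, which is exactly what a PDF makes impractical.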

– GOAL NOT MET

Non-discriminatory: Data are available to anyone, with no requirement of registration.
Non-proprietary: Data are available in a format over which no entity has exclusive control.
License-free. Dissemination of the data is not limited by intellectual property law or other terms.

– GOALS ACHIEVED

Promote analysis: Data published by the government should be in formats and approaches that promote analysis and reuse of that data.

This goal (and the next two) comes from the Association for Computing Machinery’s Recommendation on Open Government and is similar to the Machine Processable goal above. So, see above.

– GOAL NOT MET

Safe file formats: Government bodies publishing data online should always seek to publish using data formats that do not include executable content.

PDF is, relatively speaking, a safe file format.

– GOAL ACHIEVED

Provenance and trust: Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.

According to the disbursements website, the files are digitally signed. I haven’t verified that the signature process was done correctly.

– GOAL ACHIEVED

Public input: The public is in the best position to determine what information technologies will be best suited for the applications the public intends to create for itself.

I am sure members of our community have been in touch with the Clerk’s office. However, there was no public discussion on how these files ought to have been made available, and therefore I am not going to count this goal as having been met.

– GOAL NOT MET

Public review: There should be a means for the public to interact with the data publisher during and after the data has been made. The public may have questions or may find errors. The process of creating the data should also be transparent.

This is a goal rarely given any attention. The documentation gives significant insight into this process. But there is no contact person for this data set that is made known to the public.

– GOAL NOT MET

Interagency coordination: Interoperability makes data more valuable by making it easier to derive new uses from combinations of data. To the extent two data sets refer to the same kinds of things, the creators of the data sets should strive to make them interoperable.

There is a potential to link the names of Members of Congress to their ID numbers provided in, say, the XML voting records.

– GOAL NOT MET

Permanent Web Address: The file should have a stable location.

– GOAL ACHIEVED (provided it is kept there)

Globally Unique Identifiers: This concept, important on the world wide web, is that any document, resource, data record, or entity mentioned in a database, or some might say every paragraph in a document, should have a unique identification that others can use to point to or cite it elsewhere.

– GOAL NOT MET

Linked Open Data: This is a method for publishing databases in a standard format for interconnectivity with other databases without the expense of wide agreement on unified inter-agency or global data standards.

– GOAL NOT MET

Congress, Cosponsor These Bills

Tuesday, November 17th, 2009

Lots of bills have come up on the Open House Project mailing list that we’d label “a good thing”. There are also a few other great ones coming from the GOP. Here’s the roster of bills I’ve jotted down. (If you’re a staffer who’s annoyed I left out your bill, do let me know.)

E-Filing & Disclosure

S. 482: Senate Campaign Disclosure Parity Act
A bill to require Senate candidates to file designations, statements, and reports in electronic form, by Feingold with 41 cosponsors.
http://blog.sunlightfoundation.com/2009/02/27/senate-e-filing-bill-reintroduced-pass-s-482/

H.R. 682: Stop Trading on Congressional Knowledge Act
To prohibit securities and commodities trading based on nonpublic information relating to Congress, and to require additional reporting by Members and employees of Congress of securities transactions, and for other purposes, by Rep Baird.
http://blog.sunlightfoundation.com/2009/11/23/lawmaker-investments-and-disclosure/
(added to this list on Nov 23)

CRS Reports

S. Res. 118: A resolution to provide Internet access to certain Congressional Research Service publications
H.R. 3762: Congressional Research Service Electronic Accessibility Act of 2009
To provide members of the public with Internet access to certain Congressional Research Service publications, and for other purposes, by Lieberman with 6 cosponsors and Rep. Frank Kratovil with 2 cosponsors.
http://cdt.org/issue/congressional-research-service

Earmarks

S. Res. 63: A resolution to amend the Standing Rules of the Senate to ensure that all congressionally directed spending items in appropriations and authorization legislation fall under the oversight and transparency provisions of S. 1, the Honest Leadership and Open Government Act of 2007.
By Claire McCaskill with 1 cosponsor.

H. Res. 440: Amending the Rules of the House of Representatives to strengthen the public disclosure of all earmark requests.
By Bill Cassidy with 15 cosponsors.

H.R. 3268: Earmark Transparency and Accountability Reform Act
To amend the Rules of the House of Representatives and the Congressional Budget and Impoundment Control Act of 1974 to increase earmark transparency and accountability, and for other purposes, by Dave Reichert with 1 cosponsor.
(In fairness, this is a big bill and I haven’t studied it closely so I wouldn’t be quick to label it “a good thing” without a review.)

Read The Bill

H. Res. 554: Amending the Rules of the House of Representatives to require that legislation and conference reports be available on the Internet for 72 hours before consideration by the House, and for other purposes.
By Brian Baird with 214 cosponsors.
http://readthebill.org/
http://blog.sunlightfoundation.com/2009/07/09/read-the-bill-the-long-short-story/

H. Res. 835: Amending the rules of the House of Representatives to provide for transparency in the committee amendment process.
By Lynn Jenkins with 110 cosponsors.

Committee Video

H. Res. 869: Directing the Chief Administrative Officer to install cameras in the hearing room of the Committee on Rules.
By Charles Dent with 77 cosponsors.

Possible Google philanthropic project for transparency

Friday, September 25th, 2009

Google has a philanthropic project underway that might be relevant to us. They are in the process of choosing which projects to pursue, determined by a vote:

http://www.project10tothe100.com/vote.html

Last fall we launched Project 10^100, a call for ideas to change the world by helping as many people as possible. Your response was overwhelming.

Make government more transparent

Create a website that enables people from any country or municipality to easily learn about the workings of their government, and rally their fellow citizens to take action to improve it. Numerous user ideas embraced variations on the theme of governmental transparency, with specific proposals ranging from publishing details of proposed laws and politicians’ voting records to making public budgets searchable online and leveraging social networks to let communities make their voices heard by their representatives by voting on pressing issues.

Suggestions that inspired this idea

1. Create a “govwatch” program that allows people to enter in geographic and other info and get back information about bills/laws that affect them

2. Empower individual voters with both online, real-time data on their political representatives’ activity, and tools to analyze, engage and influence outcomes

3. Increase the transparency of laws, eliminate duplicate ones and communicate them better to affected citizens

4. Share information on how municipalities and states use public funds

Have we forgotten how to have an opinion and still be fair?

Wednesday, September 9th, 2009

Maybe it was never true, but I have this sense that we’ve lost something in American public discourse over the last century. We’ve lost the conception of having an opinion and still being fair. It’s like we just can’t imagine both being true in the same brain. After watching the President’s speech tonight I realized that I feel seriously inhibited in what I say publicly because I want to maintain an impartial image so that people see GovTrack as an impartial source. Am I over concerned? I doubt it. This mistaken concept also underlies “professional journalism”, which is the style of most news operations now, and I think is perhaps the second greatest contributing factor to the downfall of news (after “The Internet”). More on that below.

People often mistake me for a liberal. And others mistake me for a conservative. Here’s a story about someone who did both. I’ve gotten some amusing feedback from people who mistook my GovTrack experiment in collaborative letter writing, for which I delivered an anti-gun-control letter to congressmen, as representing my own views. That couldn’t be further from the truth. If it were up to me, guns would be illegal. I explained this contradiction to someone who wrote me a letter. He said:

Julles: But don’t you see the similarities between what this administration is doing and what was done in Germany in the 30’s?

Then I replied:

Me: I really get personally offended sometimes. To compare a president who is trying to improve health care to a regime that killed however many millions is to belittle the damage and suffering done to anyone that experienced it. Disagree on policy all you want, but don’t belittle one of the world’s greatest tragedies.

And he replied:

Julles: HR3200 is a BAD bill . . . Open your eyes, kid.

(H.R. 3200 is the health care bill.) Expressions about eyes always strike a chord with me. But more to the point, I never told this guy I thought H.R. 3200 was a good bill. And, quite honestly, after the President’s speech tonight, I am not so enamored of where health care reform is going. In particular I wonder about the constitutional authority to require everyone to possess health insurance. I suspect it will be turned into a tax penalty to avoid a straightforward law and side-step constitutional questions.

I don’t have an agenda. But if I have an opinion, I may jeopardize the perception of fairness and accuracy in anything I do in the world of civics. Can I have an opinion and still be trusted to be fair when I put my nonpartisan hat on? I’m not even partisan. I vote Democratic, but so does most everyone else in the places I’ve ever lived. Am I allowed to say that? Have I lost credibility merely for being more open about my views?

And this is what I imagine journalists go through. They vote too, I hope. If they write for the New York Times, they probably live in New York and vote like most New Yorkers. But then they turn off their passion when they put their fingers down to the newsroom keyboard. And we suspend disbelief for a moment as we read their articles that journalists can’t have opinions and be fair at the same time. They make it easy for us to suspend disbelief because they write like they’re dead. No interest in the outcome. They’ve got to write a few words because they need to pay for the electricity that keeps their computers going, but if newspapers paid them to write a summary of the tax law they’d do that too. It doesn’t matter to them, at least as far as we can tell from reading.

This is ridiculous and, worse, counterproductive. I’d be more interested in news if articles pleaded with me that the issue was important, that it isn’t a conceptual exercise but that it even matters to the reporter. This is, apparently, how news used to be 100 to 250 years ago. It’s how the most compelling documentaries and long-form video news segments are today. Of course, it was also not very reliable 100-250 years ago. But I don’t think that dichotomy has to be so today. If we opened ourselves up to the idea that a reporter could have an opinion and still be fair, we wouldn’t need to suspend disbelief. Reporters wouldn’t have to die each time they start writing the next piece.

I don’t want reporters to die. Save the reporters. (Ironic hyperbole.)

GSA social media TOS review

Friday, May 29th, 2009

Recently the GSA has been negotiating special Terms of Service agreements with various social media services like YouTube, on behalf of federal agencies, so that agencies can make use of these services. Some of those agreements are now publicly available. My understanding is that GSA’s negotiations were necessary before agencies could use these services because of legal issues like liability. I’ve reviewed the TOS’s to see whether they address open-government concerns.

The use of non-governmental services like these as part of a governmental function raises several openness issues, which we rehashed in an earlier thread on the use of YouTube by Congress. To summarize, the issues include:

  • whether the service provider meets government web standards including accessibility, privacy, security, nondiscrimination, and archival access to media
  • whether the service providers require members of the public to enter into a contractual agreement with them (i.e. more terms of service) in order to access government content, what the public must agree to, and whether these additional terms with the public restrict what the public can do with and how the public can share government media obtained through the service
  • whether use of the service constitutes an endorsement of a particular brand or technology, or if it provides a significant business advantage to a profit-seeking entity
  • whether the service provides government media in a data format that does not impose technical and legal restrictions on users of the media

I think I need to include a special note about privacy. We can expect that any non-governmental site is going to track their users’ behavior as best they can because of the financial incentives of user-targeted advertising and selling demographic data. I don’t know to what extent any of the services below make use of this, but I expect that this is a major component of the revenue of all of them.

The GSA TOS are amendments to the standard TOS employed by the services. I haven’t read through any of the standard TOS, so of course I might be missing something.

I reviewed the TOS’s with respect to these issues. They had common elements.

No Advertising: The service agrees to not place advertisements on pages with government content (i.e. government “channels” and the like). This addresses part of the concern of endorsement. Of course, services may continue to display their brand name and link users to other parts of the service, so they are still able to promote their business.

No Cookies: The service agrees to not set cookies when a widget is placed *in* a government agency webpage. This means that the service gives up its ability to do the most advanced user tracking in the cases where the user may be unaware that they are even accessing a non-governmental service. The service may still track accesses by IP address which still may provide a more rudimentary means to track users but is more likely to be anonymous.

Closed Captioning: The service will provide the ability for government media to include closed-captioning for videos using industry standard practices, which is of course important for accessibility.
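To make the No Cookies provision above concrete: a widget or embedded player sets a cookie by sending a Set-Cookie header in its HTTP response, so one rough way to spot-check compliance is to look for that header. The following is only a minimal sketch, not anything the GSA or the services describe; the function name and the sample headers are hypothetical, made up for illustration.

```python
def cookie_names(headers):
    """Return the names of cookies a response tries to set.

    `headers` is a list of (name, value) pairs from an HTTP response.
    """
    names = []
    for name, value in headers:
        # Header names are case-insensitive.
        if name.lower() == "set-cookie":
            # The cookie's name=value pair comes before the first ';'.
            first_pair = value.split(";", 1)[0]
            names.append(first_pair.split("=", 1)[0].strip())
    return names


# Hypothetical response headers, invented for this example.
widget_response = [
    ("Content-Type", "text/javascript"),
    ("Set-Cookie", "uid=abc123; Path=/; Expires=Wed, 01 Jan 2020 00:00:00 GMT"),
]
clean_response = [("Content-Type", "text/javascript")]

print(cookie_names(widget_response))
print(cookie_names(clean_response))
```

An empty list for a widget embedded in a .gov page would be consistent with the provision, though it says nothing about tracking by IP address, which the agreements still permit.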

The TOS are linked from here:
https://forum.webcontent.gov/Default.asp?page=TOS_agreements

Here are reviews of each TOS:

AddThis.com

AddThis.com provides a “Social Bookmark & Feed Button Builder”. It’s a widget developers can put on their websites to help users share content on other social sites like Facebook. Because of what AddThis does, there are only a few concerns to be addressed. The TOS addresses both main concerns:

- No Advertising — on government “channels” on the AddThis site (doesn’t seem to actually apply to AddThis).
- No Cookies — when placed on .gov/.mil websites.

https://forum.webcontent.gov/resource/resmgr/terms_of_service_w_socmed/addthistos_4.30.09_final_uns.pdf

Blip.tv

Blip.tv is a video hosting website, like YouTube. All of the privacy concerns above are relevant. The TOS includes:

- No Cookies — Blip.tv will allow the government to disable the parts of its embeddable player that set cookies.
- Closed Captioning.

Other aspects require some elaboration:

Ads-

The GSA TOS has a confusing section on advertising:

“Blip.tv reserves the right to run advertisements on any page on Blip.tv, but will not run advertisements in-stream or directly adjacent to user videos without the opt-in of the user who uploaded the video.”

It sounds like Blip.tv can place ads on the page, just not directly adjacent to or inside of government media.

Privacy-

The GSA TOS say explicitly that Blip.tv does not collect personally identifiable information about users, but does collect and use demographic data for targeted advertising. Users should expect to be asked for demographic data.

https://forum.webcontent.gov/resource/resmgr/Docs/Blip_tv_-_Terms_of_Use_Agree.doc

Blist.com

This is a “social data discovery” tool where users can upload tabular data sets to share. I am actually going to skip a review of this TOS because I expect (or sincerely hope) that tabular data sets are also shared with the public directly (as a bulk data download), not only through social tools.

Facebook

Facebook is a social networking site.

The negotiated TOS are not available to the public. Given the likely pervasiveness of the use of all of these tools in the future, it would be a shame if the GSA is facilitating government agencies’ use of third party services that violate the public’s expectations for government web content.

Flickr

Flickr is a photo sharing website. All of the concerns listed above are relevant to Flickr. There are no provisions in the TOS relevant to this review.

https://forum.webcontent.gov/resource/resmgr/Docs/Flickr_TOS_Agreement_Amended.doc

MySpace

MySpace is a social networking site.

The negotiated TOS are not available to the public. Given the likely pervasiveness of the use of all of these tools in the future, it would be a shame if the GSA is facilitating government agencies’ use of third party services that violate the public’s expectations for government web content.

SlideShare

SlideShare is a document (i.e. presentations) sharing tool. The TOS are not posted on the GSA website, but this appears to be a publishing mistake as it notes that the TOS are intended to be publicly available.

Vimeo

Vimeo is a video sharing tool. All of the concerns listed above are relevant to Vimeo, but the TOS has only one relevant provision.

- No Advertising — on government “channels” on the Vimeo site

https://forum.webcontent.gov/resource/resmgr/terms_of_service_w_socmed/vimeo_tos_final_april2009.doc

YouTube

YouTube is a video sharing site.

The negotiated TOS are not available to the public. Given the likely pervasiveness of the use of all of these tools in the future, it would be a shame if the GSA is facilitating government agencies’ use of third party services that violate the public’s expectations for government web content.

Conclusion

While I am encouraged by the GSA’s forward thinking to make use of the latest technologies developed in the private sector, I believe that working with the private sector poses a number of risks to government data, to the public’s privacy and free speech rights, and to good governance. These risks can be minimized and some useful provisions have been included in the negotiated TOS’s along these lines, but far more careful thinking is necessary.

While several of the TOS addressed accessibility and privacy concerns, none of the TOS addressed security, nondiscrimination, archival access to media, the TOS the public are required to enter into to access government content through these services, or web media data formats.

Update: See also this related post.