GovTrack Insider

January 24th, 2010

Last week I launched GovTrack Insider, a spin-off of GovTrack.us that complements GovTrack with original and syndicated reporting of the U.S. Congress. I’m sending reporters to congressional committee markups to get as early a peek into the public component of the legislative process as we can.

After five years of running GovTrack.us, I saw a need to move beyond data. As we’ve seen over at the OpenCongress.org blog, reporting is an important part of following the legislative process, staying ahead of the game, and understanding the ramifications of procedural events. GovTrackInsider.com puts a focus on grass-roots reporting of the legislative process.

GovTrack Insider has hired a small team of paid freelance reporters to cover congressional committee meetings. This is the earliest point in the legislative process that we can observe directly, and it’s going to help us get an inside look like we’ve never had before. We’ll also be syndicating coverage from OpenCongress and elsewhere. In all of the articles we run, the focus will be on legislation and policy. We’ll be leaving politics and gossip to the old media.

Insider is an open-access online newspaper, but one with some unusual pages. For articles on many topics you’ll find on the right side a topic dashboard. There you can connect with other readers in a variety of ways, such as a Q&A system, link submission, and a community notebook (a wiki).

In time we’ll be integrating Insider more with GovTrack.us so that if you’re tracking a bill over on GovTrack you’ll get a notice whenever it’s mentioned in an article in GovTrack Insider.

I hope you’ll join us on this experiment. Happy reading.

(GovTrack Insider was built with the help of Josh Sulkin (of FlyOnTime.us fame).)

GAH09 Philadelphia Hackathon: New Jersey Gang Survey Viewer

December 14th, 2009

The New Jersey Gang Survey Viewer is a visualization tool for the New Jersey State Police Street Gang Survey 2007. The site was developed by five volunteers over this past weekend, with the help of New Jersey State Police analysts. Our goal was to elevate public knowledge about street gang presence in New Jersey, USA, based on the NJSP’s 2007 survey of the 560+ municipalities in the state. The NJSP analysts approached me shortly before our Great American Hackathon meet-up in Philadelphia was to occur, and our group eagerly saw this project as a great way to work with a government data provider on an app that we think will elevate public knowledge in an important area.


Open Government Directive Evaluation on Principles

December 9th, 2009

Yesterday the White House’s chief technologists unveiled the Open Government Directive (OGD). The OGD mainly covers two aspects of government transparency: using technology as a tool for data sharing and public participation in agency decision-making. We’ve seen the start of a culture shift this year in the executive branch, in parallel with actual progress throughout the country, and now the OGD outlines and codifies a vision for the next four months and, well, beyond.

Last week I reviewed the House’s Statement of Disbursement electronic document along the dimensions of open government data (which unfortunately has the same abbreviation). The OGD talks about how agencies should go about the process of opening data. Here is a review of what the OGD says, organized by open data principle.

I’ll put the conclusion up front: The OGD addresses nearly all of the open government data principles that have been put forward, and even adds two of its own: being pro-active about data release and creating accountability by designating an official responsible for data quality (more on these at the end). So from this perspective, the OGD is pretty spot-on. It is very strong in public input, public review, and interagency coordination, which are normally the weakest spots of government data (but, on the other hand, this isn’t data, this is a goal, so the proof will be in the pudding). It could have been stronger in the areas of machine processability & promoting analysis, and explaining what is appropriate for data licensing (ideally, none).

Here are the details:

Information is not meaningfully public if it is not available on the Internet for free.

The OGD says “each agency shall take prompt steps to expand access to information by making it available online in open formats.” The OGD itself doesn’t say free, but executive branch policy already requires that public information not be sold to the public at more than the marginal cost of distribution — which is about as good as one might expect. So we’ll count this principle as asserted by the OGD.

— GOOD

Data Should Be Primary. Primary data is data as collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.

In the OGD’s appendix where it outlines further details for agencies, it says agencies should release data “as granular as possible”.

— GOOD

Timely.

The OGD says, “Timely publication of information is an essential component of transparency. Delays should not be viewed as an inevitable and insurmountable consequence of high demand.”

— GOOD

Accessible. Data are available to the widest range of users for the widest range of purposes, meaning use an open standard, with a bulk download, and with documentation.

Machine processable: Data are reasonably structured to allow automated processing.

The OGD specifically defines “open format”, the subject of the directive, as something that is platform independent and machine readable. Now, here the OGD slips a little because it redefines “open” but actually leaves out open standards. I don’t think that was intentional, so we’ll give the OGD credit for mentioning open standards even though it didn’t exactly. It mentions “downloadable” but not in bulk, and there is no mention of documentation in the OGD. We can’t tell what the OGD meant by “machine readable”; I now think of this term as a sloppy form of “machine processable”. It would have helped if the OGD specifically noted that the point is to support analysis and reuse of the data.

I used to use “machine readable” until someone corrected me that really any format can be read by a machine. The question is what the machine can do with it: to what degree can the data be meaningfully processed by a machine? So now I use machine-processable.
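As a rough illustration of that distinction, here is a small Python sketch. Everything in it is made up for the example: the text line, the field names, and the guess that columns are separated by two or more spaces are all assumptions, not anything taken from a real government file. Both inputs can be "read" by a machine, but only the structured one can be processed without guessing at the layout.

```python
import json
import re

# Hypothetical line, as it might come out of a text extraction of a printed report.
text_line = "01-15  VENDOR X INC  TRAVEL  1,200.50"

# Hypothetical structured version of the same record.
json_record = (
    '{"date": "01-15", "payee": "VENDOR X INC", '
    '"category": "TRAVEL", "amount": "1200.50"}'
)


def parse_text_line(line):
    """Guess at the fields by splitting on runs of two or more spaces.

    Returns None when the line does not fit the guessed layout.
    """
    parts = re.split(r"\s{2,}", line.strip())
    if len(parts) != 4:
        return None
    date, payee, category, amount = parts
    return {
        "date": date,
        "payee": payee,
        "category": category,
        "amount": amount.replace(",", ""),
    }


# The structured record needs no guessing at all.
from_json = json.loads(json_record)

# The text line works only as long as the spacing guess holds.
from_text = parse_text_line(text_line)

# The same content with single spaces defeats the heuristic.
broken = parse_text_line("01-15 VENDOR X INC TRAVEL 1,200.50")
```

The heuristic recovers the record here, but any change in spacing breaks it, which is the sense in which "readable" falls short of "processable".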

– WEAK

Non-discriminatory: Data are available to anyone, with no requirement of registration.
Non-proprietary: Data are available in a format over which no entity has exclusive control.
License-free. Dissemination of the data is not limited by intellectual property law or other terms.

The OGD says data must be “made available to the public without restrictions that would impede the re-use of that information.” Here we could have really benefited from some simple but concrete guidance.

— WEAK

Promote analysis: Data published by the government should be in formats and approaches that promote analysis and reuse of that data.

There is a sense in which this is implicit in the OGD, but maybe it is the goggles through which I read it. The OGD fails to say explicitly that analysis is the whole point of open government data.

— FAIL

Public input: The public is in the best position to determine what information technologies will be best suited for the applications the public intends to create for itself.

Public review: There should be a means for the public to interact with the data publisher during and after the data has been made. The public may have questions or may find errors. The process of creating the data should also be transparent.

These principles are perhaps the least commonly addressed, and yet they are among the most prominent aspects of the OGD. The OGD requires agencies to allow the public to give feedback on data quality, data prioritization, and other aspects of the agency’s OGD plan. In fact, the OGD says, “Each agency shall respond to public input received on its Open Government Webpage on a regular basis.”

In addition, the OGD will form a working group (described next) that will discuss “ideas to promote participation and collaboration, including how to … take advantage of the expertise and insight of people both inside and outside the Federal Government, and form high-impact collaborations with researchers, the private sector, and civil society.”

In the appendix where it outlines further goals for agencies, the OGD says, “Your agency should also identify key audiences for its information and their needs, and endeavor to publish high-value information for each of those audiences in the most accessible forms and formats.”

— EXCELLENT

Interagency coordination: Interoperability makes data more valuable by making it easier to derive new uses from combinations of data. To the extent two data sets refer to the same kinds of things, the creators of the data sets should strive to make them interoperable.

The OGD will establish a working group led by the Deputy Director for Management at OMB, the Federal Chief Information Officer, and the Federal Chief Technology Officer to provide a forum to share best practices for data collection, aggregation, validation, and dissemination throughout the government, to coordinate implementations of federal spending transparency, and to provide a forum for sharing best practices for participation.

— EXCELLENT

Provenance and trust: Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.

Permanent Web Address: The file should have a stable location.

Safe file formats: Government bodies publishing data online should always seek to publish using data formats that do not include executable content.

Globally Unique Identifiers: This concept, important on the world wide web, is that any document, resource, data record, or entity mentioned in a database, or some might say every paragraph in a document, should have a unique identification that others can use to point to or cite it elsewhere.

Linked Open Data: This is a method for publishing databases in a standard format for interconnectivity with other databases without the expense of wide agreement on unified inter-agency or global data standards.

These get into some of the more precise details of data format. I might have liked to see provenance & trust addressed, but I am not sure whether I would really expect these principles to be included in a high level 120-day plan, at least not at this point. So their absence is not something I hold against the OGD. Still:

— FAIL

Other Notes

The OGD talks about being proactive with data release.

The OGD also adds accountability: “Each agency … shall designate a high-level senior official to be accountable for the quality and objectivity of, and internal controls over, the Federal spending information publicly disseminated.”

Congressional Disbursements Data Release Evaluation

December 1st, 2009

This week the House of Representatives began publishing disbursements online. Disbursements include how much congressmen and their staffs are paid, what kinds of expenses they have, and who they are paying for these services. This is a really great case study in how to do transparency. There are a lot of wins here and several points to learn from.

The best thing I see so far is the documentation provided on disbursements.house.gov. There is a nice explanation of the reporting process, a FAQ, and a glossary. There is also a table of transaction codes found in the document, and these all are crucial for anyone reading or analyzing the information. This is one of the best examples of documentation I’ve seen for government data of this kind.

Here’s an evaluation of the disclosure based on standards I’ve drawn from others and outlined here.

In summary, many of the goals are met. The important ones not met are machine processability, public input, and public review. Machine processability is a very important one and the fact that this goal was not met seriously undermines much of the reason for publishing the information in the first place.

Information is not meaningfully public if it is not available on the Internet for free.

— GOAL ACHIEVED.

Data Should Be Primary. Primary data is data as collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.

We can evaluate this because the documentation actually describes how the House Clerk receives the information. It talks about some degree of aggregation taking place, such as a $10,000 travel record not necessarily being for one trip. But by and large we’re seeing the level of detail that I think is expected.

– GOAL ACHIEVED

Timely.

The SOD is published quarterly, and hopefully this will include the online/electronic version going forward. I don’t think we can have a much higher expectation here. Ten years from now perhaps I would like to see real-time expense reporting, but not today.

– TO BE SEEN (if the electronic version is published as often as the print version in the future)

Accessible. Data are available to the widest range of users for the widest range of purposes.

This goal from the “8 Principles” refers to:

Data Format: PDF is an open standard, and therefore a good choice among the data formats for the purpose of publishing a document. (See more below.)

Bulk Data: The 3,400-page document is provided in a single PDF file, rather than hundreds or thousands of separate downloads. This satisfies the goal of bulk data.

Documentation: Documentation is excellent. There is an explanation of the reporting process, a FAQ, a table of codes, and a glossary.

– GOAL ACHIEVED

Machine processable: Data are reasonably structured to allow automated processing.

This is the first goal which is not addressed at all by the data release. While PDF is good for documents, it is bad for tabular information. It does not support sorting, transforming, or other analysis, and it only marginally supports search. A spreadsheet format of any sort would be useful here, some formats better than others.

Considering the size of this data set, without the help of computers to process this information it is far less useful than it could be. To be given a barely-searchable 3,000-page file is only a small step up from being mailed several reams of paper.
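To make concrete what a spreadsheet-style release would allow, here is a minimal Python sketch. The column names (office, payee, category, amount) and every row are invented for illustration; they are not the Statement of Disbursements’ actual layout or data. The point is only that grouping, totaling, and sorting take a few lines once the data is tabular.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample rows; made-up values, not real disbursement data.
SAMPLE_CSV = """office,payee,category,amount
Office A,Vendor X,TRAVEL,1200.50
Office A,Vendor Y,SUPPLIES,310.00
Office B,Vendor X,TRAVEL,845.25
Office B,Vendor Z,RENT,2500.00
"""


def totals_by(rows, key):
    """Sum the amount column, grouped by the given column name."""
    totals = defaultdict(Decimal)
    for row in rows:
        totals[row[key]] += Decimal(row["amount"])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Totals per spending category and per office.
by_category = totals_by(rows, "category")
by_office = totals_by(rows, "office")

# Rows sorted from largest to smallest amount.
ranked = sorted(rows, key=lambda r: Decimal(r["amount"]), reverse=True)
largest = ranked[0]
```

None of this is possible against a PDF without first reconstructing the table by hand or by error-prone extraction.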

Furthermore, there is no indication on the disbursements website that this will be considered in the future.

– GOAL NOT MET

Non-discriminatory: Data are available to anyone, with no requirement of registration.
Non-proprietary: Data are available in a format over which no entity has exclusive control.
License-free. Dissemination of the data is not limited by intellectual property law or other terms.

– GOALS ACHIEVED

Promote analysis: Data published by the government should be in formats and approaches that promote analysis and reuse of that data.

This goal (and the next two) comes from the Association for Computing Machinery’s Recommendation on Open Government and is similar to the Machine Processable goal above. So, see above.

– GOAL NOT MET

Safe file formats: Government bodies publishing data online should always seek to publish using data formats that do not include executable content.

PDF is, relatively speaking, a safe file format.

– GOAL ACHIEVED

Provenance and trust: Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.

According to the disbursements website, the files are digitally signed. I haven’t verified that the signature process was done correctly.

– GOAL ACHIEVED

Public input: The public is in the best position to determine what information technologies will be best suited for the applications the public intends to create for itself.

I am sure members of our community have been in touch with the Clerk’s office. However, there was no public discussion on how these files ought to have been made available, and therefore I am not going to count this goal as having been met.

– GOAL NOT MET

Public review: There should be a means for the public to interact with the data publisher during and after the data has been made. The public may have questions or may find errors. The process of creating the data should also be transparent.

This is a goal rarely given any attention. The documentation gives significant insight into this process. But no contact person for this data set is made known to the public.

– GOAL NOT MET

Interagency coordination: Interoperability makes data more valuable by making it easier to derive new uses from combinations of data. To the extent two data sets refer to the same kinds of things, the creators of the data sets should strive to make them interoperable.

There is a potential to link the names of Members of Congress to their ID numbers provided in, say, the XML voting records.
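A sketch of what that linking might look like, in Python. The names, the ID values, and the record layout below are all hypothetical placeholders (they are not real Members or real identifiers from any voting-record file), and the name normalization is just one assumed approach. It also shows why matching on names is fragile, and why publishing a shared identifier in the first place would be better.

```python
def normalize(name):
    """Lowercase, drop periods, treat commas as spaces, collapse whitespace."""
    cleaned = name.lower().replace(".", "").replace(",", " ")
    return " ".join(cleaned.split())


# Hypothetical lookup table of the kind that could be built from
# another data set that already carries member identifiers.
MEMBER_IDS = {
    normalize("Doe, Jane"): "X000001",
    normalize("Roe, Richard"): "X000002",
}


def link_records(records, ids):
    """Attach a member_id to each record whose name matches the lookup.

    Returns (linked, unmatched) so that failures are visible
    rather than silently dropped.
    """
    linked, unmatched = [], []
    for rec in records:
        member_id = ids.get(normalize(rec["member"]))
        if member_id is None:
            unmatched.append(rec)
        else:
            linked.append({**rec, "member_id": member_id})
    return linked, unmatched


# Hypothetical disbursement-style records keyed only by a printed name.
records = [
    {"member": "DOE, JANE", "amount": "100.00"},
    {"member": "Hon. Richard Roe", "amount": "50.00"},
]

linked, unmatched = link_records(records, MEMBER_IDS)
```

The first record links despite the difference in capitalization; the second does not, because the name is written in a different order with a title. That kind of mismatch disappears when both data sets carry the same ID.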

– GOAL NOT MET

Permanent Web Address: The file should have a stable location.

– GOAL ACHIEVED (provided it is kept there)

Globally Unique Identifiers: This concept, important on the world wide web, is that any document, resource, data record, or entity mentioned in a database, or some might say every paragraph in a document, should have a unique identification that others can use to point to or cite it elsewhere.

– GOAL NOT MET

Linked Open Data: This is a method for publishing databases in a standard format for interconnectivity with other databases without the expense of wide agreement on unified inter-agency or global data standards.

– GOAL NOT MET

Congress, Cosponsor These Bills

November 17th, 2009

Lots of bills have come up on the Open House Project mail list that we’d label “a good thing”. There are also a few other great ones coming from the GOP. Here’s the roster of bills I’ve jotted down. (If you’re a staffer who’s annoyed I left out your bill, do let me know.)

E-Filing & Disclosure

H.R. 4858: The Public Online Information Act to establish an advisory committee to issue nonbinding governmentwide guidelines on making public information available on the Internet, to require publicly available Government information held by the executive branch to be made available on the Internet, to express the sense of Congress that publicly available information held by the legislative and judicial branches should be available on the Internet, and for other purposes.
By Steve Israel.
(Added to the list 3/25/2010.)

S. 482: Senate Campaign Disclosure Parity Act
A bill to require Senate candidates to file designations, statements, and reports in electronic form, by Feingold with 41 cosponsors.
http://blog.sunlightfoundation.com/2009/02/27/senate-e-filing-bill-reintroduced-pass-s-482/

H.R. 682: Stop Trading on Congressional Knowledge Act
To prohibit securities and commodities trading based on nonpublic information relating to Congress, and to require additional reporting by Members and employees of Congress of securities transaction, and for other purposes, by Rep Baird.
http://blog.sunlightfoundation.com/2009/11/23/lawmaker-investments-and-disclosure/
(added to this list on Nov 23)

CRS Reports

S. Res. 118: A resolution to provide Internet access to certain Congressional Research Service publications
H.R. 3762: Congressional Research Service Electronic Accessibility Act of 2009
To provide members of the public with Internet access to certain Congressional Research Service publications, and for other purposes, by Lieberman with 6 cosponsors and Rep. Frank Kratovil with 2 cosponsors.
http://cdt.org/issue/congressional-research-service

Earmarks

S. Res. 63: A resolution to amend the Standing Rules of the Senate to ensure that all congressionally directed spending items in appropriations and authorization legislation fall under the oversight and transparency provisions of S. 1, the Honest Leadership and Open Government Act of 2007.
By Claire McCaskill with 1 cosponsor.

H. Res. 440: Amending the Rules of the House of Representatives to strengthen the public disclosure of all earmark requests.
By Bill Cassidy with 15 cosponsors.

H.R. 3268: Earmark Transparency and Accountability Reform Act
To amend the Rules of the House of Representatives and the Congressional Budget and Impoundment Control Act of 1974 to increase earmark transparency and accountability, and for other purposes, by Dave Reichert with 1 cosponsor.
(In fairness, this is a big bill and I haven’t studied it closely so I wouldn’t be quick to label it “a good thing” without a review.)

Read The Bill

H. Res. 554: Amending the Rules of the House of Representatives to require that legislation and conference reports be available on the Internet for 72 hours before consideration by the House, and for other purposes.
By Brian Baird with 214 cosponsors.
http://readthebill.org/
http://blog.sunlightfoundation.com/2009/07/09/read-the-bill-the-long-short-story/

H. Res. 835: Amending the rules of the House of Representatives to provide for transparency in the committee amendment process.
By Lynn Jenkins with 110 cosponsors.

Committee Video

H. Res. 869: Directing the Chief Administrative Officer to install cameras in the hearing room of the Committee on Rules.
By Charles Dent with 77 cosponsors.

Congressional Committee Webcast Archives Review: I See Progress

November 10th, 2009

One of the continuing themes of the Open House and Open Senate Projects has been investigating what congressional committees make available to the public on their websites. I’ve recently become interested in committee markup meetings, and I was curious to see how often webcasts of markup meetings are available on committee websites.

In a survey of House and Senate standing committee websites this week, I found the following. Note that I counted hearings separately from markup sessions and business meetings.

Hearing Webcasts: The vast majority (32 of 35) of committees make archives of video webcasts of hearings regularly available on their websites, in what appeared to be a very timely way. The videos are pretty poor quality by today’s standards, but they’re still very useful. The exceptions were Senate Foreign Relations [UPDATE: I just missed the very obscure links. Nevermind. All Senate committees have video archives.], House Agriculture, House Appropriations, House Rules, and House Ways and Means which lacked video archives. House Agriculture makes it up by providing transcripts… after several months; the other four committees provide no electronic record of hearings.

It appeared that most also have live webcasts of most hearings, but I couldn’t tell just from looking at the websites once.

Hearing Transcripts: Transcripts were surprisingly hard to come by. Senate Armed Services, Senate Rules, Senate Veterans’ Affairs, and House Energy and Commerce seemed to be the only committees that provided transcripts regularly. (That’s 4 of 35 committees.) Considering the importance of transcripts for disability accessibility and machine processing (e.g. search), this is too bad.

Hearing Prepared Testimony & Statements: PDFs and other document formats were used to post prepared statements and testimony — this almost makes up for not having transcripts. Four committees lacked even this, and of those none were among the committees that posted transcripts. House Appropriations and House Rules posted neither video, nor transcripts, nor prepared statements. The other two at least posted videos.

For hearings, by and large there is an electronic record available, and if you can find a record you can find video.

I counted markup sessions and business meetings separately from hearings. Electronic records were far less common for these meetings.

Markups: About half of the committees posted archival videos for these business meetings. Of those that didn’t, one posted transcripts. That leaves 18 out of 35 posting no electronic record of these meetings. The notable committee here is House Judiciary, which posts both transcripts and video of business meetings.

A similar survey for Senate committees was done just about a year ago by someone else on the OHP mail list who might want to remain anonymous on this point (I’m not sure). In comparison to that survey, more Senate committees are posting hearing archival video now, which is great. Fewer than half were regularly posting archival audio/video then, and now the vast majority are posting video. As for markups, just two of 16 Senate committees were posting recordings of markups regularly then, with a few more posting them irregularly, and some transcripts. So it is nice to see that Senate committees are moving more of this information online as well.

One more note. Some committees display a notice at the start of their videos: “The use of duplications of broadcast coverage of the Committee on Transportation is governed by the rules of the House. Use for political or commercial purposes is expressly prohibited.” I hope no one takes that message seriously, and I wonder what legal basis this message has. I don’t believe I am subject to the rules of the House.

This topic goes a long way back. See Carl Malamud’s work for more.

Additional notes:

Jim Snider replied to this on the OHP list saying that House Commerce had links to some webcasts which were not actually working, but noted that it was probably a glitch. He also wrote, “The last time I checked several years ago House Commerce Committee transcripts were running at least a year late and sometimes several years late. The public record included in the transcripts also may not include follow-up correspondence on the public record between witnesses and the committee. In 1994 I wrote a master’s thesis on video access to public meetings, and in 1999 an op-ed in the Chicago Tribune, “Senate Hypocrisy Over “Hot” Testimony,” on how Congress inhibits public access to their public meeting video archives.”

Aphid pointed out a Metavid wiki page for congressional video availability. I seem to have duplicated some work, and I’ll need to check if I can update that page with anything new. UPDATE: Aphid also notes there that because many of the video streams are in a proprietary format, it may be illegal under the DMCA law to archive these videos. This along with the restrictions noted in House Rules is a major point to be addressed in the future.

UPDATE 2:
What can committees do going forward?

* For the sake of archives and use by professional journalists, provide a stream that is high-quality (it probably exists but just isn’t public).

* Similarly, provide the streams at least additionally in a format that does not make it a violation of federal law to copy (again, it’s a problem regardless of whether the committee says “go ahead”).

* Remove any additional assertions (e.g. House Rules) on how congressional video may be used. Either it is public or it is not. It is an affront to free speech if Congress thinks government records, of all things, should be off-limits to any part of public discourse.

* Partner with experts in the public — e.g. Aphid and Carl Malamud — on establishing goals for congressional video.

Possible Google philanthropic project for transparency

September 25th, 2009

Google has a philanthropic project underway that might be relevant to us. They are in the process of choosing which projects to pursue, determined by a vote:

http://www.project10tothe100.com/vote.html

Last fall we launched Project 10^100, a call for ideas to change the world by helping as many people as possible. Your response was overwhelming.

Make government more transparent

Create a website that enables people from any country or municipality to easily learn about the workings of their government, and rally their fellow citizens to take action to improve it. Numerous user ideas embraced variations on the theme of governmental transparency, with specific proposals ranging from publishing details of proposed laws and politicians’ voting records to making public budgets searchable online and leveraging social networks to let communities make their voices heard by their representatives by voting on pressing issues.

Suggestions that inspired this idea

1. Create a “govwatch” program that allows people to enter in geographic and other info and get back information about bills/laws that affect them

2. Empower individual voters with both online, real-time data on their political representatives’ activity, and tools to analyze, engage and influence outcomes

3. Increase the transparency of laws, eliminate duplicate ones and communicate them better to affected citizens

4. Share information on how municipalities and states use public funds

Have we forgotten how to have an opinion and still be fair?

September 9th, 2009

Maybe it was never true, but I have this sense that we’ve lost something in American public discourse over the last century. We’ve lost the conception of having an opinion and still being fair. It’s like we just can’t imagine both being true in the same brain. After watching the President’s speech tonight I realized that I feel seriously inhibited in what I say publicly because I want to maintain an impartial image so that people see GovTrack as an impartial source. Am I overly concerned? I doubt it. This mistaken concept also underlies “professional journalism”, which is the style of most news operations now, and I think is perhaps the second greatest contributing factor to the downfall of news (after “The Internet”). More on that below.

People often mistake me for a liberal. And others mistake me for a conservative. Here’s a story about someone who did both. I’ve gotten some amusing feedback from people who mistook my GovTrack experiment in collaborative letter writing, for which I delivered an anti-gun-control letter to congressmen, as representing my own views. That couldn’t be further from the truth. If it were up to me, guns would be illegal. I explained this contradiction to someone who wrote me a letter. He said:

Julles: But don’t you see the similarities between what this administration is doing and what was done in Germany in the 30′s?

Then I replied:

Me: I really get personally offended sometimes. To compare a president who is trying to improve health care to a regime that killed however many millions is to belittle the damage and suffering done to anyone that experienced it. Disagree on policy all you want, but don’t belittle one of the world’s greatest tragedies.

And he replied:

Julles: HR3200 is a BAD bill . . . Open your eyes, kid.

(H.R. 3200 is the health care bill.) Expressions about eyes always strike a chord with me. But more to the point, I never told this guy I thought H.R. 3200 was a good bill. And, quite honestly, after the President’s speech tonight, I am not so enamored of where health care reform is going. In particular I wonder about the constitutional authority to require everyone to possess health insurance. I suspect it will be turned into a tax penalty to avoid a straightforward law and side-step constitutional questions.

I don’t have an agenda. But if I have an opinion, I may jeopardize the perception of fairness and accuracy in anything I do in the world of civics. Can I have an opinion and still be trusted to be fair when I put my nonpartisan hat on? I’m not even partisan. I vote Democratic, but so does most everyone else in the places I’ve ever lived. Am I allowed to say that? Have I lost credibility merely for being more open about my views?

And this is what I imagine journalists go through. They vote too, I hope. If they write for the New York Times, they probably live in New York and vote like most New Yorkers. But then they turn off their passion when they put their fingers down to the newsroom keyboard. And we suspend disbelief for a moment as we read their articles that journalists can’t have opinions and be fair at the same time. They make it easy for us to suspend disbelief because they write like they’re dead. No interest in the outcome. They’ve got to write a few words because they need to pay for the electricity that keeps their computers going, but if newspapers paid them to write a summary of the tax law they’d do that too. It doesn’t matter to them, at least as far as we can tell from reading.

This is ridiculous and, worse, counterproductive. I’d be more interested in news if articles pleaded with me that the issue was important, that it isn’t a conceptual exercise but that it even matters to the reporter. This is, apparently, how news used to be 100 to 250 years ago. It’s how the most compelling documentaries and long-form video news segments are today. Of course, it was also not very reliable 100-250 years ago. But I don’t think that dichotomy has to be so today. If we opened ourselves up to the idea that a reporter could have an opinion and still be fair, we wouldn’t need to suspend disbelief. Reporters wouldn’t have to die each time they start writing the next piece.

I don’t want reporters to die. Save the reporters. (Ironic hyperbole.)

GSA social media TOS review

May 29th, 2009

Recently the GSA has been negotiating special Terms of Service agreements with various social media services like YouTube on behalf of federal agencies, to allow agencies to make use of these services. Some of those agreements are now publicly available. My understanding is that GSA’s negotiations were necessary before agencies could use these services because of legal issues like liability. I’ve reviewed the TOS’s to see whether they address open-government concerns.

The use of non-governmental services like these as part of a governmental function raises several openness issues, which we rehashed in an earlier thread on the use of YouTube by Congress. To summarize, the issues include:

  • whether the service provider meets government web standards including accessibility, privacy, security, nondiscrimination, and archival access to media
  • whether the service providers require members of the public to enter into a contractual agreement with them (i.e. more terms of service) in order to access government content, what the public must agree to, and whether these additional terms with the public restrict what the public can do with and how the public can share government media obtained through the service
  • whether use of the service constitutes an endorsement of a particular brand or technology, or if it provides a significant business advantage to a profit-seeking entity
  • whether the service provides government media in a data format that does not impose technical and legal restrictions on users of the media

I think I need to include a special note about privacy. We can expect that any non-governmental site is going to track its users’ behavior as best it can because of the financial incentives of user-targeted advertising and selling demographic data. I don’t know to what extent any of the services below do this, but I expect that it is a major component of the revenue of all of them.

The GSA TOS are amendments to the standard TOS employed by the services. I haven’t read through any of the standard TOS, so of course I might be missing something.

I reviewed the TOS’s with respect to these issues. They had common elements.

No Advertising: The service agrees to not place advertisements on pages with government content (i.e. government “channels” and the like). This addresses part of the concern of endorsement. Of course, services may continue to display their brand name and link users to other parts of the service, so they are still able to promote their business.

No Cookies: The service agrees to not set cookies when a widget is placed *in* a government agency webpage. This means that the service gives up its ability to do the most advanced user tracking in the cases where the user may be unaware that they are even accessing a non-governmental service. The service may still track accesses by IP address, which provides a more rudimentary means to track users but is more likely to be anonymous.

Closed Captioning: The service will provide the ability for government media to include closed-captioning for videos using industry standard practices, which is of course important for accessibility.
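To make the No Cookies provision a little more concrete, here is a minimal sketch of how one might check whether a widget’s HTTP response tries to set cookies. The function name and the sample headers are made up for illustration; they are not part of any TOS or of any service’s actual API, and a real check would need to fetch the widget’s responses first.

```python
def cookies_set_by(headers):
    """Return the names of cookies that a response tries to set.

    `headers` is a list of (name, value) pairs, as most HTTP
    libraries can provide. Header names are case-insensitive.
    """
    names = []
    for name, value in headers:
        if name.lower() == "set-cookie":
            # The cookie name is everything before the first "=" in the
            # first ";"-separated part of the header value.
            first_part = value.split(";", 1)[0]
            names.append(first_part.split("=", 1)[0].strip())
    return names


# Hypothetical response headers from an embedded widget.
sample_headers = [
    ("Content-Type", "text/html"),
    ("Set-Cookie", "session=abc123; Path=/; HttpOnly"),
    ("set-cookie", "prefs=dark"),
]
print(cookies_set_by(sample_headers))
```

An empty result for a widget embedded on a .gov page would be consistent with the provision, though it says nothing about tracking by IP address.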

The TOS are linked from here:

https://forum.webcontent.gov/Default.asp?page=TOS_agreements

Here are reviews of each TOS:

AddThis.com

AddThis.com provides a “Social Bookmark & Feed Button Builder”. It’s a widget developers can put on their websites to help users share content on other social sites like Facebook. Because of what AddThis does, there are only a few concerns to be addressed. The TOS addresses both main concerns:

- No Advertising: on government “channels” on the AddThis site (doesn’t seem to actually apply to AddThis).
- No Cookies: when placed on .gov/.mil websites.

https://forum.webcontent.gov/resource/resmgr/terms_of_service_w_socmed/addthistos_4.30.09_final_uns.pdf

Blip.tv

Blip.tv is a video hosting website, like YouTube. All of the privacy concerns above are relevant. The TOS includes:

- No Cookies: Blip.tv will allow the government to disable the parts of its embeddable player that set cookies.
- Closed Captioning.

Other aspects require some elaboration:

Ads-

The GSA TOS has a confusing section on advertising:

“Blip.tv reserves the right to run advertisements on any page on Blip.tv, but will not run advertisements in-stream or directly adjacent to user videos without the opt-in of the user who uploaded the video.”

It sounds like Blip.tv can place ads on the page, just not directly adjacent to or inside of government media.

Privacy-

The GSA TOS say explicitly that Blip.tv does not collect personally identifiable information about users, but does collect and use demographic data for targeted advertising. Users should expect to be asked for demographic data.

https://forum.webcontent.gov/resource/resmgr/Docs/Blip_tv_-_Terms_of_Use_Agree.doc

Blist.com

This is a “social data discovery” tool where users can upload tabular data sets to share. I am actually going to skip a review of this TOS because I expect (or sincerely hope) that tabular data sets are shared with the public directly (as a bulk data download) in addition to through social tools.

Facebook

Facebook is a social networking site.

The negotiated TOS are not available to the public. Given the likely pervasiveness of the use of all of these tools in the future, it would be a shame if the GSA is facilitating government agencies’ use of third party services that violate the public’s expectations for government web content.

Flickr

Flickr is a photo sharing website. All of the concerns listed above are relevant to Flickr. There are no provisions in the TOS relevant to this review.

https://forum.webcontent.gov/resource/resmgr/Docs/Flickr_TOS_Agreement_Amended.doc

MySpace

MySpace is a social networking site.

The negotiated TOS are not available to the public. Given the likely pervasiveness of the use of all of these tools in the future, it would be a shame if the GSA is facilitating government agencies’ use of third party services that violate the public’s expectations for government web content.

SlideShare

SlideShare is a document (i.e. presentation) sharing tool. The TOS are not posted on the GSA website, but this appears to be a publishing mistake, as the website notes that the TOS are intended to be publicly available.

Vimeo

Vimeo is a video sharing tool. All of the concerns listed above are relevant to Vimeo, but the TOS has only one relevant provision:

- No Advertising: on government “channels” on the Vimeo site.

https://forum.webcontent.gov/resource/resmgr/terms_of_service_w_socmed/vimeo_tos_final_april2009.doc

YouTube

YouTube is a video sharing site.

The negotiated TOS are not available to the public. Given the likely pervasiveness of the use of all of these tools in the future, it would be a shame if the GSA is facilitating government agencies’ use of third party services that violate the public’s expectations for government web content.

Conclusion

While I am encouraged by the GSA’s forward thinking to make use of the latest technologies developed in the private sector, I believe that working with the private sector poses a number of risks to government data, to the public’s privacy and free speech rights, and to good governance. These risks can be minimized and some useful provisions have been included in the negotiated TOS’s along these lines, but far more careful thinking is necessary.

While several of the TOS addressed accessibility and privacy concerns, none of the TOS addressed security, nondiscrimination, archival access to media, the TOS the public are required to enter into to access government content through these services, or web media data formats.

Update: See also this related post.

Open Data is Civic Capital: Best Practices for “Open Government Data”

May 19th, 2009

I frequently see questions like “How can I convince my government that open data is important?” and “What should I do as a government web manager to make data open?” These and other questions came up at Transparency Camp a few months ago, and at the end of the conference Gunnar Hellekson of Red Hat, and later I, decided to take on the project of bringing together a repository of best-practices guides for technology’s role in an open government. We have a wiki page for the project which lists some of the guides we’d like to see written.

Since the conference I’ve been working on the first guide, Open Data is Civic Capital: Best Practices for “Open Government Data”, which you can read by following the link. The goal was 1) to motivate why open government data isn’t just an ideological issue but actually makes society more powerful, and can really make the world a better place, and 2) to outline some suggested priorities and recommendations for open government data, drawing on the recommendations of a number of past groups (e.g. the 8 Principles of Open Government Data, and others). Thanks to Gunnar, John Wonderlich, Carl Malamud, Joe Germuska, Kevin Lyons, and David Robinson for their feedback. (They had a lot of great suggestions, many of which I haven’t had the energy to follow through with yet.) The essay begins:

“Creating a well-informed public is a core value of representative government. It is a prerequisite for ensuring the best representatives are elected and a crucial component of government oversight—as well as being important in areas well beyond civics. This document speaks to why public government data (also called ‘public sector information’) is a valuable resource to society if put on the Web and shared freely with the public, and discusses how to go about doing it. We discuss technological considerations and end with sixteen guiding principles for best practices in open government data.”

Kevin Lyons, who works for the Nebraska State Legislature, began work on a best practices guide for the use of the PDF format: when it is appropriate and what to look out for. That’s up on the wiki, and I’m sure your suggestions & revisions would be welcome.