Curriculum Vitae

Joshua Ian Tauberer, Ph.D.

Professional Experience

2016–present GovReady, Contractor
Development of an expert system for federal cybersecurity compliance (FISMA).
2016 EveryCRSReport.com, Developer
Built the first continuously updated website to make all Congressional Research Service reports available to the public, including the development of a PDF redaction tool. With/for Demand Progress.
2016–present Open Government Advisory Group to the DC Mayor, Public Member
Advises the District of Columbia mayor on, and monitors, the implementation of open government programs by DC agencies.
2014–present if.then.fund, Co-founder
Reshaping Congress by empowering small dollar donors to make contributions based on what politicians do — not what they promise.
2003–present GovTrack.us, Founder
A reference and legislative tracking tool for bills before the U.S. Congress. It is one of the most visited government transparency websites in the world and spurred the world-wide open government data movement. Launched in 2004. I built and still build the website, and I advocate for greater transparency from Congress.
2000–present LARSA, Inc., Senior Technologist
Development of some of the user interface and several analytic components of this desktop software for structural analysis and design of bridges and other structures.
2011–2016 Open Data Day DC, Lead Organizer
I (co-)started this yearly event in the District of Columbia for open data enthusiasts. It is on the same day as open data hackathons around the world. More than 300 participants joined us in 2014 and 2015.
2014–2016 U.S. law codification comparison tool, U.S. Congress
Developed an internal workflow tool for the positive law codification process at the U.S. House of Representatives, Office of the Law Revision Counsel, with/for Xcential and Robinson + Yu.
2013–2014 District of Columbia law publication tool, DC Council
Overhauled and modernized the publication process for the Code of the District of Columbia, for the Council of the District of Columbia, Office of the General Counsel.
2012–2014 Open data catalog for the U.S. Department of Health & Human Services
As a sub-contractor for the U.S. Department of Health & Human Services, I helped develop HealthData.gov, a catalog of HHS agency datasets, based in part on CKAN.
2010–2012 POPVOX, Co-Founder and Chief Technology Officer
POPVOX.com is a venture-backed online advocacy platform that I co-founded in 2010. The site helps citizens and grassroots advocacy associations contact Congress and build their base. I left the company in early 2012. During my time there I supervised a small technology and product development team.
2003 The Daily Princetonian, Features Editor
I was the last “executive editor for Page 3” at the student-run newspaper in college.

Education

2010 Ph.D. University of Pennsylvania, Department of Linguistics
My dissertation “Learning [voice]” investigated the phonetics-phonology interface of the voice contrast in infant speech through a corpus analysis.
2008 M.A. University of Pennsylvania, Department of Linguistics
My master's thesis was “Learning in the Face of Infidelity: Evaluating the Robust Interpretive Parsing/Constraint Demotion Model of Optimality Theory Language Acquisition.”
2004 A.B. Princeton University
Psychology major with certificates (minors) in linguistics and applications of computing. My senior thesis was about discourse representation.

Publications

Invited Talks / Media Appearances

Additional Presentations and Manuscripts

Press Clips/etc. (selected)

Honors

Professional Service

Additional Technology Projects

2013–present Mail-in-a-Box, creator
Take back control of your email with this easy-to-deploy mail server in a box.
2013–present ANCFinder.org, team member
A project of Code for DC to increase engagement in DC's Advisory Neighborhood Commissions. [press: WAMU]
2012–present github.com/unitedstates
Co-authored with other legislative technology geeks, this library pulls in and organizes information about the U.S. Congress.
2014 Scrap Stats, team member
A project about food waste developed at the National Geographic Future of Food Hackathon, May 3-4, 2014.
2005–2009 SemWeb .NET Library
An open source .NET library written in C# for working with RDF data for the Semantic Web. It's used in the Gnome application F-Spot, and possibly elsewhere. I created it because I thought the Semantic Web would be big! [github]
2009 New Jersey Gang Survey Viewer (organizer)
This is a visualization tool for the New Jersey State Police Street Gang Survey 2007. It was developed by five volunteers in Philadelphia over the course of a weekend in December 2009, as part of the Great American Hackathon.
2009 FlyOnTime.us
This entry for Sunlight Foundation's Apps for America contest, in collaboration with Josh Sulkin, is a mash-up of airline on-time flight statistics from the FAA with historical weather data from the NOAA. Mentions: White House open gov status report, The New York Times (3/12/11), NPR (3/14/10), The Washington Post (7/21/09), The Politico (6/24/09).
2007–2008 Praat-Py
This is an extension to the Praat program for phonetic analysis that allows scripts to be written in Python. I created it to help with (procrastinate doing) my PhD thesis. [github]
2006–2007 The Penn Lambda Calculator
This is a linguistic semantics pedagogical tool made in conjunction with Lucas Champollion and Maribel Romero.
2004–2007 Sender Verification Extension for Thunderbird
A Mozilla Thunderbird extension for verifying the domain name claimed in the From: address of emails using SPF, as a tool to combat phishing. Downloaded around 150,000 times. Doesn't work anymore. [github]
2007 OpenGovData.org, participant
Formed out of a 2007 meeting of open government activists and professionals, this site presents the Eight Principles of Open Government Data and lists related information. As of 2012, I am still maintaining the website.
2007/2008 Semantic Web Databases
U.S. Census RDF Dataset: A 1-billion-triple RDF database of U.S. Census statistics, at the time the largest open, linked, and dereferenceable RDF database of real-world information. U.S. SEC Corporate Ownership RDF Data: A semantic web RDF database based on the U.S. Securities and Exchange Commission’s EDGAR database.
1999 Webcytology [more info]
My first major web project (I was in high school), this was a winning entry in the 1999 ThinkQuest competition. It featured a cellular automaton simulation, inspired by Conway's Game of Life, where users would design organisms with different biologically inspired properties. In collaboration with Andrew Kallem.

Academic Publications (grad school years)