This is the home page of
Joshua Tauberer, University of Pennsylvania graduate student in linguistics
and software "technologist" interested in civics, the semantic web, and scientific computing.
About Me
A few favorite quotes:
"As certain as my heart is ticking, I'm certain no living chicken //
Has ever so clearly commanded a living cook before //
With an utterance so clear and shocking that even I could not ignore. //
Quoth the chicken, 'Fry some more.' " --- Good Eats
"Yields falsehood when preceded by its quotation! Yields
falsehood when preceded by its quotation!" -- Achilles to
Tortoise (in "Gödel, Escher, Bach" by Douglas Hofstadter)
"If the crib's on fire you don't speculate the baby's
flame retardant." -- Al Gore on global warming
"Strike up the klezmer and start acting like a man. You're
about to have a truth-mitzvah." -- Stephen Colbert, The Colbert Report
I am a fourth-year graduate student at the University of Pennsylvania
in the linguistics Ph.D. program. My
primary academic interest is currently phonetics & models of OT
grammar acquisition, but I'm also interested in formal syntax and semantics.
My main hobby is using technology to improve transparency in the U.S.
Congress. In my spare time I run GovTrack.us, a website that tracks
what's happening in Congress. It basically runs itself. I also
contribute to The Open
House Project and am behind the scenes in a few other civics-related
projects.
My interests in transforming government data and linguistic semantics
have led me to an interest in the Semantic Web. The Semantic Web, in
combination with some advances in linguistic technology, should
dramatically change the way we interact with information. In that
spirit, I maintain a few Semantic Web resources at rdfabout.com.
I spend a lot of my time on computer software. Some of my other
software projects are listed below. My "third" life is as a
component of the structural engineering software company
LARSA, which I've been with for
almost eight years now.
Recently I've also found a new interest in social responsibility,
especially issues related to food. I am a locavore: I try to purchase locally grown
and responsibly raised food when I can.
These are all ongoing projects, though I don't spend much time on all of them.
Civics/Politics
GovTrack.us is a nexus of information
about the United States Congress, primarily tracking the status of legislation.
GovTrack lets you monitor your representatives and legislation in topics
that interest you through email updates, RSS feeds, etc. The site
was mentioned in the NY Times
and the Washington Post.
The site pretty much runs itself. Some 15,000 people visit the site each day. (Since 2003.)
Praat-Py is an extension to the Praat program
for phonetic analysis that allows scripts to be written in Python. (Since 2007.)
Technology (for Technology's Sake)
Sender Verification Extension is a Mozilla
Thunderbird extension for verifying the domain name claimed in the
From: address of emails using SPF and DomainKeys, as a tool to combat
phishing. I'm not actively developing the extension much, mainly just
maintaining it. Apparently some 9,000 people are using the extension! (Since 2004.)
SemWeb .NET Library is a .NET library written
in C# for working with RDF data for the Semantic Web. (Since 2005.)
U.S. Census as RDF:
A 1-billion-triple RDF database of U.S. Census statistics, basically
the largest open, linked, and dereferenceable RDF database of real-world
information that exists.
Other projects that I no longer maintain:
Diff C# library,
PerlSharp (Perl interpreter bindings),
SvnWebView (CGI/Perl
script to browse an SVN repository).
Publications and Conference Papers/Talks
On government transparency:
Talk: Jan. 15, 2008. Open government data policy and a semantic future for civics. Presented in Civics in the Cloud panel at "Computing in the Cloud", Center for InfoTech Policy, Princeton University. text | slides | watch
(You might also be interested in Introduction to RDF, which is more extensive than the above.)
Linguistics:
(Accepted talk: "Predicting Intrasentential Pauses: Is Syntactic Structure Useful?" for Speech Prosody 2008)
Goldilocks Meets the Subset Problem: Evaluating Error Driven Constraint Demotion for OT Learning. To appear in the U. Penn Working Papers of Linguistics 15.1, 2009, Proceedings of the 32nd Penn Linguistics Colloquium. Slides
Joshua Tauberer, Aviad Eilam, and Laurel MacKenzie (Eds.). 2008. Proceedings of the 31st Annual Penn Linguistics Colloquium. Penn Working Papers in Linguistics, Vol. 14.1, Philadelphia, PA.
Tatjana Scheffler, Joshua Tauberer, Aviad Eilam, and Laia Mayol (Eds.). 2007. Proceedings of the 30th Annual Penn Linguistics Colloquium. Penn Working Papers in Linguistics, Vol. 13.1, Philadelphia, PA.
Aviad Eilam, Tatjana Scheffler, and Joshua Tauberer. (Eds.). 2006. Proceedings of the 29th Annual Penn Linguistics Colloquium. Penn Working Papers in Linguistics, Vol. 12.1, Philadelphia, PA.
Sudha Arunachalam, Tatjana Scheffler, Sandhya Sundaresan, and Joshua Tauberer (Eds.). 2005. Proceedings of the 28th Annual Penn Linguistics Colloquium. Penn Working Papers in Linguistics, Vol. 11.1, Philadelphia, PA.
I like old maps, "Scrubs", "The Colbert Report", the em-dash, and the
expression "be that as it may". Oh, and the "hand" "quote" "thing." Also...
Philosophy of Mind
I've always been a little troubled by the fact that I exist. Sometimes
when I think about it too much I actually feel a bit of surprise when
I realize again that, in fact, I am here. (More of my ramblings...)
Complaining is easy and finding a solution is tough, I know,
but there are a lot of clearly-ridiculously-wrong things in
the world that corporations hide from individuals, and in these
cases we can't make progress until we first identify the problems.
Here are some things that ought to be changed:
These are thoughts in progress.
Big media influences politics to create money. As one
example of this, I reported on my suspicion that MSNBC's Oct. 30, 2007
Democratic presidential debate was crafted to allocate time to
candidates in proportion to their latest poll numbers. Obviously
candidates with less time make less of an impression on the public,
and thus are actively put at a disadvantage by the system. When the
pundits later ask "who was the winner" of the debate, their corporate
cohorts have already forced the answer in favor of those candidates
already in the lead. And, to be even more cynical, by limiting the
number of viable candidates, they increase the power of the campaign
contributions they have made to leading candidates.
Credit card restrictions on merchants lead to an
inverted redistribution of wealth. "[C]onsumers who use the cheapest payment systems [i.e. cash] are
likely to end up paying more, and consumers who use expensive payment
systems [i.e. credit cards] are likely to end up paying less
[than they would if merchants adjusted prices according to the cost
of processing the different types of payments they receive, contra
bank-imposed merchant restrictions].
The effect is a sub rosa cross-subsidization of those using the most
expensive payment systems by those using the cheapest. This
cross-subsidization is highly regressive because the poorest
Americans tend to be cash-only consumers." (Adam Levitin (2007):
Priceless? The Social Costs of Credit Card Merchant Restraints)
(The point is also made that the restrictions limit the purchasing power
of food stamps, which are cheap for merchants to process, which means that
taxpayers as a whole are also subsidizing the banks and the rewards. Then, and this is my point not his,
considering the popularity of airline rewards, taxpayers are actually
subsidizing the airline industry through this channel.)
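To make the cross-subsidy concrete, here is a minimal sketch with made-up numbers. It assumes a merchant who wants to net the same amount on every sale, a processing fee charged as a fraction of the price, and a single posted price for everyone (as the merchant restrictions require). The function name, the fee rates, and the 50/50 split between cash and card are all illustrative assumptions, not figures from Levitin's paper.

```python
def prices(net, card_share, card_fee, cash_fee=0.0):
    """Return (cash_price, card_price, uniform_price) for one sale.

    net        -- amount the merchant wants to keep per sale (hypothetical)
    card_share -- fraction of sales paid by card (hypothetical)
    card_fee   -- card processing cost as a fraction of the price
    cash_fee   -- cash handling cost as a fraction of the price
    """
    # Prices the merchant would post if it could charge by payment type.
    cash_price = net / (1 - cash_fee)
    card_price = net / (1 - card_fee)
    # With one posted price, the average fee across all sales is what matters.
    blended_fee = card_share * card_fee + (1 - card_share) * cash_fee
    uniform_price = net / (1 - blended_fee)
    return cash_price, card_price, uniform_price


# Illustrative inputs only: $100 net, half of sales on cards, 2% card fee.
cash_price, card_price, uniform_price = prices(100.0, 0.5, 0.02)
print(f"cash payers overpay by  ${uniform_price - cash_price:.2f}")
print(f"card payers underpay by ${card_price - uniform_price:.2f}")
```

Under these assumptions the single posted price lands between the two payment-specific prices, so cash payers cover part of the card fees. The size of the effect depends entirely on the assumed fee rates and payment mix.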