Archive for March, 2008

More money and votes: Now I know how to explain the problem

Thursday, March 27th, 2008

Let me give you two headlines, and you can tell me your reaction to each:

A) Big Oil Finances the Republican Party
B) Congressional Votes Correlated with Big Oil Contributions

From the headlines you’d think the articles are about two separate facts about the world. That is, that the two facts are independent. One can read the first article and then still be surprised after reading through the second article. A friend of mine says about the second hypothetical headline, “I think the votes are sort of taking it to the next step of association.” Compare those with these headlines:

A) Factory Emissions Tied to Deadly Cancer
B) Life Expectancy Lower in Factory Towns

Here you’d say the articles are about the same thing: clearly, because more deadly cancer means more deaths, and that means lower life expectancy. If everyone already knew (A) from an exposé last year, when a newspaper reports (B) we say what an idiot the reporter must have been: he just reported the same thing again.

That’s what’s happening with money-and-votes analyses. The facts are more complicated, and the results are less clear, so it’s easy to overlook the problem here. But the problem is here.

We all already know that Big Oil gives more to Republicans than to Democrats. If you didn’t know, you know now. It’s an interesting point; no qualms there. Since the Republicans support big business and the Democrats support the environment (whatever, you get the idea), it makes sense. Fine.

But because of that, I know immediately that any vote related to Big Oil is likely to show an uneven distribution of Big Oil money between Yes votes and No votes. In fact, it has nothing at all to do with Republicans and Democrats having different views on oil in particular. It’s just that Republicans and Democrats either all vote together (on naming post offices) or vote against each other (on everything else). The votes where everyone agrees are not relevant here: you can’t have an uneven distribution of money between Yes and No votes when there aren’t No votes to begin with. Since the relevant votes are almost always split on party lines, of course there is going to be a correlation between money and votes.

What I mean by “of course” is that no one should be surprised to learn about a big correlation between money and votes if it has already been established that there is a correlation with contributions to a particular party. Finding out the magnitude, in dollars, of the correlation doesn’t change anything. It might as well be reported as “Big Oil Gives $XXX more to Republicans than to Democrats.” This headline is less exciting, but it’s the same thing. Throwing in votes just makes it sound more important, and it is misleading because it makes it sound like there is something new and nonobvious to be learned.

Let’s look at what is being reported. I paraphrase from Follow the Oil Money (sorry guys):

Of the 25 Representatives who took the most Big Oil money per term between 2000 and 2007, 23 were Republicans. Of the 25 Representatives who took the least amount of Big Oil money per term between 2000 and 2007, 22 were Democrats. … Representatives who voted against clean energy proposals took more than 4.5 times more oil money than those who voted in the public interest.

Why not just say “Republicans” took more than 4.5 times more oil money than “Democrats”? (Well, the number may change a bit of course, but that’s the idea.) The votes have nothing to do with it unless it is shown that the votes were decided on something other than rough party lines. Whether the money influenced the votes, or whether votes influence future contributions, is another question entirely, and an unsolved one.

I raised a similar issue previously with some numbers from MAPLight. To paraphrase their analysis, interjecting my own totals:

Opponents of H.R. 1424 gave an average of $22,479 to Republicans and $12,646 to Democrats. These industry groups gave an average of $22,693 to legislators who voted No on this bill, compared to $14,183 to legislators who voted Yes.

Can you guess what happened? It’s no accident that $22,479 is close to $22,693 and $12,646 is close to $14,183. The Republicans predominantly voted No and the Democrats predominantly voted Yes. Actually, the vote wasn’t split exactly along party lines, which makes the results more interesting. You can still see an effect of money on the vote beyond the party difference, as I noted in that post, but it’s a much smaller correlation.

What do you want to do with word frequencies?

Monday, March 10th, 2008

John Wonderlich wrote:

After defining (and normalizing) the likelihood that words appear in text, you could start making comparisons between bodies of work, and creating interesting tag-cloudish visualizations of what distinguishes some text you’d like to analyze. You could build a widget for your blog that says “the following are the words that are more than 25% more likely to be used on this blog than they are to be used in New York Times cover stories”, or, “here are recent news stories that also have similarly unlikely words used.”

I don’t know how people usually do cloud visualizations, but if I were making a word cloud, that’s *precisely* what I would do (i.e. this is probably how people do it).

See:
http://en.wikipedia.org/wiki/TFIDF

http://en.wikipedia.org/wiki/Latent_Semantic_Indexing
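As a rough sketch of the comparison John describes, the snippet below normalizes word counts for two bodies of text and lists the words that are at least 25% more likely in the first than in the second. The function names, the tokenizer, the add-one smoothing (which keeps words missing from one text from causing a division by zero), and the toy texts are all my own assumptions for illustration, not part of his proposal.

```python
import re
from collections import Counter

TOKEN = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN.findall(text.lower())


def smoothed_frequencies(tokens, vocabulary):
    """Relative frequency of each vocabulary word, with add-one smoothing."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocabulary)
    return {word: (counts[word] + 1) / total for word in vocabulary}


def distinctive_words(text, reference, threshold=1.25):
    """Words at least `threshold` times as likely in `text` as in `reference`."""
    text_tokens = tokenize(text)
    ref_tokens = tokenize(reference)
    vocabulary = set(text_tokens) | set(ref_tokens)
    text_freq = smoothed_frequencies(text_tokens, vocabulary)
    ref_freq = smoothed_frequencies(ref_tokens, vocabulary)
    ratios = {w: text_freq[w] / ref_freq[w] for w in vocabulary}
    # Most distinctive first; ties broken alphabetically.
    return sorted(
        (w for w, r in ratios.items() if r >= threshold),
        key=lambda w: (-ratios[w], w),
    )


# Toy stand-ins for "this blog" and "New York Times cover stories".
blog = "money votes money congress votes money"
news = "congress weather congress sports votes weather"
print(distinctive_words(blog, news))
```

With real text you would want far larger samples, and probably a stop-word list, before trusting any of the ratios.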

Now, the thing is that word counts actually don’t get you very much information. Think back to the days before Google: search engines gave you back documents by matching words and returning those where your search terms appeared most frequently. Then Google came along and ranked documents differently and we all saw how *awful* word frequency was for determining relevance to a query.

So the question is what you would use word counts *for*. Clouds are nice, but look for cases where words aren’t exactly the appropriate level of chunking to identify relevance. (And, you will see this in most word clouds.) Articles back in 2004 about the Democratic ticket might have used the word “John” an exceptional amount owing to the dynamic duo’s shared first name, but “John” in a word cloud isn’t very informative. You’d want to chunk whole names together, but that’s a difficult problem in itself.

Note also for comparing documents that the frequency of a word isn’t very indicative of a word’s prominence in a text, and if you have a profile (i.e. vector) of word frequencies for two documents, it’s not immediately obvious how you would compare profiles to arrive at whatever result you want. (Not to say there aren’t ways to do it, but that there are many ways to do it.)
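To make “many ways to do it” concrete, here is one of them: cosine similarity between raw word-count vectors. It is only a sketch of a single common choice, with a function name and toy documents that I made up, and it inherits the chunking problem: two texts that both lean on the word “john” look similar even when they are about different people.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    a = Counter(doc_a.lower().split())
    b = Counter(doc_b.lower().split())
    # Only words present in both texts contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical snippets: the shared first name alone drives the score.
ticket = "john kerry john edwards"
others = "john mccain john roberts"
print(round(cosine_similarity(ticket, others), 3))
```

Swapping in a different weighting (TF-IDF, say) or a different distance would give a different ranking of “similar” documents, which is exactly why the choice matters.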

Money is not quite so big of an incentive for voting with your wallet

Friday, March 7th, 2008

I like to be devil’s advocate among my friends, and since MAPLight and Sunlight are some of my friends, they can’t get out of a careful look over their analyses. Ellen writes on her blog about an analysis provided by MAPLight of the correlation of contributions to representatives and their vote on H.R. 1424 (bill | vote | MAPLight page):

They found that those “interested” in the legislation, both pro and con, gave over $8,000 more to the individual legislators who voted the way they wanted them to. A press release from Maplight.org gives more detail:

Opponents–such as Accident and Health Insurance, Big Business, Chambers of Commerce, Restaurant and Manufacturing, Retail and Wholesale Trade gave an average of $22,693 to legislators who voted No on this bill, compared to $14,183 to legislators who voted Yes. The disparity is 160% [JT- that’s 60%!] more money given to a No vote.

Supporters–such as Health and Welfare, Mental Health care-givers, Mental Health Services, Clergy and Non-profit–gave an average of $4,242 to legislators who voted Yes on this bill, compared to $1,812 to legislators who voted No. The disparity is 234% more money [JT- should be 134%] given to a Yes vote, or $2,430.

…. Dan Newman, MAPLight.org’s director, … points out that campaign contributions are just one factor in determining how a legislator votes, and they do not claim one caused the other. “We do make the claim, however, that campaign contributions bias our legislative system,” he adds. “Simply put, candidates who take positions contrary to industry interests are unlikely to receive industry funds and thus have fewer resources for their election campaigns than those who vote in favor.”

I don’t suggest the numbers reported are wrong (well, actually, the percent changes are wrong), but the relevant disparity in money, as far as it could be a tempting motivation for a legislator to change his position, is much smaller than MAPLight reports.
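The percent mix-up is easy to check. The sketch below uses the four averages quoted in the press release; the two helper names are mine. “X% more” is the difference divided by the smaller figure, while the press release’s numbers are the larger figure expressed as a percentage of the smaller one.

```python
def percent_more(larger, smaller):
    """Percent by which `larger` exceeds `smaller`."""
    return (larger - smaller) / smaller * 100


def percent_of(larger, smaller):
    """`larger` expressed as a percentage of `smaller`."""
    return larger / smaller * 100


# Average contributions quoted in the press release (dollars).
opponents_no, opponents_yes = 22693, 14183
supporters_yes, supporters_no = 4242, 1812

for label, big, small in [
    ("opponents", opponents_no, opponents_yes),
    ("supporters", supporters_yes, supporters_no),
]:
    print(label, round(percent_of(big, small)), round(percent_more(big, small)))
```

The first number in each printed row is the figure the press release calls the disparity; the second is how much more money was actually given.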

The trouble with MAPLight’s analysis of the correlation, even putting causality aside, is that contributions are correlated with party membership, and so are votes. So it’s no surprise there is a correlation between money and votes. If I give only to Democrats and equally to all Democrats, it will appear as if I’m giving money only to those voting in favor of Democratic issues — even though my contributions have not taken into account any particular issue position. Further, and importantly, even though you will see this correlation between my money and votes, it does not mean there is any incentive for a Democrat to change his position on an issue. That’s because in my hypothetical I am giving equally to all Democrats. The only incentive is for a Republican to become a Democrat to get some of my money, but that rarely happens. Bottom line: correlation doesn’t immediately establish incentive.
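Here is that hypothetical as a toy calculation. Every number in it is invented: a 100-member chamber, a donor who gives $1,000 to each Democrat and nothing to any Republican, and a vote that splits 45 to 5 within each party. The overall averages by vote look like money following votes, yet within the Democrats there is no difference at all.

```python
from statistics import mean

# Invented chamber: (party, vote, contribution from my imaginary donor).
legislators = (
    [("D", "Yes", 1000)] * 45
    + [("D", "No", 1000)] * 5
    + [("R", "Yes", 0)] * 5
    + [("R", "No", 0)] * 45
)


def average_by_vote(rows, party=None):
    """Average contribution to Yes and No voters, optionally within one party."""
    result = {}
    for vote in ("Yes", "No"):
        amounts = [
            money
            for member_party, member_vote, money in rows
            if member_vote == vote and (party is None or member_party == party)
        ]
        result[vote] = mean(amounts)
    return result


overall = average_by_vote(legislators)
democrats = average_by_vote(legislators, party="D")
print("all members:", overall)
print("Democrats only:", democrats)
```

The gap in the first line comes entirely from party membership; the second line is the one that speaks to incentive.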

Returning to H.R. 1424, what we need to do is split the Members by party. The incentive for a Democrat can only be established by looking at the money going to Democrats. In this case, only three Democrats voted No on the bill, and three Democrats is not a large enough sample to come to any conclusions about anything (t-test be damned).

As for the Republicans, industry groups opposing the bill gave an average of $22,850 to Republicans voting against and $19,525 to those voting in favor (leaving out a clear outlier, in MAPLight’s favor). Yes, more money went to those voting against, but only $3,325. That’s a 17% difference, not a 60% difference. (It’s also a relatively small amount compared to the variability in the contributions just within the yes or no vote groups separately.)

That just leaves the contributions to Republicans from industry groups supporting the bill. Here MAPLight’s point stands. An average of $3,630 went to those voting yes and only $1,865 to those voting No. That’s a big difference, around $1,765, but still smaller than what MAPLight reported.

So here’s the bottom line: The incentive for Members of Congress to vote according to their war chests is far smaller than what is evident from MAPLight’s analysis because representatives are not competing for money going to the other party. By looking at Republicans alone, we see that it is true that money from groups supporting the bill went more to those voting in favor of the bill, but with a difference of only $1,765 (nevertheless, nearly a 100% increase over the no-vote amount). However, while there was a lot more money at play from groups against the bill, the difference between the yes voters and no voters was $3,325 (a 17% increase over the smaller of the figures), a much smaller incentive than the $8,500 reported by MAPLight.
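For anyone who wants to check the arithmetic, the sketch below recomputes the within-party gaps from the Republican averages quoted above and sets them beside the all-member gap from MAPLight’s figures. The variable and function names are mine; the dollar amounts are the ones already given in this post.

```python
def gap(favored, other):
    """Dollar gap and percent increase of `favored` over `other`."""
    difference = favored - other
    return difference, difference / other * 100


# Average contributions to Republicans only (dollars).
opposing_groups = gap(favored=22850, other=19525)    # No voters vs. Yes voters
supporting_groups = gap(favored=3630, other=1865)    # Yes voters vs. No voters

# MAPLight's all-member averages from groups opposing the bill.
maplight_overall = gap(favored=22693, other=14183)

for label, (dollars, percent) in [
    ("opposing groups, Republicans only", opposing_groups),
    ("supporting groups, Republicans only", supporting_groups),
    ("opposing groups, all members", maplight_overall),
]:
    print(f"{label}: ${dollars:,} ({percent:.0f}% more)")
```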

Party Transparency: Isn’t there an elephant in this room?

Friday, March 7th, 2008

A shiver, well at least a small one, goes down my spine every time I see transparency and claims about fairness mixed in with party politics. There are two big issues running around, the first being superdelegates, the back-room deals, and uncertainty over the fairness of a confusing multi-level delegate-based system to choose party candidates. What bothers me here is that registered Democrats choose to be registered Democrats. Government transparency is different: if you live here, you do not choose to be subject to U.S. law, and you have no alternative governments to choose from. In party politics you are free to choose any party or start your own.

I’m not so heartless as to deny that it’s unfortunate that the decision-making process to choose the national candidates is as opaque as it is, but why isn’t anyone talking about why people actually aren’t free to choose alternative parties? That’s the elephant that ought to be in this room. In commerce, when things are unfair for a lack of options we cry monopoly and get things rectified by the FTC. In politics, why isn’t anyone complaining of the same?

The second issue is the so-portrayed disenfranchisement of Michigan and Florida Democratic voters on account of their states flouting the national committee’s directive over primary dates. Do we penalize the voters there for the actions of their state party leaders? I don’t see how the voters are being penalized. The voters elected their party leaders to make the decision over the primary dates: It’s too bad their elected leaders did something stupid once in office (as elected officials often do, right?). Clearly the public acquiesced to the decision in any case. What’s the recourse? Besides switching parties, citizens can vote to fire the elected officials when the next election comes around.

But where’s the elephant? It’s difficult to fire party leaders when they control the candidate selection process. Do I vote Republican in the next general election, going against my core beliefs, because the incumbent Democrat goofed on a non-governmental issue? Probably not. There obviously won’t be a serious Democratic challenger either, and certainly not one who is going to use this as a campaign issue if he wants any support from his party.

For good reason there are few legal restrictions on how parties operate internally; after all, free and fair elections mean freedom from government oversight. But without rules imposed from above, there needs to be freedom of choice. That’s the real issue here, not transparency and accountability.