Archive for the ‘Code’ Category

Some programming-related posts that don’t fall into other categories.

Programming Language Syntax

Friday, April 22nd, 2005

If you know a little bit about how compilers work, you know that the syntax of programming languages is context-free. That is to say that each syntactic element of the language can be described as a list of sub-elements, regardless of the context it appears in. For example, a while-loop in C# is (roughly) the keyword ‘while’, followed by an expression, followed by a statement (or block of statements). It doesn’t matter where the loop appears; that syntactic definition is always the same. This is basically the idea of a context-free grammar (CFG).
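To make the while-loop description concrete, here is a minimal sketch in Python of a recursive-descent checker for a toy grammar. The grammar is my own simplification for illustration (a condition is just a single name, and tokens must be separated by spaces); it is not the real C# grammar, and the function names are made up.

```python
# Toy grammar (an illustrative simplification, not the real C# grammar):
#   statement  -> while_stmt | block | simple
#   while_stmt -> 'while' '(' NAME ')' statement
#   block      -> '{' statement* '}'
#   simple     -> NAME ';'

RESERVED = {"while", "(", ")", "{", "}", ";"}


def is_name(tok):
    """A NAME is any identifier-like token that is not reserved."""
    return tok not in RESERVED and tok.isidentifier()


def expect(tokens, i, tok):
    """Check that tokens[i] is tok and return the index after it."""
    if i >= len(tokens) or tokens[i] != tok:
        raise SyntaxError(f"expected {tok!r}")
    return i + 1


def parse_while(tokens, i):
    # The same rule applies wherever the loop appears: that is the
    # "context-free" part.
    i = expect(tokens, i, "while")
    i = expect(tokens, i, "(")
    if i >= len(tokens) or not is_name(tokens[i]):
        raise SyntaxError("expected a condition")
    i = expect(tokens, i + 1, ")")
    return parse_statement(tokens, i)


def parse_statement(tokens, i):
    """Parse one statement starting at tokens[i]; return the index after it."""
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[i] == "while":
        return parse_while(tokens, i)
    if tokens[i] == "{":
        i += 1
        while i < len(tokens) and tokens[i] != "}":
            i = parse_statement(tokens, i)
        return expect(tokens, i, "}")
    if is_name(tokens[i]):
        return expect(tokens, i + 1, ";")
    raise SyntaxError(f"unexpected token {tokens[i]!r}")


def is_valid(source):
    """True if the whitespace-separated source is exactly one statement."""
    tokens = source.split()
    try:
        return parse_statement(tokens, 0) == len(tokens)
    except SyntaxError:
        return False
```

Note that parse_while is called the same way whether the loop is at the top level or nested inside another loop's body; nothing about the surrounding code changes what counts as a well-formed loop.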

Natural languages (i.e. human languages) are not context-free. It’s impossible to come up with a (concise) list of CFG rules, such as “a sentence is a noun phrase, followed by a verb, followed by a noun phrase; and a noun phrase is an article (a ‘determiner’ in the biz) followed by a noun”, to describe English, for instance. Rules like that will work for simple sentences like “a man walked a dog”, but not for sentences like “which dog do you think a man walked?”
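The quoted rules are small enough to write down and try. The following is a rough sketch in Python, assuming a tiny made-up lexicon (the word lists and the names GRAMMAR, match and is_sentence are mine, purely for illustration). It accepts the simple sentence and rejects the question, which is all it is meant to show.

```python
# The rules from the paragraph above, written as a table.
# Symbols that are not keys in the table are treated as literal words.
GRAMMAR = {
    "S": [["NP", "V", "NP"]],      # sentence -> noun phrase, verb, noun phrase
    "NP": [["Det", "N"]],          # noun phrase -> determiner, noun
    "Det": [["a"], ["the"]],       # assumed lexicon, for illustration only
    "N": [["man"], ["dog"]],
    "V": [["walked"]],
}


def match(symbol, words, i):
    """Yield every index j such that words[i:j] can be derived from symbol."""
    if symbol not in GRAMMAR:
        # Terminal: it must equal the next word.
        if i < len(words) and words[i] == symbol:
            yield i + 1
        return
    for production in GRAMMAR[symbol]:
        positions = [i]
        for part in production:
            positions = [k for j in positions for k in match(part, words, j)]
        yield from positions


def is_sentence(text):
    """True if the whole text can be derived from S."""
    words = text.lower().rstrip("?.").split()
    return len(words) in match("S", words, 0)
```

Here is_sentence("a man walked a dog") is true, while is_sentence("which dog do you think a man walked?") is false. Adding “which”, “do”, “you” and “think” to the lexicon would not be enough to fix that: in the question, the object of “walked” shows up at the front of the sentence, and these rules have no way to relate the two positions.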

Now, this raises the question of why we don’t program in languages that closely resemble natural languages in terms of syntactic structure. Wouldn’t that make it easier to program? There’s a good reason why we don’t do that, actually: no one knows what the syntax of natural languages looks like. Try as we might, we still don’t fully understand it.

The reason I’m writing this is that I just came back from a symposium in honor of one of my professors (I study the syntax of natural language, by the way) who invented Tree Adjoining Grammar. TAG is a type of syntactic formalism that can actually be used to describe English fairly well, in a way that CFGs don’t even come close to. At a very high level, TAG adds to CFG the ability to splice together two units of structure. I was wondering whether a TAG-based programming language syntax would let us program with new types of syntactic sugar, although I think the answer is that nothing interesting would come out of it.

Diffing and RDF

Saturday, March 5th, 2005

If you’re reading this, you’re probably reading this on Monologue, and that means I’ve successfully added myself to Monologue. :-)

Recently I got a helpful bug report for my Diff library for C# which pointed out that my port of Perl’s Algorithm::Diff wasn’t generating the same diffs as the original module. I fixed the bug and reposted a new version of the library.
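For anyone curious what sits underneath a diff, here is a minimal sketch in Python of the longest-common-subsequence (LCS) idea that Algorithm::Diff is built around. This is not the code of the Perl module or of my C# port, and the function names are made up; it is a plain textbook dynamic-programming version. When several equally short diffs exist, the tie-breaking choice in the sketch decides which one you get, so two correct implementations can still produce different output for the same input.

```python
def lcs_table(a, b):
    """table[i][j] = length of the longest common subsequence of a[i:] and b[j:]."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def diff(a, b):
    """Return a list of (op, item) pairs: ' ' unchanged, '-' removed, '+' added."""
    table = lcs_table(a, b)
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append((" ", a[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            # Tie-breaking choice: prefer a removal when both moves are equally good.
            out.append(("-", a[i]))
            i += 1
        else:
            out.append(("+", b[j]))
            j += 1
    out.extend(("-", item) for item in a[i:])
    out.extend(("+", item) for item in b[j:])
    return out
```

For example, diff(list("abc"), list("abd")) keeps “a” and “b”, removes “c”, and adds “d”.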

In unrelated news, I’m working on building the semantic web for information about the U.S. government. This is a spin-off of my work on GovTrack (which is powered by Mono). To get this web built, I’m in the position of having to convince people that RDF is the right way to approach the problem of distributed information — over, for instance, XML, XML Schema, and XQuery. The problem is that RDF is complicated and often misunderstood, and I hadn’t found a good document explaining what RDF is and why it should be used for this. So, I wrote one. I’m not a master of RDF by any means, so any corrections and suggestions are welcome.
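As a very rough illustration of the appeal, here is a sketch in Python that treats RDF data as nothing more than a set of (subject, predicate, object) triples. Everything in it is invented for the example: the example.org URIs, the vocabulary terms, the person and the bill. It is not GovTrack data and not the API of any RDF library. The point it tries to show is that two independently published sets of triples can be combined with a plain set union, with no schema to reconcile first.

```python
EX = "http://example.org/"  # placeholder namespace, not a real vocabulary

# Two sources published independently (all data here is invented).
source_a = {
    (EX + "people/smith", EX + "terms/name", "Jane Example"),
    (EX + "people/smith", EX + "terms/memberOf", EX + "bodies/senate"),
}
source_b = {
    (EX + "people/smith", EX + "terms/sponsored", EX + "bills/b1"),
    (EX + "bills/b1", EX + "terms/title", "An Example Bill"),
}

# Merging graphs is just set union: shared URIs link the data together.
merged = source_a | source_b


def query(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def titles_of_bills_sponsored_by(graph, person):
    """Follow person -> sponsored -> bill -> title across the graph."""
    titles = []
    for _, _, bill in query(graph, s=person, p=EX + "terms/sponsored"):
        for _, _, title in query(graph, s=bill, p=EX + "terms/title"):
            titles.append(title)
    return titles
```

With only source_a loaded, titles_of_bills_sponsored_by finds nothing for the person; on the merged graph it finds the bill title, even though neither source was written with the other in mind.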

By the way, if you’re interested in building this political semantic web, join the GovTrack mail list.

Lastly, with my new interest in RDF, I was looking for a good C# library for working with RDF data models. I didn’t find one that I particularly liked (there are a few out there, but for various reasons I just couldn’t see myself using them), so I’m working on my own. I’ll post the source in a few weeks, probably.

Mail Lists, Diffs, XPD

Wednesday, November 3rd, 2004

For anyone potentially reading, I’ve set up new mail lists for GovTrack and my Thunderbird SPF Extension. If you have an interest in either thing, please visit the site and join the list.

I’ve also posted a library for diffing/merging/patching written in C#, based on the Perl module Algorithm::Diff. And I posted the source for XPD, the XML pipeline document generation engine that I wrote to power GovTrack. These things have helped me; I hope they help you.

GovTrack is now set up on a new server, and it’s much, much more responsive than it used to be. In fact, you can no longer tell that it’s doing lots of XSLT transformations on each request.