Berlin SPARQL Benchmarks for my SemWeb .NET Library
Chris Bizer and team have posted a benchmark specification for SPARQL endpoints, the Berlin SPARQL Benchmark (BSBM). They have “run the initial version of the benchmark against Sesame, Virtuoso, Jena SDBÂ and against D2R Server, a relational database-to-RDF wrapper. The stores were benchmarked with datasets ranging from 50,000 triples to 100,000,000 triples” (announcement email).
I ran the benchmark against my SemWeb .NET library. Instructions for setting up the benchmark are here and turned out to be a good example for how very quickly to set up a SPARQL endpoint using my library, backed with your SQL database of choice (in this case MySQL). I had some trouble the first time I ran the benchmark though:
- The first time I ran the tests I found the library had several bugs/limitations: a bug preveting ORDER BY with dataTime values, an error parsing function calls in FILTER expressions, and a glitch in the translation of the query to SQL. I corrected these problems.
- Query 10 must be modified to change the ordering to ORDER BY xsd:double(str(?price)), which adds the cast xsd:double(str(…)), since ordering by the custom USD datatype is not supported and not required to be supported by the SPARQL specification.
- In the same query, in FILTER (?date > “2008-06-20″^^<http://www.w3.org/2001/XMLSchema#date> ), xsd:date comparisons are not a part of the SPARQL spec (as I understand it; dateTime comparisons on the other hand are required by the spec). Such comparisons weren’t implemented in my library, but I went ahead and added it.
Also I have some concerns. First, I am not 100% sure if the results of my library are actually correct. Query 4 seemed to always return no results. Second, queries are largely translated into SQL, and there is a good deal of caching going on at the level of MySQL. The benchmark results then are saying a lot about the best-case run time, and indicate something about the overhead of SPARQL processing, but may not indicate general use performance.
Benchmark results reported below are for my desktop: Intel Core2 Duo at 3.00GHz, 2 GB RAM, 32bit Ubuntu 8.04 on Linux 2.6.24-19-generic, Java 1.6.0_06 for the benchmark tools, and Mono 1.9.1. This seems roughly comparable to the machine used in the BSBM.
Load time (in seconds and triples/sec) is reported below for some of the different data set sizes.
| 50K | 250K | 1M | 5M | 25M | |
|---|---|---|---|---|---|
| Time (sec) | 224 | 16129 | |||
| triples/sec | 4441 | 1544 |
For comparison, load time for the 1M data set was 224 seconds. This is about double-to-2.5 times (worse) the time of Jena SDB (Hash) with MySQL over Joseki3 (117s) and Virtuoso Open-Source Edition v5.0.6 and v5.0.7 (87s), as reported in the BSBM results. For the larger 25M dataset, the load time at 4.5 hours was only 1.2 times slower than Jena SDB but 1.7 times faster than Sesame over Tomcat and 3 times faster than Virtuoso. (But, again, the machines were different.)
Results for query execution are reported below. AQET (Average Query Execution Time, in seconds) is reported below for each of the queries for different data set sizes. The results were roughly comparable again to Jena and Virtuoso. But, again, the three caveats above are worth restating: the query results are not validated to be known to be correct, there is significant caching, and the machine was different than the machine used in BSBM.
| 50K | 250K | 1M | 5M | 25M | |
|---|---|---|---|---|---|
| Query 1 | 0.019184 | 0.049200 | |||
| Query 2 | 0.051187 | 0.048590 | |||
| Query 3 | 0.030508 | 0.079187 | |||
| Query 4 | 0.032693 | 0.075603 | |||
| Query 5 | 0.172283 | 0.342828 | |||
| Query 6 | 0.102105 | 3.277656 | |||
| Query 7 | 0.256491 | 1.108414 | |||
| Query 8 | 0.175357 | 0.572258 | |||
| Query 9 | 0.059674 | 0.088451 | |||
| Query 10 | 0.089215 | 0.322246 |