- From: Ian Wilson <Ian.Wilson@uchsc.edu>
- Date: Wed, 08 Feb 2006 15:58:21 -0700
- To: Jim Hendler <hendler@cs.umd.edu>
- CC: Susie Stephens <susie.stephens@oracle.com>, public-semweb-lifesci@w3.org
Jim Hendler said the following on 2/8/2006 2:29 PM:
> I love this idea, but I would go a bit further - be even nicer for us
> non-biologists if it also included some example queries to run (and
> maybe even the correct answer sets) - I think if that existed, we could
> push some of the triple store developers to use it as a benchmark, which
> would help both communities...
>
Agreed. The Oracle paper outlines six different queries, which
is a good starting point. Ideally, though, the data, the queries,
and the expected answer sets would all be incorporated into a
test harness. Similar efforts are underway at the SIMILE project,
which I have been loosely involved with through Vineet Sinha:
http://simile.mit.edu/repository/shootout/trunk/shootout/
http://simile.mit.edu/repository/shootout/trunk/shootout-core/
Another similar project, which I haven't seen mentioned before
but have found useful, is here:
http://tripletest.sourceforge.net/
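As a rough illustration of what such a test harness might look like, here is a minimal Python sketch. It is not taken from the SIMILE or tripletest code; `BenchmarkCase`, `run_benchmark`, and the `execute` callable (standing in for whatever triple store is under test) are hypothetical names chosen for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class BenchmarkCase:
    name: str
    query: str
    # Known-correct answer set, if one has been published for the query.
    expected: Optional[frozenset] = None


def run_benchmark(execute: Callable[[str], Iterable[tuple]], cases):
    """Run each case through `execute`, recording timing and correctness."""
    results = []
    for case in cases:
        start = time.perf_counter()
        # Rows are collected into a set, so duplicate rows are ignored.
        rows = frozenset(execute(case.query))
        elapsed = time.perf_counter() - start
        # `correct` stays None when no reference answer set was supplied.
        correct = None if case.expected is None else rows == case.expected
        results.append({
            "name": case.name,
            "rows": len(rows),
            "seconds": elapsed,
            "correct": correct,
        })
    return results
```

Each store would only need to supply its own `execute` function, so the same cases and answer sets could be shared between implementations.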
For anyone who has not read the Oracle paper, I have copied their
query table into an ASCII-friendly format below:
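Description                   | Pattern       | Projection | Result limit
------------------------------+---------------+------------+-------------
Q1: Display the ranges of     | 6 triples     | 3 vars     | 15000 rows
    transmembrane regions     | 5 vars        |            |
------------------------------+---------------+------------+-------------
Q2: List proteins with        | 5 triples     | 3 vars     | 10 rows
    publications by authors   | 5 vars        |            |
    with matching names       | 1 LIKE pred.  |            |
------------------------------+---------------+------------+-------------
Q3: Count the number of       | 3 triples     | 0 vars     | 32 rows
    times a publication by a  | 2 vars        |            |
    specific author is cited  |               |            |
------------------------------+---------------+------------+-------------
Q4: List resources that       | 3 triples     | 1 var      | 3000 rows
    are related to proteins   | 2 vars        |            |
    annotated with a specific |               |            |
    keyword                   |               |            |
------------------------------+---------------+------------+-------------
Q5: List genes associated     | 7 triples     | 3 vars     | 750 rows
    with human diseases       | 5 vars        |            |
------------------------------+---------------+------------+-------------
Q6: List recently             | 2 triples     | 2 vars     | 8000 rows
    modified entries          | 2 vars        |            |
                              | 1 range pred. |            |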
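
For use in a harness, the figures in the table above could be kept in machine-readable form. The following Python sketch is one possible encoding; the `QuerySpec` class and its field names are made up for this example and do not come from the Oracle paper, and it assumes "Result limit" is a single column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuerySpec:
    name: str
    triples: int               # triple patterns in the graph pattern
    pattern_vars: int          # variables used in the pattern
    projected_vars: int        # variables returned by the query
    result_limit: int          # row limit applied to the result
    extra_predicate: str = ""  # additional predicate type, if any


# Values transcribed from the table in this message.
QUERIES = [
    QuerySpec("Q1", 6, 5, 3, 15000),
    QuerySpec("Q2", 5, 5, 3, 10, "LIKE"),
    QuerySpec("Q3", 3, 2, 0, 32),
    QuerySpec("Q4", 3, 2, 1, 3000),
    QuerySpec("Q5", 7, 5, 3, 750),
    QuerySpec("Q6", 2, 2, 2, 8000, "range"),
]

for spec in QUERIES:
    print(f"{spec.name}: {spec.triples} triples, limit {spec.result_limit} rows")
```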
---------------------------------------
Q1 (the only actual query provided)
---------------------------------------
SELECT AVG(LENGTH(protein)), AVG(LENGTH(begin)),
       AVG(LENGTH(end))
FROM TABLE(RDF_MATCH(
  '(?p rdf:type up:Protein)
   (?p up:annotation ?a)
   (?a rdf:type up:Transmembrane_Annotation)
   (?a up:range ?range)
   (?range up:begin ?begin)
   (?range up:end ?end)',
  RDFModels('UniProt'), NULL, NULL))
WHERE rownum <= 15000;
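
As a quick sanity check, the graph pattern in Q1 can be compared against the "6 triples, 5 vars" listed for it in the table. The Python sketch below does this with two regular expressions; `summarize_pattern` is a hypothetical helper written for this example, and it assumes each triple pattern sits in its own pair of parentheses, as it does in the query above.

```python
import re

# The graph pattern from Q1, copied from the query above.
Q1_PATTERN = """
(?p rdf:type up:Protein)
(?p up:annotation ?a)
(?a rdf:type up:Transmembrane_Annotation)
(?a up:range ?range)
(?range up:begin ?begin)
(?range up:end ?end)
"""


def summarize_pattern(pattern: str):
    """Return (number of triple patterns, sorted distinct variable names)."""
    triples = re.findall(r"\(([^()]*)\)", pattern)
    variables = sorted(set(re.findall(r"\?(\w+)", pattern)))
    return len(triples), variables


count, variables = summarize_pattern(Q1_PATTERN)
print(count, variables)  # 6 ['a', 'begin', 'end', 'p', 'range']
```

The counts agree with the table. Note that the SELECT list refers to `protein` while the pattern binds `?p`; presumably the two names would need to match before the query could run.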
Ian
Received on Wednesday, 8 February 2006 22:59:51 UTC