Re: Oracle Uniprot RDF data set and benchmarks

At 9:26 -0500 2/8/06, Susie Stephens wrote:
>I will find out more about the Uniprot subgraph that we used for the 
>VLDB paper, and see if we can make it available.
>
>However, I really like Eric Jain's offer of providing stable data 
>sets of different sizes for benchmarking. It makes sense to me to 
>have an independent organization providing the data sets.
>
>Susie

I love this idea, but I would go a bit further: it would be even nicer 
for us non-biologists if it also included some example queries to run 
(and maybe even the correct answer sets). If that existed, I think we 
could push some of the triple store developers to use it as a 
benchmark, which would help both communities.

>Eric Miller wrote:
>
>>
>>
>>  On Feb 8, 2006, at 6:22 AM, Eric Jain wrote:
>>
>>>
>>>  Ian Wilson wrote:
>>>
>>>>  We will thus want to maintain a local copy of this extract (on 
>>>>the wiki?) so changes in the graph don't change the benchmarking 
>>>>results.
>>>
>>>
>>>  The data in http://www.isb-sib.ch/~ejain/rdf/data/ is indeed 
>>>updated every two weeks, but I could also provide some more stable 
>>>data sets for benchmarking if there is interest, perhaps with 1M, 
>>>10M and 100M triples?
>>
>>
>>  I think this would be extremely useful for a variety of 
>>communities trying to assess issues of scalability; the more 
>>"connected" graph subsets for testing, the better.
>>
>>  thanks in advance!
>>
>>  -- eric miller                              http://www.w3.org/people/em/
>>  semantic web activity lead               http://www.w3.org/2001/sw/
>>  w3c world wide web consortium            http://www.w3.org/
>>
>>
>>

-- 
Professor James Hendler			  Director
Joint Institute for Knowledge Discovery	  	  301-405-2696
UMIACS, Univ of Maryland			  301-314-9734 (Fax)
College Park, MD 20742	 		  http://www.cs.umd.edu/~hendler
Web Log: http://www.mindswap.org/blog/author/hendler

Received on Wednesday, 8 February 2006 21:30:06 UTC