Re: Oracle Uniprot RDF data set and benchmarks from Ian Wilson on 2006-02-08 (public-semweb-lifesci@w3.org from February 2006)

From: Ian Wilson <Ian.Wilson@uchsc.edu>
Date: Wed, 08 Feb 2006 15:34:58 -0700
To: Eric Jain <Eric.Jain@isb-sib.ch>
CC: public-semweb-lifesci@w3.org
Message-ID: <43EA7212.1000005@uchsc.edu>

Eric Jain said the following on 2/8/2006 4:22 AM:
> Ian Wilson wrote:
>> We will thus want to maintain a local copy of this extract (on the 
>> wiki?) so changes in the graph don't change the benchmarking results.
> 
> The data in http://www.isb-sib.ch/~ejain/rdf/data/ is indeed updated 
> every two weeks, but I could also provide some more stable data sets for 
> benchmarking if there is interest, perhaps with 1M, 10M and 100M triples?
> 

Eric,

That would be great. Since these graphs change over time, would 
archived annual snapshots make sense? Any thoughts on how you 
might derive these subgraphs? You are likely more familiar 
with these graphs than anyone else.

The separate graphs you provide for distribution are already 
nicely divided (e.g. Taxonomy, Sequence, GO), so I was 
thinking that, for benchmarking purposes, it would be nice to 
have these graphs incorporated into the subgraphs defined for 
benchmarking.

Thanks again for the offer, and for this wonderful resource you 
have been maintaining.

Best,
Ian

Received on Wednesday, 8 February 2006 22:36:30 UTC