Re: The UniProt database in RDF format from Eric Jain on 2004-07-22 (public-semweb-lifesci@w3.org from July 2004)

From: Eric Jain <Eric.Jain@isb-sib.ch>
Date: Thu, 22 Jul 2004 09:08:39 +0200
To: John Wilbanks <wilbanks@w3.org>
Cc: public-semweb-lifesci@w3.org
Message-ID: <40FF67F7.3020309@isb-sib.ch>

John Wilbanks wrote:
> what do you mean by "a mechanism for grouping together
> a set of statements in a file would be welcome"

The problem here is that we do not manage data at the level of 
individual statements or resources, but in larger units, for example by 
protein. A protein may be described by several resources and many 
statements. Some resources are described in detail in a different data 
set, and therefore only need to be referenced. Other resources are 
specific to a protein, and therefore need to be stored along with the 
other data on that protein.

While most people are happy with being able to retrieve data for 
individual proteins from a web server, some need to download the 
complete data set. As there are more than a million proteins, 
distributing the data in separate files, one per protein, is not 
practical (I couldn't find any implementation of zip/unzip that could 
handle this :-). But if all the data is merged into one file, it is no 
longer trivial to reconstruct the original sets of statements.

Note that TriX introduces a solution for "grouping statements" with the 
help of a "graph" element. Jena, on the other hand, has the concept of 
"models".
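
To illustrate the kind of grouping meant here, a minimal sketch in 
plain Python (no RDF library): each statement carries the name of the 
group it belongs to, so the original per-protein sets can be rebuilt 
after everything has been merged into one file. The identifiers 
(ex:proteinA, ex:name and so on) and the function name are made up for 
the example. This is not how TriX or Jena implement grouping, only the 
general idea.

```python
from collections import defaultdict

# Hypothetical merged data set: every statement is stored as a quad
# (group, subject, predicate, object). The group name plays the role
# of a named graph; all identifiers below are invented placeholders.
MERGED_STATEMENTS = [
    ("ex:proteinA", "ex:proteinA", "ex:name", "Example protein A"),
    ("ex:proteinA", "ex:commentA1", "ex:annotates", "ex:proteinA"),
    ("ex:proteinB", "ex:proteinB", "ex:name", "Example protein B"),
    ("ex:proteinB", "ex:proteinB", "ex:organism", "ex:taxon1"),
]


def group_statements(quads):
    """Rebuild the per-group sets of triples from a flat list of quads."""
    groups = defaultdict(list)
    for group, subject, predicate, obj in quads:
        groups[group].append((subject, predicate, obj))
    return dict(groups)


groups = group_statements(MERGED_STATEMENTS)
for name, triples in groups.items():
    print(name, len(triples), "statements")
```

Without the extra group column, the statement about ex:commentA1 could 
only be assigned to a protein by guessing from the statements 
themselves, which is the reconstruction problem described above.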

Received on Thursday, 22 July 2004 03:10:12 UTC