- From: Dave Reynolds <der@hplb.hpl.hp.com>
- Date: Fri, 20 Aug 2004 16:51:34 +0100
- To: Eric Jain <Eric.Jain@isb-sib.ch>
- Cc: Massimo Marchiori <massimo@w3.org>, public-semweb-lifesci@w3.org
Eric Jain wrote:

>> 3. Interesting: do your use cases have absolute needs
>> for reification, or it's just a convenience? Is
>> your only use just use case 6 (provenance)?
>
> Consider this example: A protein may occur in one or more organisms. We
> may need to indicate who observed this protein in a specific organism,
> and cite a relevant publication etc. This information obviously can't be
> attached to either the protein or the taxon resource. We could create
> intermediary resources for connecting proteins to taxa, but this seems
> unnatural and is impractical, because the same procedure would have to
> be repeated for many other properties. Also, no application should break
> because one day we decide to provide some provenance data for something
> that previously never had any.

A good use case for provenance information.

You could represent this by explicitly reifying the observation rather than the RDF statement. For example, have an ObservationEvent class, instances of which indicate the protein, the organism, the observer, the citation etc directly. This could sit alongside rather than replace a direct link between protein and organism so that the presence or absence of such information does not change the navigation structure.

This doesn't really save anything compared to attaching properties to a reified statement but it is another option.

> By quads I meant (perhaps misusing the terminology) that when parsing
> something like
>
>   <rdf:Description rdf:about="P12345">
>     <name rdf:ID="S1">Foo</name>
>   </rdf:Description>
>
> with a statement-by-statement callback mechanism, most parsers will return:
>
>   P12345 name 'Foo'
>   S1 rdf:type rdf:Statement
>   S1 rdf:subject P12345
>   S1 rdf:predicate name
>   S1 rdf:object 'Foo'
>
> Rather than:
>
>   S1: P12345 name 'Foo'
>
> Which would be much simpler and more efficient to process, in my opinion.

The efficiency does depend on the platform.
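
To make the difference between the two forms quoted above concrete, here is a minimal sketch in Python of folding the four reification statements back into a single labelled statement. It is not Jena code and makes no claim about how any particular parser or store does this; the function name collapse_reification, the prefixed-name strings and the plain tuples are all assumptions made for illustration.

```python
# Illustrative only: triples are plain (subject, predicate, object) string
# tuples, and "rdf:..." prefixed names stand in for full URIs.
REIFICATION_SLOTS = ("rdf:subject", "rdf:predicate", "rdf:object")


def collapse_reification(triples):
    """Split triples into ordinary ones and {statement_id: (s, p, o)} records.

    A reification is collapsed only when all four statements are present
    (rdf:type rdf:Statement plus subject, predicate and object). Anything
    incomplete is handed back unchanged among the ordinary triples.
    """
    plain = []
    typed = set()   # ids declared to be an rdf:Statement
    slots = {}      # id -> {slot predicate: value}; a repeated slot overwrites

    for s, p, o in triples:
        if (p, o) == ("rdf:type", "rdf:Statement"):
            typed.add(s)
        elif p in REIFICATION_SLOTS:
            slots.setdefault(s, {})[p] = o
        else:
            plain.append((s, p, o))

    reified = {}
    for sid in typed | set(slots):
        found = slots.get(sid, {})
        if sid in typed and len(found) == 3:
            reified[sid] = tuple(found[p] for p in REIFICATION_SLOTS)
        else:
            # Incomplete quad: put the pieces back as ordinary triples.
            if sid in typed:
                plain.append((sid, "rdf:type", "rdf:Statement"))
            plain.extend((sid, p, o) for p, o in found.items())
    return plain, reified


# The five statements a statement-by-statement parser callback would deliver
# for the rdf:ID example quoted above.
parsed = [
    ("P12345", "name", "Foo"),
    ("S1", "rdf:type", "rdf:Statement"),
    ("S1", "rdf:subject", "P12345"),
    ("S1", "rdf:predicate", "name"),
    ("S1", "rdf:object", "Foo"),
]
plain, reified = collapse_reification(parsed)
```

On those five statements this leaves one ordinary triple plus one entry keyed by S1, which is roughly the compact form a store could keep instead of four separate rows.
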
In Jena we tried to get some way towards the efficiency and simplicity of the latter, single-statement form while still supporting the standard. The stores can (and in the RDB case, do) store the quad of statements compactly. The API allows you to optionally hide the reification quads so that your model isn't cluttered up with the extra statements, yet you can still use the reification API to get from a Statement to the resource representing its reification.

This is a complex part of the implementation to maintain and appears to be so little used in practice that there is a proposal to deprecate it [1]. Perhaps your experience suggests we should consider at least postponing the deprecation a little longer in case someone starts to need this functionality.

Dave

[1] http://groups.yahoo.com/group/jena-dev/message/8523
Received on Friday, 20 August 2004 15:52:01 UTC