RE: URI thoughts

> 1) Dereferencing: The dereferencing of a URI to a data record 
> results in the return of all the "authority managed" 
> information about it (locally curated data) in the form of a 
> RDF graph. Outside annotations would not be included unless 
> the authority provided an open annotative service. This is 
> what you get back when you query sources such as NCBI or EBI.

I am not sure what "authority" means here.  RDF itself is monotonic and
open, so anyone can say anything about anything.  In the eyes of RDF, the
only problem is model consistency; an RDF engine cannot consider one
assertion "more" correct than another.
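
To illustrate what I mean, here is a minimal sketch in Python.  It models a
graph as a plain set of (subject, predicate, object) tuples rather than using
a real RDF store, and the ex: names and both sources are made up for the
example.

```python
# Hypothetical statements; a graph is modelled as a set of triples.
authority_graph = {
    ("ex:gene2932", "ex:symbol", "GSK3B"),
}
outside_graph = {
    ("ex:gene2932", "ex:symbol", "GSK3-beta"),
    ("ex:gene2932", "ex:comment", "annotated by a third party"),
}

# Monotonic merge: adding statements never retracts earlier ones.
merged = authority_graph | outside_graph

# Both symbol assertions survive the merge; nothing in the graph itself
# marks one of them as more correct than the other.
symbols = sorted(o for s, p, o in merged if p == "ex:symbol")
```

Any notion of "authority" would have to come from outside the graph, for
example by keeping track of which source each triple came from.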

> 2) Versioning: A few useful pieces of metadata for changeable 
> (mutable) URI-referenced RDF graphs (dereferenced) is what 
> version is current, when it was assigned or created (date and 
> time, UTC), and a reference to the sorted list of all earlier 
> versions. This would allow precise rolling back to any 
> version for performing a re-analysis of info from an earlier time.

I think Dublin Core's relation element and its associated element
refinements, such as dc:replaces and dc:isReplacedBy, would handle this
adequately.
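
As a rough sketch of how a client could roll back through versions using
nothing more than dc:replaces links: the version URIs and the helper name
below are hypothetical, and the triples are plain Python tuples rather than
a real RDF store.

```python
# Hypothetical version URIs; triples are (subject, predicate, object) tuples.
DC_REPLACES = "dc:replaces"

triples = [
    ("http://example.org/record/v3", DC_REPLACES, "http://example.org/record/v2"),
    ("http://example.org/record/v2", DC_REPLACES, "http://example.org/record/v1"),
]

def version_history(current, triples):
    """Follow dc:replaces links from the current version back to the oldest."""
    replaced = {s: o for s, p, o in triples if p == DC_REPLACES}
    history = [current]
    seen = {current}
    while history[-1] in replaced:
        previous = replaced[history[-1]]
        if previous in seen:  # guard against a cyclic chain
            break
        history.append(previous)
        seen.add(previous)
    return history
```

Calling version_history("http://example.org/record/v3", triples) walks the
chain newest-first.  Creation dates would still need a separate property;
this only covers the ordering of versions.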

> 3) Signifiers: Life science data records of bio or chem 
> entities (genes, snps, protein, chemicals, agents, diseases, 
> pathways, anatomical parts) should always reference a 
> community agreed upon conceptualized bio/chem-entity, i.e., 
> to what the scientist in his or her mind commonly and 
> collectively regard when hearing "human GSK3 beta". These 
> could have ontologies layered on them when they become 
> available. These entities represent the 'signifiers or signs' 
> for the 'signified or real-world objects' such as "Hu GSK3b" 
> or " Mus MAP12"  
> (for the curious, see http://en.wikipedia.org/wiki/Sign_(semiotics),
> btw the full RDF graph around an entity would be equivalent 
> to Peirce's 'interpretant'). They would exist as non-data 
> objects, more like scientific placeholders, but can use 
> rdfs:seeAlso to point to real data records of them. Data 
> records by themselves WOULD NOT be of this special 
> meta-class. If this sounds fuzzy to you, consider what it 
> took to align most of the gene synonym names to one agreed 
> symbol; sociologically this is no different.

I couldn't agree more.  We should not mix up the data/description about a
resource with the resource itself.  This is why I have strongly opposed the
idea of using wiki URIs to represent biological entities.  Information and
non-information resources are disjoint.  Mixing them up will break the
foundation of the web and, of course, the logic of an RDF engine.
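
A minimal sketch of the kind of inconsistency I have in mind, assuming two
disjoint classes.  The ex: class names and the wiki URI are made up for
illustration, and the triples are plain tuples rather than a real RDF store.

```python
# Hypothetical class names; the two classes are assumed to be disjoint.
INFO = "ex:InformationResource"
NON_INFO = "ex:NonInformationResource"

types = [
    ("http://purl.org/hcls/bioentity/hu_gsk3b", "rdf:type", NON_INFO),
    ("http://example.org/wiki/GSK3B", "rdf:type", INFO),
    # The mix-up: the wiki page URI is also used for the entity itself.
    ("http://example.org/wiki/GSK3B", "rdf:type", NON_INFO),
]

def disjointness_violations(triples):
    """Return the URIs typed as both an information and a non-information resource."""
    info = {s for s, p, o in triples if p == "rdf:type" and o == INFO}
    non_info = {s for s, p, o in triples if p == "rdf:type" and o == NON_INFO}
    return sorted(info & non_info)
```

Here disjointness_violations(types) flags only the wiki URI, since it is the
one URI asked to stand for both the page and the thing the page describes.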

> 4)  Covering Mapping: Propose an initial set of properties to 
> support the above model. As a starter, define an equivalent 
> of rdfs:isDefinedBy for life science that would specifically 
> map an instance graph of the data record to the singular 
> conceptualized bio/chem-entity, using something on the order 
> of  hcls:isDefinedAs :
> 
> <http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? 
> db=gene&cmd=Retrieve&list_uids=2932>  <hcls:isDefinedAs>   
> <http://purl.org/hcls/bioentity/hu_gsk3b>
> 
> In line with what Chimezie proposed, rdfs:seeAlso could be 
> used to declare the inverse relation for a select set of data 
> records; not sure if any new relation is needed here.

I think such a vocabulary is needed.  But rdfs:seeAlso etc. are defined as
AnnotationProperties in OWL, so they cannot be extended any further.  A
simple property like hcls:nchientry will do, in my opinion.  As a start, I
think such properties should be very coarse-grained, because the more
general a property is, the more sharable it is.
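
For example, one coarse-grained property could link a conceptual entity to
any number of data records.  This is only a sketch: the property name is the
one I suggested above, the second record URI and the helper name are
hypothetical, and the triples are plain tuples rather than a real RDF store.

```python
# Property name as suggested above; the second record URI is made up.
ENTRY = "hcls:nchientry"
ENTITY = "http://purl.org/hcls/bioentity/hu_gsk3b"

triples = [
    (ENTITY, ENTRY,
     "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&list_uids=2932"),
    (ENTITY, ENTRY, "http://example.org/other-db/record/42"),
]

def records_for(entity, triples):
    """Collect every data record linked to an entity by the coarse property."""
    return [o for s, p, o in triples if s == entity and p == ENTRY]
```

Because the property says nothing about what kind of record it points to,
any database can reuse it; finer distinctions can be layered on later.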
 
Xiaoshu

Received on Tuesday, 20 June 2006 14:45:29 UTC