RE: Chemistry and the Semantic Web from Eric.Neumann@aventis.com on 2004-06-30 (public-semweb-lifesci@w3.org from June 2004)

From: <Eric.Neumann@aventis.com>
Date: Wed, 30 Jun 2004 14:45:09 -0400
To: <Eric.Jain@isb-sib.ch>
Cc: <public-semweb-lifesci@w3.org>
Message-ID: <81BE96441EAC744B8F6E1E738C53F9589E14BB@sccsmxsusr05.pharma.aventis.com>

Eric,

From my understanding of XPaths (http://www.w3.org/TR/xpath#section-Introduction), they can be used "within" URIs. So mapping an RDF statement to a specific Chem-XML node or group should be doable. Am I misinformed, or is this part of the standard W3C model? Of course, I'm not sure how this would look appended to an LSID URN:
"urn:lsid:www.cas.org:CA:33069-62-4#feature=21" ???
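
To make the question concrete, here is a minimal Python sketch of how a client might split such an identifier into its LSID parts and a trailing fragment. It assumes the part after "#" is treated as an opaque fragment; whether LSID resolvers actually accept a fragment is exactly the open question, and the function name split_lsid is made up for illustration.

```python
def split_lsid(identifier):
    """Split 'urn:lsid:authority:namespace:object[#fragment]' into its parts.

    Assumption: everything after the first '#' is an opaque fragment.
    """
    base, _, fragment = identifier.partition("#")
    parts = base.split(":")
    if len(parts) < 5 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("not an LSID URN: " + identifier)
    return {
        "authority": parts[2],
        "namespace": parts[3],
        "object": parts[4],
        # An optional revision may follow the object identifier.
        "revision": parts[5] if len(parts) > 5 else None,
        "fragment": fragment or None,
    }


info = split_lsid("urn:lsid:www.cas.org:CA:33069-62-4#feature=21")
print(info["authority"], info["object"], info["fragment"])
```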

Eric

Eric Neumann, Ph.D.

Global Head of Knowledge Management
Aventis - DI&A
Tel:   908-231-3510
Fax:  908-231-3307
Eric.Neumann@Aventis.com

-----Original Message-----
From: Eric Jain [mailto:Eric.Jain@isb-sib.ch]
Sent: Wednesday, June 30, 2004 3:13 AM
To: Neumann, Eric PH/US
Cc: public-semweb-lifesci@w3.org
Subject: Re: Chemistry and the Semantic Web

Eric.Neumann@aventis.com wrote:
> Question: How would one apply RDF for such cases? Would one use CML 
> (chemical markup language) to describe the chemical structure and have 
> an RDF statement refer to part of that doc via XPath/XPointers? How 
> about other structural formats like SMILES and CHUCKLES? Would the 
> documents be referenced using an LSID mechanism? Could this become the 
> basis for allowing research findings around chemistry and assays to 
> become consolidated as part of an R&D knowledge base?

This is an interesting question, and certainly also relevant to any 
classical bioinformatics data sources that contain more quantitative 
than qualitative data (e.g. 3D structures, 2D gel images and microarray 
data). I don't really have any solutions, just some ideas:

In those cases where it is possible to embed identifiers in the data, 
these could be referenced with identifiers such as 
urn:lsid:foo.org:bar:10. A resolution server can then be set up to 
extract the referenced data when required. Note that the original format 
need not contain full LSIDs.

If embedding identifiers is not an option, you could keep an 
LSID-to-XPath mapping on the resolution server. Using XPath statements 
directly as resource identifiers doesn't seem practical, though I may be 
wrong.
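
As a rough illustration of the LSID-to-XPath mapping idea, the sketch below keeps a small lookup table on the "server" side and uses it to pull a node out of an XML document. The document, its element names, the LSIDs and the resolve function are all invented for the example (this is not real CML), and it relies on the limited XPath subset supported by Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical CML-like document; element and attribute names are illustrative.
CML_DOC = """<molecule id="m1">
  <atomArray>
    <atom id="a1" elementType="C"/>
    <atom id="a2" elementType="O"/>
  </atomArray>
</molecule>"""

# Hypothetical table kept on the resolution server: LSID -> XPath expression.
LSID_TO_XPATH = {
    "urn:lsid:foo.org:bar:10": ".//atom[@id='a1']",
    "urn:lsid:foo.org:bar:11": ".//atom[@id='a2']",
}


def resolve(lsid, xml_text=CML_DOC):
    """Return the XML element the LSID maps to, or None if the LSID is unknown."""
    xpath = LSID_TO_XPATH.get(lsid)
    if xpath is None:
        return None
    root = ET.fromstring(xml_text)
    return root.find(xpath)


node = resolve("urn:lsid:foo.org:bar:10")
print(node.tag, node.attrib)
```

The source file never needs to contain the LSIDs themselves; only the table has to be updated if the document layout changes.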

In any case, you may end up duplicating parts of the non-RDF data into 
RDF, as RDF-aware applications (e.g. inference engines) are usually not 
able to make direct use of anything that is not RDF...
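
A small sketch of what such duplication might look like: selected attributes of the XML data are copied into N-Triples statements that an RDF tool could load. The vocabulary URI, the LSID prefix, the document and the function name are assumptions made up for the example, not an existing schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical CML-like document; element and attribute names are illustrative.
CML_DOC = """<molecule id="m1">
  <atomArray>
    <atom id="a1" elementType="C"/>
    <atom id="a2" elementType="O"/>
  </atomArray>
</molecule>"""

EX = "http://example.org/chem#"  # hypothetical vocabulary namespace


def atoms_to_ntriples(xml_text, lsid_prefix="urn:lsid:foo.org:bar:"):
    """Copy each atom's element type into one N-Triples statement."""
    root = ET.fromstring(xml_text)
    lines = []
    for atom in root.iter("atom"):
        subject = "<" + lsid_prefix + atom.get("id") + ">"
        predicate = "<" + EX + "elementType>"
        literal = '"' + atom.get("elementType") + '"'
        lines.append(subject + " " + predicate + " " + literal + " .")
    return lines


for line in atoms_to_ntriples(CML_DOC):
    print(line)
```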

Received on Wednesday, 30 June 2004 14:56:24 UTC