- From: <Eric.Neumann@aventis.com>
- Date: Mon, 9 Aug 2004 11:07:55 -0400
- To: <public-semweb-lifesci@w3.org>
Cross-posting this message...

-----Original Message-----
From: biopax-discuss-bounces@biopax.org [mailto:biopax-discuss-bounces@biopax.org] On Behalf Of Eric.Neumann@aventis.com
Sent: Monday, August 09, 2004 11:03 AM
To: pm286@cam.ac.uk; biopax-discuss@biopax.org
Subject: RE: [BioPAX-discuss] RE: xml schema for BioPAX

This is a great discussion, and it may have an impact on other areas of life science data exchange. I think the relation between OWL/RDF and XML Schema is a critical one for us all to better comprehend.

The key question, I believe, is around the need for expressivity: how many descriptors and relations do I need to use to describe just one, context-specific pathway in mouse, say WNT4? How do I make sure I get my information (and points of view) across to another researcher?

If I can regularly pack the data/info into something similar to a microarray data set, then why not define an xml schema and use vanilla xml? If I need to add layers of relations to pathway elements, regulators, modulators, conditions, then I'd rather use RDF for the instance data. It parses just fine into triples with not too much depth.

FWIW, I have found using RDF (graph) for data instances way easier than the syntax restrictions (vs. tree) within an xml-schema. For those not comfortable "processing" RDF (don't base your opinion on trying to read RDF by eye), I suggest trying out JENA or CWM to see what is possible in this space.

Quoting a friend from the Whitehead, "Once you've experienced XML hell, you'll understand".

Eric

-----Original Message-----
From: biopax-discuss-bounces@biopax.org [mailto:biopax-discuss-bounces@biopax.org] On Behalf Of Peter Murray-Rust
Sent: Monday, August 09, 2004 7:37 AM
To: biopax-discuss@biopax.org
Subject: RE: [BioPAX-discuss] RE: xml schema for BioPAX

At 17:06 06/08/2004 -0400, Gary Bader wrote:
>Hi Chris,
>        That is correct. There is no XML Schema for BioPAX, only an OWL
>definition.
>Both OWL and XML Schema are XML standards for representing
>information recommended by the W3C. The main difference between XML Schema
>and OWL is that OWL allows definition of a class hierarchy, where XML Schema
>does not. OWL has some other unique features as well compared to XML Schema
>(e.g. ability to say that one class is disjoint from another), but BioPAX
>does not make use of those. This means that XML Schema tools, like Castor
>and JAXB will not work with OWL, but the Jena library replicates much of
>this functionality, just in a different manner.
>        The choice of using OWL was decided by a vote in the core group
>early on in BioPAX discussions.
>
>Best,
>Gary

CML (Chemical Markup Language) is part of the BioPAX system and is firmly based on XSD Schema. I don't see XSD and OWL as being exclusive and hope that they will interoperate. Indeed I am keen to see how RDF/OWL might be "layered" on CML - there is a lot of validation that cannot be provided by Schemas.

CML represents a set of (hopefully) well-understood information objects for which much semantics depends on algorithms. Thus to calculate the frequencies of a transition state a matrix needs to be inverted and it is more practical to map this onto Java classes. We have developed about 100 schema elements (not all are required by BioPAX) and these are transformed algorithmically into Java (we actually wrote our own, rather than using JAXB, Castor, etc. as we also have to generate FORTRAN, Python and C++). The functionality of a schema is mainly get and set, so we have also handcrafted a set of Tools which wrap the schema objects and provide a large set of chemical functions.
An example (paraphrased) might be:

    MoleculeTool mt = new MoleculeTool(molecule);
    AtomSetTool[] rings = mt.getRingNuclei();

(These tools are available as Open Source - http://wwmm.ch.cam.ac.uk/moin)

Note - CML now includes CMLReact, which has been extensively tested on enzyme reactions (by Gemma Holliday) and which may be of interest if BioPAX wants to hold details of reactants, mechanisms, transition states, etc.

However there are many cases where it would be useful to reason. Examples can be:

"the formula deduced from the connection table should be consistent with that reported by the depositor"
"The mass and charge difference in a reaction should be zero"

It looks attractive to model these by OWL, but it may need to use primitives to call CML algorithmic functionality. Does this look a useful and practical approach? Perhaps RDF can be used to locate resources which apply these functions.

P.

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069

_______________________________________________
BioPAX-discuss mailing list
BioPAX-discuss@biopax.org
http://www.biopax.org/mailman/listinfo/biopax-discuss
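
For illustration only, the second check quoted in the message above ("the mass and charge difference in a reaction should be zero") could be sketched in plain Java roughly as follows. This is a minimal sketch under stated assumptions, not CML or CMLReact code: the `ReactionBalance` class, the `Species` type, and the tolerance parameter are hypothetical names introduced here, and the masses are approximate values chosen for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of CML: checks that mass and charge
// are conserved between the two sides of a reaction.
class ReactionBalance {

    // Hypothetical value type: one reactant or product with its
    // molecular mass and formal charge.
    static final class Species {
        final String name;
        final double mass;
        final int charge;

        Species(String name, double mass, int charge) {
            this.name = name;
            this.mass = mass;
            this.charge = charge;
        }
    }

    // Sum of the masses on one side of the reaction.
    static double totalMass(List<Species> side) {
        double sum = 0.0;
        for (Species s : side) {
            sum += s.mass;
        }
        return sum;
    }

    // Sum of the formal charges on one side of the reaction.
    static int totalCharge(List<Species> side) {
        int sum = 0;
        for (Species s : side) {
            sum += s.charge;
        }
        return sum;
    }

    // True when the mass difference is within the tolerance and the
    // charge difference is exactly zero.
    static boolean isBalanced(List<Species> reactants,
                              List<Species> products,
                              double massTolerance) {
        double massDiff = totalMass(reactants) - totalMass(products);
        int chargeDiff = totalCharge(reactants) - totalCharge(products);
        return Math.abs(massDiff) <= massTolerance && chargeDiff == 0;
    }

    public static void main(String[] args) {
        // H+ + OH- -> H2O, with approximate masses.
        List<Species> reactants = Arrays.asList(
                new Species("H+", 1.008, 1),
                new Species("OH-", 17.007, -1));
        List<Species> products = Arrays.asList(
                new Species("H2O", 18.015, 0));
        System.out.println(isBalanced(reactants, products, 1e-6));
    }
}
```

A check like this lives naturally in algorithmic code; the open question raised in the message is how an OWL or RDF layer would refer to it or invoke it.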
Received on Monday, 9 August 2004 15:08:32 UTC