- From: Alan Rector <rector@cs.man.ac.uk>
- Date: Sat, 8 Jul 2006 20:57:15 +0200
- To: William Bug <William.Bug@DrexelMed.edu>
- Cc: "Miller, Michael D (Rosetta)" <Michael_Miller@Rosettabio.com>, Tim Clark <twclark@nmr.mgh.harvard.edu>, w3c semweb hcls <public-semweb-lifesci@w3.org>, SWAN Team <swan-team@mind-informatics.org>, Trish Whetzel <whetzel@pcbi.upenn.edu>, chris mungall <cjm@fruitfly.org>
On 6 Jul 2006, at 19:22, William Bug wrote:

> 2) Doesn't this lead down a road similar to that of MIAME, only
> now you've shifted the border of incommensurateness beyond the
> level for data format and into the semantic domain?

Yes, but put another way, you have refactored the problem of "incommensurateness" into two more tractable pieces - one about the data structures to convey meaning, the other about the meanings conveyed. You have also removed the risk of conflating the two problems thereby making both harder.

The UML/XML models are about conveying meanings; the ontologies are about the meanings conveyed. The constraints in the UML/XML models ensure that software can process the data structures correctly. Violating such a constraint means that the structure is invalid. The constraints in the ontology are about what we understand about the biology. Violating a constraint in the ontology means that the meaning is incorrect or even inconsistent.

Getting that relationship between the data structures and meanings clearly defined is a key issue for many standardisation efforts. In practice, the ontologies/terminologies/vocabularies are often maintained by different groups than the data structures/exchange formats and there are often requirements to use the same exchange format with different ontologies/terminologies and vice versa. (Analogous problems are common in the medical community.)

However, factoring the problem in this way does mean that you don't get full interoperability unless you agree on _both_ the data structures/exchange formats and the ontologies/terminologies. (Or define mappings and equivalences between them.)

> What I mean is, won't there still be difficulty determining even
> approximate semantic equivalency for all of the details of data
> provenance - many of which absolutely must be resolved in order to
> perform large-scale re-pooling of related observations made in the
> context of different studies - even if nearly identical assays/
> instruments/reagents are used?

Yes. There will always be a trade-off between the grounding cost of agreeing up front to use the same standards and the clean-up cost of resolving the differences later. You can choose whether to pay for your lunch in advance or afterwards, but there is no free lunch.

It is a choice for the community - or communities - on how wide a consensus on the various issues they can achieve.

Regards

Alan

-----------------------
Alan Rector
Professor of Medical Informatics
School of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL +44 (0) 161 275 6149/6188
FAX +44 (0) 161 275 6204
www.cs.man.ac.uk/mig
www.clinical-esciences.org
www.co-ode.org

Received on Sunday, 9 July 2006 11:10:14 UTC