- From: Phillip Lord <phillip.lord@newcastle.ac.uk>
- Date: Tue, 24 Mar 2009 12:10:40 +0000
- To: Oliver Ruebenacker <curoli@gmail.com>
- Cc: Michel_Dumontier <Michel_Dumontier@carleton.ca>, David Booth <david@dbooth.org>, W3C HCLSIG hcls <public-semweb-lifesci@w3.org>
Oliver Ruebenacker <curoli@gmail.com> writes:
> 2009/3/23 Michel_Dumontier <Michel_Dumontier@carleton.ca>:
>> I do not think this would be a wise "simplification". This is only a
>> simplification from one perspective: because it avoids having to mint
>> and maintain pairs of URIs instead of a single URI. But the downstream
>> cost is that it creates an ambiguity (or "URI collision")
>> http://www.w3.org/TR/webarch/#URI-collision
>> that may cause trouble and be difficult to untangle later as the data is
>> used in more and more ways. For example, if any of the same predicates
>> need to be used on both the record and the molecular entity, they will
>> become hopelessly confused. Also, if disjointness assertions are
>> included then this overloading may cause logical contradictions.
>
> Can anyone name a real world example of where confusion between an
> entity and its record was an issue?

Yes, sure. Suppose we say that all proteins have a UniProt ID (conflating
the protein with its UniProt record). Then we integrate this with DrugBank,
which represents many things, including proteins which are not in UniProt,
and which sometimes represents several proteins where UniProt has one.
Consider insulin, for instance. We now have a problem, because not all
proteins have a UniProt ID.

The flip side is that if you always say

  Protein Record --> contains knowledge about --> protein

it's much more complicated. You are making your data model more difficult
to work with all of the time, to cope with edge cases which occur only some
of the time.

There's no way around this; either way it's a compromise, and what is good
in one context may not be good in another.

Phil
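
The trade-off described above can be sketched in Python. This is only an illustration under stated assumptions: the names `Protein`, `Record`, `KnowledgeBase`, `conflated_add` and `build_example` are hypothetical, and the identifiers (`X1`, `D1`, and so on) are placeholders, not real UniProt or DrugBank entries. The first model keys proteins directly by a UniProt ID; the second keeps records and proteins apart and links them with a "contains knowledge about" relation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    """The molecular entity itself (hypothetical class)."""
    name: str


@dataclass(frozen=True)
class Record:
    """A database entry describing one or more proteins (hypothetical class)."""
    source: str
    record_id: str


def conflated_add(store, uniprot_id, name):
    """Conflated model: the UniProt ID *is* the protein's identity."""
    if uniprot_id is None:
        # A protein that is not in UniProt has no key, so it cannot be stored.
        raise ValueError(f"{name!r} has no UniProt ID")
    store[uniprot_id] = name


class KnowledgeBase:
    """Separated model: Record --contains knowledge about--> Protein."""

    def __init__(self):
        self._about = {}  # maps Record -> set of Protein

    def add(self, record, protein):
        self._about.setdefault(record, set()).add(protein)

    def proteins_for(self, record):
        return set(self._about.get(record, set()))

    def records_for(self, protein):
        return {r for r, ps in self._about.items() if protein in ps}


def build_example():
    """Build a toy knowledge base using placeholder identifiers."""
    form_a = Protein("insulin form A")   # placeholder entity
    form_b = Protein("insulin form B")   # placeholder entity
    orphan = Protein("protein absent from UniProt")

    uniprot_x1 = Record("UniProt", "X1")     # one record covering both forms
    drugbank_d1 = Record("DrugBank", "D1")
    drugbank_d2 = Record("DrugBank", "D2")
    drugbank_d3 = Record("DrugBank", "D3")

    kb = KnowledgeBase()
    kb.add(uniprot_x1, form_a)
    kb.add(uniprot_x1, form_b)
    kb.add(drugbank_d1, form_a)
    kb.add(drugbank_d2, form_b)
    kb.add(drugbank_d3, orphan)
    return kb, (form_a, form_b, orphan), (uniprot_x1, drugbank_d3)
```

The separated model copes with both edge cases (one record for several proteins, and a protein with no UniProt record), but every lookup now goes through the extra record-to-protein hop, which is the added complexity the message refers to.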
Received on Tuesday, 24 March 2009 12:11:40 UTC