- From: eric neumann <ekneumann@gmail.com>
- Date: Thu, 26 Mar 2009 12:28:52 -0400
- To: Pat Hayes <phayes@ihmc.us>
- Cc: Bijan Parsia <bparsia@cs.manchester.ac.uk>, W3C HCLSIG hcls <public-semweb-lifesci@w3.org>
- Message-ID: <92e86c7d0903260928v93bc52dv1903aa8c166c72f3@mail.gmail.com>
Pat,

Basically I'm in agreement with all of your points, but I need to correct some misinterpretations you made of my comments...

On Thu, Mar 26, 2009 at 3:13 AM, Pat Hayes <phayes@ihmc.us> wrote:

> On Mar 25, 2009, at 5:27 PM, eric neumann wrote:
>
> Several different issues here.
>
> On Wed, Mar 25, 2009 at 5:47 PM, Bijan Parsia <bparsia@cs.manchester.ac.uk> wrote:
>
>> Eric,
>>
>> Thanks for the use case!
>>
>> On 25 Mar 2009, at 21:31, eric neumann wrote:
>>
> <snip>
>
>>> This is the kind of "similar" used in most internal genomic/compound
>>> systems...
>>>
>>> <http://myOrg.com/sw/mxid/PHLP0005> :isIdentifiedwith <http://www.uniprot.org/uniprot/P16233>
>>>
>>
>> Can you explicate this a bit more for me? I.e., could you present what you
>> expect this to do or not do?
>>
>
> Certainly... I want to look up what myOrg knows about a uniprot protein, but
> since they do their own internal data-keeping on things like "druggability"
> which aren't included (yet) in uniprot, I need to make sure my extra data is
> mapped to the public protein object.
>
> Does this help you?
>
>
> It doesn't help me. We need to have a 'semantic' answer. What kinds of
> thing are being talked about here? What do the URIs refer to? (Records or
> chemicals?) Because the use of sameAs depends on the answer to this question
> very crucially.
>

In my company I have a ProteinDictionary table populated with all 'known human proteins' (this is the conceptual part that is easy for all biologists, but is causing some confusion in the thread); each entry is identified (sameAs?) with a protein in Uniprot (as well as a protein in NCBI-Entrez).

In ProteinDictionary I include a lot of additional data (not found in Uniprot) on what antibodies exist for that protein (structure).

Therefore, the records "refer" to the same protein, but do not have identical properties. My company has more knowledge about the protein, but it is not common to everyone; a case of Open World assumptions...
Is that clearer?

> (Of course, in a SW world this could have all been done with internal
> triples added to the uniprot URI locally...)
>
>>> It really isn't probabilistic anymore since the scientists have all
>>> agreed and defined their entry based on some of the info from the public
>>> entity; for most situations it is an 'exact mapping' to the referred
>>> molecules.
>>>
>>
>> Is it that most, but not all of the time, you can treat it as sameAs but
>> sometimes you don't want to?
>
>
> Well, the question we ask of experts like you is: should we or should we
> not use owl:sameAs for exact mappings to entities with different records?
>
>
> If your URIs are referring to the entities, then use sameAs when you are
> sure you are talking about the same entity, no matter what your records say
> about it. If they are referring to the records, then I would guess that
> sameAs would be true only when two URIs resolve to the same resource using
> GET.
>

If we all agree we are referring to the protein in question, the Uniprot and Entrez URIs may still have different (hopefully consistent, up to open-world assumptions) information.

>>> I agree owl:sameAs was not intended for this kind of relation, but it is
>>> extremely common, and a specialized relation for this would be very much
>>> desired. : )
>>>
>>
>> We need to make me understand the relation :)
>
>
> There are other "identity" or "similar" relations
>
>
> Braaagh! Semantic alarm! Identity is NOT similarity. Identity really does
> mean being EXACTLY the same thing. If A similarTo B, then we are talking
> about two things which are similar. If A sameAs B, then we are talking about
> ONE THING which happens to have two names.
>
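
The distinction Pat draws here (sameAs means one thing with two names, similarTo means two things) can be sketched in a few lines of Python. This is only an illustrative toy, not an OWL reasoner: the class names `IdentityStore` and `SimilarityStore`, their methods, and any property values used with them are assumptions made up for this sketch. Identity is modelled as an equivalence relation, so every fact asserted under any name applies to the one underlying thing; similarity is stored as symmetric pairs with no transitive closure.

```python
from collections import defaultdict


class IdentityStore:
    """Toy fact store (hypothetical): same_as declares two names for ONE thing."""

    def __init__(self):
        self._parent = {}               # union-find forest over URIs
        self._facts = defaultdict(set)  # URI -> {(property, value)} as asserted

    def _find(self, uri):
        # Return the representative name of the identity class of `uri`.
        self._parent.setdefault(uri, uri)
        while self._parent[uri] != uri:
            self._parent[uri] = self._parent[self._parent[uri]]  # path halving
            uri = self._parent[uri]
        return uri

    def add_fact(self, uri, prop, value):
        self._find(uri)  # register the name
        self._facts[uri].add((prop, value))

    def same_as(self, a, b):
        # Merge the identity classes: reflexive, symmetric and transitive.
        self._parent[self._find(a)] = self._find(b)

    def is_same(self, a, b):
        return self._find(a) == self._find(b)

    def facts_about(self, uri):
        # Everything said under ANY name of the thing applies to the thing.
        root = self._find(uri)
        merged = set()
        for name, facts in self._facts.items():
            if self._find(name) == root:
                merged |= facts
        return merged


class SimilarityStore:
    """Toy store (hypothetical): similar_to relates TWO distinct things."""

    def __init__(self):
        self._pairs = set()

    def similar_to(self, a, b):
        self._pairs.add(frozenset((a, b)))  # symmetric, deliberately not transitive

    def is_similar(self, a, b):
        return frozenset((a, b)) in self._pairs
```

Under this sketch, declaring the myOrg URI `same_as` the Uniprot URI makes the internal "druggability" data and the public record's data visible under either name, which is the open-world merging discussed above. Declaring homologs `similar_to` each other merges nothing, and A similar to B plus B similar to C does not yield A similar to C.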
I did not intend to equate "identity" and "similar"; they usually come up as a bundle in chem and bio discussions like: Does Person A have this exact sequence variant V for Gene G, or something similar but distinct, or is their gene allele completely rearranged (radically altered)?

> in mol biology:
>
> - homolog (symmetric) ; similar function in different species
> - paralog (symmetric, sub-property of homolog ) ; similar origin
>   duplication in same species
> - ortholog (symmetric; sub-property of homolog) ; similar function in
>   different species
>
>
> None of these are identity.
>

Agreed; my intent was to show forms of similarity, not identity.

> (also Ohnology and Xenology, see
> http://en.wikipedia.org/wiki/Homology_(biology))
> - variant of (a non-subsumptive form of specialization within genes)
> - modified form of (a non-subsumptive form of specialization for protein
>   gene products), includes splice variants (see
>   http://www.affymetrix.com/community/publications/affymetrix/tmsplice/index.affx)
> - similar chem structures (symmetric for compounds)
>
>
> None of these are identity.
>

Again, agreed...

> One way to use identity here is to try to map the original things to a
> 'sort' or 'similarity class' or 'similarity type' or <choose your own
> buzzword>, and then use identity reasoning on these 'types'. So [ A
> similar-to B] is glossed as [(similarity-type A) sameAs (similarity-type B)]
> but this only takes you so far: you still get transitivity, for example, so
> notions like 'very close' don't work this way. Still, it might be one way to
> approach the issue.
>
>
> ... I'm sure there are dozens more.
>
>>> Remember also, even though these URIs may be of instances in terms of
>>> records,
>>>
>>
>> instances of what?
>
>
> For a "collective grouping" of similar instances of (physical) molecules...
> d-glucose is 'a' specific molecular structure, but there are over 10^22
> glucose molecules in a teaspoon of dextrose sweetener....
> Not the usual OWL
> concept of "instance of class Molecule" is it?
>
>
> This is just a basic ontology issue. You need to distinguish a particular
> molecule from a molecular 'pattern' from a class of isomers, etc., but you
> can't expect OWL to do all this kind of work for you automatically.
>

Certainly, but how best should we apply OWL so that this can be well represented? Dare we promote meta-classing at this point? I'd rather use OWL to accurately represent "a Molecule Class means this..., and an instance means that..." whether it's structure patterns, property groupings, or mind-conceptual objects ("I can create a specific and novel chemical with this structure and these properties").

If this discussion is beginning to settle onto a commonly agreed set of principles, I'd like to suggest we capture it and circulate it for comment, perhaps through HCLS.

cheers,
-Eric

> Defining 'glucose' as a Class just pushes the definition of Molecule up to
> become more akin to a meta-Class...
>
>
> Right, exactly. Classes weren't meant to carry this kind of conceptual
> load. You will just have to do some real ontologizing, my friend :-)
>
>>> the molecule referenced is not really "a specific single molecule" found
>>> in nature (conceptually possible, but never thought of this way in my
>>> experience). In fact, this is almost always the case in molecular biology
>>> (genes, genomes, SNPs, proteins, etc), while when dealing with macro-humans,
>>> we can refer to an exact instance in the real world.
>>>
>>
>> We cannot?
>
>
> No one in pharma is interested in mapping URIs to an individual exact,
> physical molecule; IP is always around the chemical structure (which IS
> unique) rather than the molecule.
>
>
> Good: you have a clear ontology and a clear identity criterion for sameAs.
> You are talking about chemical structures.
> I'd suggest, if you really want
> to talk about molecules, having properties has_chemical_structure (domain:
> molecule; range: chemstruct) and is_a_molecule_of as its inverse. Don't use
> the class structure for Avogadro.
>
> Pat Hayes
>
>>> Perhaps we really need a set of basic relations (and meta classing?) for
>>> this scale of scientific phenomena to keep it distinct from organism
>>> examples in clinical studies and experiments...
>>>
>>
>> I suspect there's more weight on "exemplar" than I know how to give at the
>> moment :)
>
>
> Well, try keeping a URI tracking a single molecule-- there's no business
> value in that! ; )
>
> Eric
>
>>
>> Cheers,
>> Bijan.
>>
>
> ------------------------------------------------------------
> IHMC                      (850)434 8903 or (650)494 3973
> 40 South Alcaniz St.      (850)202 4416   office
> Pensacola                 (850)202 4440   fax
> FL 32502                  (850)291 0667   mobile
> phayesAT-SIGNihmc.us      http://www.ihmc.us/users/phayes
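
Pat's closing suggestion (a has_chemical_structure property from molecules to chemical structures, plus its inverse, rather than one class per compound) can be sketched as follows. This is a minimal illustration under stated assumptions, not a rendering of any agreed HCLS model: the names `ChemStruct`, `Molecule`, `MoleculeRegistry` and `molecules_of` are hypothetical, with `molecules_of` standing in for the inverse property Pat calls is_a_molecule_of. Structures compare by value, which gives them the clear identity criterion needed for sameAs, while each physical molecule remains its own individual.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChemStruct:
    """A chemical structure; value equality is its identity criterion."""
    name: str


@dataclass(eq=False)
class Molecule:
    """One physical molecule; eq=False means two molecules are never equal
    merely because they share a structure."""
    has_chemical_structure: ChemStruct


class MoleculeRegistry:
    """Keeps the inverse of has_chemical_structure up to date (hypothetical helper)."""

    def __init__(self):
        self._by_structure = defaultdict(list)

    def add_molecule(self, structure):
        molecule = Molecule(structure)
        self._by_structure[structure].append(molecule)
        return molecule

    def molecules_of(self, structure):
        # Inverse lookup: all registered molecules having this structure.
        return list(self._by_structure.get(structure, []))
```

With this shape, two glucose molecules are distinct individuals that point at one and the same structure, so the structure (not the teaspoon of molecules) is what gets a URI and takes part in identity reasoning.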
Received on Thursday, 26 March 2009 16:29:30 UTC