- From: eric neumann <ekneumann@gmail.com>
- Date: Thu, 26 Mar 2009 12:35:23 -0400
- To: "John F. Madden" <john.madden@duke.edu>
- Cc: Pat Hayes <phayes@ihmc.us>, Michel_Dumontier <Michel_Dumontier@carleton.ca>, W3C HCLSIG hcls <public-semweb-lifesci@w3.org>
- Message-ID: <92e86c7d0903260935l1c6aff76nbbb214d14d349f26@mail.gmail.com>
+1

On Thu, Mar 26, 2009 at 12:31 PM, John F. Madden <john.madden@duke.edu> wrote:

> Pat et al.,
>
> It sounds like people sometimes have an irresistible itch to say that "A is
> similar to B", but this statement as such has very little semantic content.
>
> Perhaps it's not really intended as a statement that has a truth value, but
> rather as a record of somebody's feelings.
>
> The semantic web can certainly serve as a repository for recording one's
> feelings, and this might even be useful. (If I were an astrophysicist, for
> example, I might be quite interested in Stephen Hawking's intuitions about
> some problem I was working on.)
>
> So what would you say about an rdf:Property called, say,
> "http://www.example.com/intuit#similarTo" that could be used simply to post
> a record that somebody intuited a "similarity" between two things?
>
> It would have little utility for inferencing, unless one were to write a
> custom application (i.e. not OWL) to do so. But it might have utility as a
> semantic web "bookmark" for relationships that could be interesting
> candidates for future formalization.
>
> John
>
> On Mar 26, 2009, at 8:42 AM, Pat Hayes wrote:
>
>> On Mar 26, 2009, at 8:28 AM, Michel_Dumontier wrote:
>>
>>> Pursuant to my email, and in light of several other comments, if our
>>> goal is to now rectify what Uniprot:Protein _actually_ means in our
>>> domain, and how it can be semantically mapped to other bio-ontologies,
>>> then I might also suggest that instances of Uniprot:Protein are
>>> aggregates of proteins (err... :ProteinAggregate anyone?), possibly
>>> separated by both space and time, having a similar (base sequence +
>>> mutations / ptms) composition, sharing certain characteristics (e.g.
>>> functionality, domains) and observed to participate in biological
>>> processes. Clearly not a type of protein of the single molecule form,
>>> but again, certainly not a Record.
>>
>> Indeed. If I might make a suggestion, rather than talking about
>> 'aggregates' (which sounds disturbingly, er, philosophical), why not just
>> say that the entity being identified is a _substance_. Substances are 'kinds
>> of stuff' that include mixtures (e.g. concrete is a kind of stuff comprising a
>> mix of sand, crushed rock, cement and water in several possible proportions)
>> but also 'pure' stuffs such as water. Note the distinction between a
>> substance and a piece of the substance (concrete, the building material, vs.
>> this or that lump of concrete) or a mereological sum (your 'aggregate', I
>> think) of such pieces (all the concrete in America). The utility of this is
>> that it eliminates the discussions about molecules, which I think are getting
>> in the way of clarity here. Regarding sameAs, being the same substance is a
>> very strict kind of sameAs, of course, but it really does only refer to
>> substances, which is a step in the right direction. Each protein is a
>> substance. It might turn out that one protein is a mixture of others, for
>> example: this is fine, nothing breaks, as long as nobody says the mixture is
>> sameAs one of its components. And now one can have notions such as 'purified
>> form of' or 'isotopic version of' between substances, which might help to
>> make all these distinctions that you chemists need to be concerned with.
>>
>> Distinctions like object/substance/piece/mixture were worked out by
>> ontologists over 20 years ago, by the way. None of this is rocket science.
>>
>> Pat
>>
>>> -=Michel=-
>>>
>>>> If however, what we've been talking about is that identifiers like
>>>> http://purl.uniprot.org/uniprot/Q16665
>>>>
>>>> are actually database records, and not molecular entities, then we can
>>>> settle this quickly:
>>>>
>>>> Uniprot RDF file: http://www.uniprot.org/uniprot/Q16665.rdf
>>>> (is this what people were referring to as a Record???)
>>>>
>>>> Contains:
>>>>
>>>> <rdf:Description rdf:about="http://purl.uniprot.org/uniprot/Q16665">
>>>> <rdf:type rdf:resource="http://purl.uniprot.org/core/Protein" />
>>>>
>>>> It's clear that the entity denoted by :Q16665 is rdf:type :Protein and
>>>> is the subject of statements that are biological in nature, such as
>>>> being located in sub-cellular compartments or being involved in
>>>> biochemical reactions. It is clearly not a Record. This is generally
>>>> the case for nearly all entries in biomolecular databases.
>>>>
>>>> Cheers,
>>>>
>>>> -=Michel=-
>>>>
>>>> Anxiously waiting to see if this clears up things or generates
>>>> controversy .. it's hard to predict!
>>>>
>>>>> If nobody ever wants to use the same property to talk about the
>>>>> database record as was used to talk about the molecule, and nobody
>>>>> ever makes an assertion that implies that the class of database
>>>>> records is disjoint from the class of molecules, then I don't see any
>>>>> harm in using the same URI to ambiguously denote both. But if one is
>>>>> trying to design data to be reusable by others in unforeseen ways,
>>>>> there clearly *is* a risk that someone will want to make such
>>>>> assertions in conjunction with the data, and if that happens there is
>>>>> a clear harm. This risk is easy to avoid by using separate URIs.
>>>>>
>>>>> There *are* trade-offs. Minting two URIs instead of one *does* add
>>>>> some complexity, though as I pointed out that additional complexity
>>>>> can be mitigated to the point that it is a *very* low cost. Still,
>>>>> different people will weigh these trade-offs differently, and what's
>>>>> best for one situation may not be best for another, as I indicated in
>>>>> my original post.
>>>>>
>>>>> Furthermore, even if one does use the same URI to ambiguously denote
>>>>> both a database record and a molecule, that is not the end of the
>>>>> world either. It is possible (though more difficult) to later separate
>>>>> out and relate the different senses of an ambiguous URI, as I have
>>>>> described:
>>>>> http://dbooth.org/2007/splitting/
>>>>> Ambiguity is inescapable, and ambiguity between a thing and a page
>>>>> that describes that thing is not fundamentally different from other
>>>>> kinds of ambiguity (except perhaps that we are aware of it in advance
>>>>> and it can be easily avoided), as explained here:
>>>>> http://dbooth.org/2007/splitting/#httpRange-14
>>>>>
>>>>> Finally, although it is flattering that you have named this suggestion
>>>>> after me, I cannot take credit. As I pointed out in my original post,
>>>>> the suggestion to differentiate between a molecule and the database
>>>>> record that describes that molecule originates with the Architecture
>>>>> of the World Wide Web:
>>>>> http://www.w3.org/TR/webarch/#URI-collision
>>>>> and best practices for implementing this distinction are described in
>>>>> Cool URIs for the Semantic Web:
>>>>> http://www.w3.org/TR/cooluris
>>>>>
>>>>> David Booth
>>
>> ------------------------------------------------------------
>> IHMC                    (850)434 8903 or (650)494 3973
>> 40 South Alcaniz St.    (850)202 4416   office
>> Pensacola               (850)202 4440   fax
>> FL 32502                (850)291 0667   mobile
>> phayesAT-SIGNihmc.us    http://www.ihmc.us/users/phayes
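
[Editorial illustration] David Booth's point in the quoted thread, that one ambiguous URI is harmless until someone asserts that records and molecules are disjoint, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not an OWL reasoner: triples are plain Python tuples, only direct rdf:type statements are checked, and the class http://example.org/Record and the function name disjointness_clashes are hypothetical, introduced here for illustration. Treating the .rdf file URI as the record is also an assumption; the thread itself leaves that as a question.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_DISJOINT = "http://www.w3.org/2002/07/owl#disjointWith"

PROTEIN = "http://purl.uniprot.org/core/Protein"  # class named in the thread
RECORD = "http://example.org/Record"              # hypothetical class (assumption)

THING = "http://purl.uniprot.org/uniprot/Q16665"   # URI typed as Protein in the thread
DOC = "http://www.uniprot.org/uniprot/Q16665.rdf"  # RDF file, assumed here to be the record


def disjointness_clashes(triples):
    """Return the sorted subjects typed with two classes declared disjoint."""
    # Collect the directly asserted classes of every subject.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    # Collect every pair of classes asserted to be disjoint.
    disjoint = {(s, o) for s, p, o in triples if p == OWL_DISJOINT}
    clashes = set()
    for subject, classes in types.items():
        for a, b in disjoint:
            if a in classes and b in classes:
                clashes.add(subject)
    return sorted(clashes)


# A third party's axiom: nothing is both a protein and a record.
axiom = (PROTEIN, OWL_DISJOINT, RECORD)

# One URI used ambiguously for both the molecule and its record.
ambiguous = [(THING, RDF_TYPE, PROTEIN), (THING, RDF_TYPE, RECORD), axiom]

# Separate URIs for the molecule and the record.
separate = [(THING, RDF_TYPE, PROTEIN), (DOC, RDF_TYPE, RECORD), axiom]

print(disjointness_clashes(ambiguous))  # the shared URI is reported as a clash
print(disjointness_clashes(separate))   # no clash is reported
```

With the ambiguous data the shared URI is flagged as soon as the disjointness axiom is added; with separate URIs the same axiom causes no clash, which is the low-cost avoidance the thread describes.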
Received on Thursday, 26 March 2009 16:36:05 UTC