- From: Alan Ruttenberg <alanruttenberg@gmail.com>
- Date: Fri, 20 Jul 2007 01:15:43 -0400
- To: Eric Jain <Eric.Jain@isb-sib.ch>
- Cc: Chris Mungall <cjm@fruitfly.org>, Bijan Parsia <bparsia@cs.man.ac.uk>, public-semweb-lifesci hcls <public-semweb-lifesci@w3.org>, Darren Natale <dan5@georgetown.edu>
On Jul 19, 2007, at 4:16 AM, Eric Jain wrote:

> Alan Ruttenberg wrote:
>> In that case, I would recommend that it is unwise to use Uniprot
>> ids as identifiers of protein classes on the semantic web. Doing
>> so would encourage exactly the kind of ambiguity that we need to
>> avoid in order to write statements that will not confuse semantic
>> web agents (including people).
>
> The question you need to ask yourself here is whether there really
> are such things as specific proteins, or if this is always just a
> useful abstraction (and often a fuzzy one at that, if it wants to
> make sense for biologists).

There's something odd about this statement, so let me try to rephrase it in a way that hopefully makes clearer how I am thinking.

I consider a specific protein to be an instance of a molecule - some very tiny piece of stuff composed of a bunch of atoms bound together. So yes, I really believe that there are such things as specific proteins. Then there are protein classes, each of which identifies some set of those instances. Those protein classes can be defined in a variety of ways. Some of those ways will be such that a protein might be an instance of more than one of these classes.

When you say "specific proteins", I think you are actually asking whether there is one "true" disjoint and covering set of classes into which each protein can be placed. My answer to that question is that I don't know, and I'm not sure whether I care. What I really care about is being able to specify what sets of things I am making general statements about, having a way to evaluate whether or not *I* believe those statements to be true or consistent with other statements, and then encoding them in such a way that my computer can help me work with a large number of such statements to make progress on some scientific problem.

> UniProt has a different idea on what exactly the protein-related
> entities are than e.g. EMBL_CDS, and others have different ideas,
> too. Even if you came up with your own protein database that is
> more suitable for Semantic Web applications because it has better
> explicit definitions than UniProt manages to have at the moment, I
> could argue that what you have in the end are nothing but "records",
> too...

You could argue that, but I'm not sure that it would be very illuminating. The difference between the Uniprot records and the records that I want to use is that they are used by different sorts of computer programs. With the records I want, the computer can evaluate the contents of the record in such a way as to check consistency, compute entailments, etc. With the Uniprot records it cannot.

If I have the source code for some program, I can certainly say that it can be considered a string. But calling it a string doesn't capture the fact that it can be interpreted in a certain way to control a computation, and I would give that entity a different name and type than those I would give to a string that was not interpretable in that way.

Everything we manipulate on computers is in some way a record or string of bits. However, saying that doesn't really capture what we need in order to understand the consequences of what we and the computers are doing. When I digitally sign a contract and later breach it, I can be sued. Do you not think that the something associated with those particular bits is a different sort of thing than the something associated with the bits behind http://www.miniclip.com/games/sling/en/ ?

-Alan
Received on Friday, 20 July 2007 05:16:14 UTC