Re: Ambiguous names. was: Re: URL +1, LSID -1

Eric Neumann wrote:
> Alan,
>
> the life science community has for years applied an implicit 
> transitivity to records of things, so that when many say:
>
> "http://purl.uniprot.org/uniprot/P12345 is expressed only in species 
> homo sapien"  
>
> they usually imply that "the protein referenced by 
> datarecord:http://purl.uniprot.org/uniprot/P12345 is expressed only in 
> species homo sapien"
>
> I am not arguing for or against this "short-cut", but it is what it 
> is, and certainly can be handled by adding certain logic rules for 
> dealing with datarecords and their content.
shortcuts can be fine as long as people are aware of what exactly the 
shortcut stands for. Eric's email from earlier this morning stated that 
there is no difference, which is different from agreeing on a shortcut.

that aside, even the extended version contains several shortcuts and 
implicit information.
- by the maths foundation of databases, 
"datarecord:http://purl.uniprot.org/uniprot/P12345" refers to an instance.
- by the remainder of the sentence above mentioning "homo sapiens", this 
refers to a type (leaving out discussion about the ontological status of 
"species"), thereby suggesting that "the protein" is not a particular 
protein, but protein at the class/universal level.
- then, to make the natural language sentence a coherent piece, 
"datarecord:http://purl.uniprot.org/uniprot/P12345" should contain data 
at the type-level, which it does, as the user finds out upon manual 
inspection.
now, I easily can come up with another natural language rendering and 
'implicit transitivity': "the protein referenced by 
datarecord:http://purl.uniprot.org/uniprot/P12345 is expressed only in 
Joe Soap", which changes the meaning entirely, in that it is now about 
some particular protein expressed in some individual.

in my computer scientist mode, I neither want to manually browse to 
the URL and read the record, nor do NLP on every piece of text that 
contains "http://purl.uniprot.org/uniprot/P12345" to figure this out. 
Whatever the identifier system (Scott Marshall has a nice list of 
requirements & points in today's emails on the topic), the vagaries of 
the above-suggested 'implicit transitivity' would need to be spelled out 
explicitly so that any other software wanting to use the ID system can 
recognise the records for what they (are supposed to) represent.
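
to illustrate what "spelled out explicitly" could look like, here is a 
minimal sketch in Python. Everything in it except the purl.uniprot.org 
URL is an assumption made up for illustration: the `Kind` values, the 
`urn:example:` identifier, the predicate names and the `subject_for` 
helper are hypothetical and not part of any existing ID system. The 
record declares what kind of thing it is and what it describes, so that 
software can decide whether a statement is about the record or about 
the protein without reading the record or parsing prose.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    RECORD = "data record (an instance in the database)"
    PROTEIN_CLASS = "protein at the class/universal level"
    PROTEIN_INSTANCE = "a particular protein in some individual"


@dataclass(frozen=True)
class Resource:
    identifier: str
    kind: Kind


# Only the purl.uniprot.org URL comes from the thread; the urn is made up.
RECORD = Resource("http://purl.uniprot.org/uniprot/P12345", Kind.RECORD)
PROTEIN = Resource("urn:example:protein-class:P12345", Kind.PROTEIN_CLASS)

# The 'implicit transitivity' written down as an explicit link:
# the record is about the protein class, it is not the protein itself.
DESCRIBES = {RECORD: PROTEIN}

# Hypothetical predicates, each declared for one kind of subject.
PREDICATE_DOMAIN = {
    "expressedOnlyInSpecies": Kind.PROTEIN_CLASS,
    "expressedInIndividual": Kind.PROTEIN_INSTANCE,
    "lastModified": Kind.RECORD,
}


def subject_for(resource: Resource, predicate: str) -> Resource:
    """Return the resource a statement with this predicate is about."""
    wanted = PREDICATE_DOMAIN[predicate]
    if resource.kind is wanted:
        return resource
    # Follow the explicit 'describes' link at most once.
    target = DESCRIBES.get(resource)
    if target is not None and target.kind is wanted:
        return target
    raise ValueError(f"{predicate} does not apply to {resource.identifier}")
```

with this, `subject_for(RECORD, "expressedOnlyInSpecies")` yields the 
protein class, `subject_for(RECORD, "lastModified")` yields the record 
itself, and the "Joe Soap" reading (`"expressedInIndividual"`) is 
rejected because the record is declared to describe a class, not a 
particular protein.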

best regards,
marijke

>
> Consider that it may be impossible to change the non-software part of 
> the LS community on how they think about records vs. conceptual 
> entities that exist in the real-world (non-IR).
>
> Eric
>
>
> On Jul 16, 2007, at 12:45 AM, Alan Ruttenberg wrote:
>
>>
>> On Jul 15, 2007, at 1:53 PM, Eric Jain wrote:
>>
>>>> Yes, but what sorts of statements can be made using 
>>>> http://purl.uniprot.org/uniprot/P12345 as the subject? Because it 
>>>> can mean any of the below, even the protein class itself, how can a 
>>>> *semantic web* statement be made using it?
>>>
>>> http://purl.uniprot.org/uniprot/P12345 is meant to be used for 
>>> anything that isn't tied to a specific representation, hoped that 
>>> would be clear?
>>
>> There are proteins, and there are records about proteins. Records 
>> come in different formats. If I make a statement using this url, is 
>> it about the record? or the protein? How should the agent come to know?
>>
>> -Alan
>>
>>
>>
>
>
> Eric Neumann, PhD
> Senior Strategist, Teranode Corporation
> W3C co-chair Healthcare and Life Sciences Interest Group
> MIT Fellow, Science Commons
> +1 781 856 9132
>
>

Received on Monday, 16 July 2007 14:45:30 UTC