Re: My task from last week: Semantic free identifiers from Andrea Splendiani on 2011-06-20 (public-semweb-lifesci@w3.org from June 2011)

From: Andrea Splendiani <andrea.splendiani@bbsrc.ac.uk>
Date: Mon, 20 Jun 2011 19:08:43 +0000
To: "Vagnoni,Matthew M" <MMVagnoni@mdanderson.org>
Cc: 'James Malone' <malone@ebi.ac.uk>, HCLS <public-semweb-lifesci@w3.org>
Message-ID: <339406500866319@jngomktg.net>

Hi,

Sorry to jump into this thread like this...

To be honest, I'm somewhat concerned by the insistence on semantically opaque
identifiers. I understand the reason for them, but I think they clash a bit
with other considerations:
- RDF pays a lot in verbosity to be 'readable'. Opaque identifiers make it
de facto unreadable.
- while something that ends in 'partOf' hints at a partOf kind of
relation, the full ID is 'prefix/partOf'. So there is not that much
semantic commitment after all.
- if I look at a successful resource like Wikipedia, all its URIs seem to be
in the clear (the same goes for DBpedia... although that may be a step too
far. Did I see classes 'in' URIs?)
- in a continuum between the web and the semantic web, perhaps IDs are not
only intended to be 'understood' by machines.
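
As a rough illustration of the second point, here is a minimal sketch
(the namespace, the opaque ID, and the label table are all hypothetical) of
how a readable local name and an opaque one expand to full URIs that differ
only in their last segment, and how the opaque one needs an extra label
lookup before a person can read it:

```python
# Hypothetical namespace and identifiers, for illustration only.
PREFIX = "http://example.org/onto/"


def expand(local_name: str) -> str:
    """Expand a local name to a full URI under the assumed prefix."""
    return PREFIX + local_name


readable = expand("partOf")
opaque = expand("x7f3a91")  # made-up opaque identifier

# Labels a tool would have to fetch to show the opaque ID to a person.
labels = {opaque: "part of"}


def display(uri: str) -> str:
    """Prefer a stored label; otherwise fall back to the last URI segment."""
    return labels.get(uri, uri.rsplit("/", 1)[-1])


print(display(readable))  # falls back to the local name
print(display(opaque))    # readable only because a label was stored
```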

Again, I understand the reason for them. But is it worth the reduced
intuitiveness? Or the added complexity needed to retain a bit of it?

best,
Andrea
 

On 20 Jun 2011, at 19:26, Vagnoni,Matthew M wrote:

> Consider this: 
> 
> SELECT ?name
> WHERE { ?s :contains :MRN; a :Patient; rdfs:label ?name }
> 
> --VERSUS--
> 
> SELECT ?name
> WHERE {
>                ?s ?p ?o; rdfs:label ?name.
>                ?p rdfs:label "contains".
>                ?o rdfs:label "Master Record Number".
>                ?s a ?type.
>                ?type rdfs:label "Patient".
> }
> 
> Except the above would introduce the possibility of querying from the
> wrong graph (when importing multiple graphs), not be supported by
> autocomplete/lookup features, and result in a huge query performance
> degradation.  This also assumes you know the correct punctuation, spelling,
> etc.  Otherwise you can use a FILTER expression and REGEX, but that is even
> slower and requires even more typing.
> 
> OR
> 
> PREFIX mrn: <http://www.mdanderson.org/clinical#fge29430s>
> PREFIX contains: <http://www.mdanderson.org/clinical#fdk30929a>
> PREFIX patient: <http://www.mdanderson.org/clinical#jkl23902c>
> SELECT ?name
> WHERE { ?s contains: mrn:; a patient:; rdfs:label ?name }
> 
> This final method is the one being proposed.  As you can see, it just
> hides the problem: the semantic references become even more difficult to
> maintain, because they are stored as plain text in a file or SPARQL query.
> You cannot query for them or consult metadata about them.  It introduces
> more typing and provides no real improvement over the first case (using
> semantic identifiers), because you are still using semantic identifiers;
> instead of being URIs, they are now SPARQL prefixes!
> 
> Someone might come along still and use MRN to mean Magnetic Resonance
> Neuroimaging instead of Master Record Number, but at least they can easily
> find MRN in the model by doing a simple SPARQL query:
> SELECT ?p ?o
> WHERE {:MRN ?p ?o}
> 
> Or they can quickly jump to that concept by referring to the URI.  In
> TopBraid Composer there is a text box where you can enter a URI and it
> will automatically jump to that resource.
> 
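
The trade-off in the quoted message between matching on labels and matching
on identifiers can be sketched with a toy in-memory triple list. This is
only an illustration under stated assumptions: the URIs, the two "graphs",
and the helper functions are hypothetical, and a real triple store would
answer these lookups with SPARQL rather than list comprehensions.

```python
# Toy triples (subject, predicate, object). All URIs are hypothetical and
# stand in for two imported graphs that both define something labelled "MRN".
LABEL = "rdfs:label"
COMMENT = "rdfs:comment"
CLINICAL_MRN = "http://example.org/clinical#a1"
IMAGING_MRN = "http://example.org/imaging#b2"

triples = [
    (CLINICAL_MRN, LABEL, "MRN"),
    (CLINICAL_MRN, COMMENT, "Master Record Number"),
    (IMAGING_MRN, LABEL, "MRN"),
    (IMAGING_MRN, COMMENT, "Magnetic Resonance Neuroimaging"),
]


def subjects_with_label(label: str) -> list[str]:
    """Label-based lookup: every subject carrying exactly this label."""
    return sorted({s for s, p, o in triples if p == LABEL and o == label})


def describe(uri: str) -> list[tuple[str, str]]:
    """Identifier-based lookup: all (predicate, object) pairs of one subject."""
    return [(p, o) for s, p, o in triples if s == uri]


# The label alone cannot tell the two resources apart ...
print(subjects_with_label("MRN"))
# ... whereas the identifier selects exactly one of them.
print(describe(CLINICAL_MRN))
```

The label lookup returns both resources, which is the "wrong graph" risk
described above; the identifier lookup is unambiguous whether the
identifier itself is readable or opaque.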

Andrea Splendiani
Senior Bioinformatics Scientist
Centre for Mathematical and Computational Biology
+44(0)1582 763133 ext 2004
andrea.splendiani@bbsrc.ac.uk

Received on Monday, 20 June 2011 19:09:58 UTC