RE: LSIDs and ontology segmentation from Xiaoshu Wang on 2006-07-13 (public-semweb-lifesci@w3.org from July 2006)

From: Xiaoshu Wang <wangxiao@musc.edu>
Date: Thu, 13 Jul 2006 14:22:30 -0400
To: <public-semweb-lifesci@w3.org>
Message-ID: <001901c6a6a9$4d923870$4a741780@bioxiao>

Dear Mark,

> The problem with using document#fragment URLs to identify 
> ontology nodes is that the defined behaviour for resolving 
> such an identifier is to drop the fragment (since that isn't 
> available server-side anyway) and to return the entire 
> document... all 10Meg's of GO... each time...  We would 
> argue, therefore, that the URL (if you adopt its default
> behaviour) is not only a bit of a nuisance, it is a blocker 
> in some/many cases.

Again, this misconception comes from mixing a design issue with a naming
issue.

First, there is no requirement to use the #fragment identifier.  For
instance, Dublin Core uses "slash" IDs instead of "hash" IDs.  But
here, the "slash" vs. "hash" question is just an argument for the sake of
argument.
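
To make the "hash" vs. "slash" difference concrete, here is a minimal sketch
of which URL a client ends up requesting in each case.  The GO-style hash
identifier below is a made-up placeholder under example.org, not a real GO
URI, and request_target is a hypothetical helper name; the Dublin Core URI is
the real "title" term.

```python
from urllib.parse import urldefrag

# Hypothetical "hash" identifier for an ontology node (placeholder host).
hash_id = "http://example.org/ontology/go.owl#GO_0008150"
# "Slash" identifier used by Dublin Core for its "title" element.
slash_id = "http://purl.org/dc/elements/1.1/title"


def request_target(uri):
    """Return the URL a client dereferences once the fragment is stripped."""
    url, _fragment = urldefrag(uri)
    return url


# The hash identifier collapses to the whole document;
# the slash identifier is requested as-is.
print(request_target(hash_id))
print(request_target(slash_id))
```

With a hash ID, every node of the ontology maps to the same request; with a
slash ID, the server at least sees which node was asked for.  What it returns
for that node is still a design decision.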

If the designers of the GO think that each statement in the GO model is an
integral part of the entire model, then downloading the entire graph is what
you SHOULD do, because otherwise you will misinterpret their original intent.
If we all interpret an ontology only partially, the purpose of sharing an
ontology is lost and all the reasoners will be useless.

Thus, the debate should be about "whether GO should be designed and deployed
in its current way", not "whether we should name it this or that". As I have
repeatedly said on this mailing list, a big monolithic ontology should be
avoided.  But that is a design/deployment issue, not a naming issue.
 
> Here is where I think the LSID could really shine!  Unlike a 
> URL, the LSID does not have to return an entire document in 
> response to a getMetaData call.  Thus, if an LSID were used 
> as the identifier for an ontology node, the behaviour of the 
> getMetadata call could be, by convention or by standard, to 
> return only the relevant ontology fragment, where that 
> fragment was generated by e.g. the Rector Segmentation 
> generator in the background.

The LSID's getMetadata(lsid:x) only gives you a false solution, not an actual
one.  Assume a particular call returns a single RDF statement, for
instance,

lsid:x a exp:Foo.

What are you going to do with this statement?  Will you follow the URI of
exp:Foo again? If so, by what principle?  If it is a URL, the problem comes
back to you again.  Of course, you can argue that it should be made an LSID as
well, for instance,

lsid:x a lsid:Foo
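
As a rough sketch of why making everything an LSID does not by itself solve
the problem, consider a toy in-memory store where each identifier returns a
single statement.  Everything here (the STORE contents, lsid:Bar, lsid:Baz,
and the get_metadata and follow_all helpers) is invented for illustration and
is not the actual LSID resolution API.

```python
# Toy store: the statements a hypothetical getMetadata call would return
# for each identifier. All identifiers and statements are made up.
STORE = {
    "lsid:x": [("lsid:x", "rdf:type", "lsid:Foo")],
    "lsid:Foo": [("lsid:Foo", "rdfs:subClassOf", "lsid:Bar")],
    "lsid:Bar": [("lsid:Bar", "rdfs:subClassOf", "lsid:Baz")],
    "lsid:Baz": [],
}


def get_metadata(identifier):
    """Stand-in for a getMetadata call that returns only a small fragment."""
    return STORE.get(identifier, [])


def follow_all(start):
    """Resolve start, then every LSID that appears as an object."""
    seen, pending, statements = set(), [start], []
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        for subject, predicate, obj in get_metadata(current):
            statements.append((subject, predicate, obj))
            if obj.startswith("lsid:"):
                pending.append(obj)
    return seen, statements


resolved, statements = follow_all("lsid:x")
print(len(resolved), "identifiers resolved,", len(statements), "statements")
```

A client that follows every reference ends up walking the whole connected
graph one call at a time; a client that stops early needs some principle for
where to stop.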

Now, let's assume all URIs are LSIDs and, as you said, the "getMetadata call
could be, by convention or by standard, to return only the relevant ontology
fragment".  But who is to decide what the "relevant" fragment is?  Let's
say it is the client, so that it can call getMetadata( related_to_this ).
Then, before making a lsid:x->getMetadata(related_to_this) call, the client
must already know about exp:Foo->getMetadata(related_to_that).  So how much
prior knowledge does a client need in order to make the initial getMetadata()
call?

Suppose instead that it is the server that decides what the relevant fragment
is.  First, how can a server decide for the client?  Second, wouldn't that then
be the ontology designer's issue?

Xiaoshu
Received on Thursday, 13 July 2006 18:23:01 UTC