Re: LSID: What's still needed to make it work within the semantic web? from Eric Jain on 2005-03-15 (public-semweb-lifesci@w3.org from March 2005)

From: Eric Jain <Eric.Jain@isb-sib.ch>
Date: Tue, 15 Mar 2005 17:34:10 +0100
To: Eric.Neumann@sanofi-aventis.com
CC: public-semweb-lifesci@w3.org
Message-ID: <42370E82.1030400@isb-sib.ch>

Eric.Neumann@sanofi-aventis.com wrote:
> Much of the discussion was around 
> what still needs to be done with the specification, so that LSID's 
> become a beneficial and practical element of the life science community. 

In my opinion, what is standing in the way of widespread adoption is not 
any lack of features, but complexity that many may see as unnecessary.

>     * What metadata accessible through LSID should be standardized;

Preferably none :-) though a best-practices document might be a good idea.

>     * Are URN-aware resolvers an acceptable means for data retrieval for
>       all members of the life science community? Are there any
>       alternatives that are simpler?

The web service stuff that is part of the current specification adds a lot 
of complexity. This is not to say there wouldn't be any use for the web 
services approach, but in my opinion it shouldn't be part of the core 
specification. The availability of a simple, RESTful solution (based on 
HTTP redirection) would almost certainly improve adoption.
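
A minimal sketch of what redirect-based resolution could look like, assuming 
the usual urn:lsid:authority:namespace:object[:revision] syntax. The URL 
layout and the names parse_lsid and redirect_for are hypothetical, made up 
for illustration; nothing here is taken from the specification's resolution 
protocol.

```python
from typing import NamedTuple, Optional, Tuple


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(lsid: str) -> Lsid:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = lsid.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {lsid!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {lsid!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty LSID component in {lsid!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return Lsid(authority, namespace, object_id, revision)


def redirect_for(lsid: str) -> Tuple[int, str]:
    """Return the (status, Location) pair a simple resolver might answer with.

    Assumption: the authority doubles as the host name and the remaining
    components map directly onto path segments.
    """
    parsed = parse_lsid(lsid)
    url = f"http://{parsed.authority}/{parsed.namespace}/{parsed.object_id}"
    if parsed.revision is not None:
        url += f"/{parsed.revision}"
    return 302, url
```

A client would then need nothing beyond an ordinary HTTP library that 
follows redirects.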

>     * Guidelines for encoding data for common bioinformatics data types
>       in LSID; are we all clear what is data and what is metadata? Would
>       this include all kinds of RDF graphs that relate to the original
>       data item? Do we need best practices on utilizing common
>       ontologies such as GO within a data entry?

>     * How to specify Dynamic data (latest version) effectively (minimal
>       http calls of LSIDs) 

I would simply default to the latest version; most life science databases 
so far don't have any concept of versioned resources anyway...
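
As a rough sketch of "default to the latest version": the function below 
picks the newest entry when no revision is requested. Integer version 
numbers and the name select_version are assumptions for illustration only.

```python
from typing import Dict, Optional


def select_version(available: Dict[int, str], requested: Optional[int] = None) -> str:
    """Return the location of the requested version, or of the latest one.

    `available` maps a version number to wherever that version lives.
    """
    if not available:
        raise LookupError("no versions available")
    if requested is None:
        # No revision given: fall back to the highest version number.
        return available[max(available)]
    if requested not in available:
        raise LookupError(f"unknown version: {requested}")
    return available[requested]
```

A database without versioning would simply expose a single entry and never 
see the `requested` argument.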

Here are some properties we use that I would consider metadata: created, 
modified, replaces, curated, dataset, description, publisher, creator and 
rights. I'm not sure it makes sense to retrieve these alone. On the other 
hand, a machine-readable description of available formats and versions may 
be useful, but this doesn't necessarily have to be part of the resolution 
mechanism. Currently I simply include something like the following in the 
web page that is returned as the default representation:

   <link href="P12345.rdf" type="application/rdf+xml" rel="alternate"/>
   <link href="P12345.fasta" type="text/fasta" rel="alternate"/>

Not sure what to do about versions, though.
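
For what it's worth, a client could discover the alternate formats from 
link elements like the ones above with a few lines of standard-library 
Python. This is only a sketch; the names AlternateLinks and find_alternates 
are made up for illustration.

```python
from html.parser import HTMLParser
from typing import Dict


class AlternateLinks(HTMLParser):
    """Collect <link rel="alternate"> elements as a media type -> href map."""

    def __init__(self) -> None:
        super().__init__()
        self.formats: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        media_type = attributes.get("type")
        href = attributes.get("href")
        if "alternate" in rel_tokens and media_type and href:
            self.formats[media_type] = href


def find_alternates(html: str) -> Dict[str, str]:
    """Return the alternate representations advertised in an HTML page."""
    parser = AlternateLinks()
    parser.feed(html)
    parser.close()
    return parser.formats
```

The returned hrefs are relative, so they would still have to be resolved 
against the URL of the page they came from.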

Received on Tuesday, 15 March 2005 16:34:01 UTC