- From: Alan Ruttenberg <alanruttenberg@gmail.com>
- Date: Sat, 12 May 2007 08:07:40 -0600
- To: Eric Jain <Eric.Jain@isb-sib.ch>
- Cc: public-semweb-lifesci <public-semweb-lifesci@w3.org>
On May 12, 2007, at 4:49 AM, Eric Jain wrote:

> Alan Ruttenberg wrote:
>> Would it be possible to add a service so that I can get from the
>> lsid directly to rdf and xml versions at least? Would it be
>> correct to assume that all lsids in uniprot have such versions?
>
> The only common format in UniProt is RDF (e.g. there is no XML
> representation of the taxonomy data).
>
> <http://beta.uniprot.org/uniprot/P12345> could return different
> formats based on the "Accept" header, however this would complicate
> caching...

I don't like content negotiation either.

> Another option (which would also allow you to link to rather than
> retrieve a specific representation) would be an optional "format"
> parameter.

This is a reasonable choice. However, I think that there ought to be a
link based on the identifier. Ideally the identifier itself would be
resolvable. I'm not sure what you mean by being able to link rather than
retrieve.

>> Are the LSIDs supposed to be able to be resolved by an lsid
>> resolver? If so is there one that ebi runs that I could play with?
>
> None that I'm aware of, and I'm afraid setting up a "correct"
> resolver that behaves as required by the specs would be difficult,
> if not impossible :-(
>
> The question is, what is worse crime against humanity: Misusing an
> existing scheme, or inventing your own :-)

My gut is that the former is worse. We want to present people with
predictable systems. When an existing system is misused, people expect
that they can go to the specification and figure out what to do, and
that turns out not to be the case. This reduces their confidence both in
the specification and in the provider who misuses it.

>> I might suggest the following:
>> http://beta.uniprot.org/uniprot/what/urn:lsid:uniprot.org:uniprot:P12345
>> return some rdf that lists the specific formats that resource is
>> available in, and urls where they can be fetched from?
>> Or if you have some simple rules for forming the URLs, could you
>> share those?
>
> The simple rule is to append .ext to the URL of the resource, where
> "ext" is rdf|xml|fasta|txt|...

If the URL of the resource was based on the LSID then that would be a
reasonable solution. It's rather complicated to have to fetch a page
using the LSID, figure out what it redirected to, and only then do the
rewrite. It also doesn't always work: e.g.

http://beta.uniprot.org/?query=urn:lsid:uniprot.org:core:Citation_Statement

goes to

http://dev.isb-sib.ch/projects/uniprot-rdf/owl/citation_statement.html

but

http://dev.isb-sib.ch/projects/uniprot-rdf/owl/citation_statement.rdf

doesn't yield rdf.

BTW, I just tried the following and it gives an error.

http://beta.uniprot.org/?query=urn:lsid:uniprot.org:annotation:PRO_0000123886

>> Do you assign LSIDs to those resources too? If so is there a way
>> to figure out which are "yours" and which are "theirs"?
>
> One of the main reasons for using LSIDs was that I need proper URIs
> for all the resources we reference, and most resources have two
> meter long, frequently changing cgi-bin URLs (OK, I'm exaggerating,
> but not much).

This is a nasty problem which any solution should deal with. I think
there is a better way, however, than having URIs that one can't reliably
resolve. Here is the solution we've been prototyping for the HCLS demo:

http://sw.neurocommons.org/2007/uri-explanation.html

We've got the basic framework up and are working to fill in all the
redirections. We also need to do a little more work to have the
"abstract records" return rdf metadata explaining where each of the
concrete instances live. I was asking the question so that we could add
Uniprot to the prototype. Already we have, e.g.
http://purl.org/commons/record/uniprotkb/P12345

which would be a reasonable alternative to using

urn:lsid:uniprot.org:uniprot:P12345

With this scheme, we are able to have unambiguous URLs that all resolve
to the resources they are intended to refer to (or via a 303 explain why
they can't). Because purl.org is a redirect service, we can adjust the
URLs that are redirected to if the underlying database changes location.
A suggestion by Mark Wilkinson was that we also make available the
rewrite rules as rdf so that agents that want to avoid the redirection
know how to do the rewrites in their application.

The administration of the redirections would be set up so as to be under
the control of the community. Science Commons volunteers to do the
initial grunt work and ongoing administration. The idea, though, would
be to set up some organization that gives access to responsible members
of the community, so that we don't get into a situation where we are
dependent on any individual organization.

If you have some time, perhaps we could talk offline about this...

Regards,
Alan

>
> Moreover, what is "ours" and what is "theirs" isn't always clear
> (again, consider the taxonomy data, which is basically the NCBI
> taxonomy), though in general if it resolves to one of our servers,
> then it's probably ours.
>
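For illustration only, the two URL-forming rules discussed in this message (appending ".ext" to a resource URL, and the purl.org record pattern) could be sketched in Python roughly as follows. The function names are hypothetical and not part of any UniProt or Science Commons service. The set of extensions is limited to the four named in the quoted rule, and the purl.org pattern is inferred from the single example above, so treat both as assumptions rather than a specification.

```python
from urllib.parse import quote

# Extensions named in the quoted rule; the trailing "..." in the original
# is left open, so this set is deliberately incomplete (assumption).
KNOWN_EXTENSIONS = {"rdf", "xml", "fasta", "txt"}


def parse_lsid(lsid):
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = lsid.split(":")
    is_lsid = (
        len(parts) in (5, 6)
        and parts[0].lower() == "urn"
        and parts[1].lower() == "lsid"
    )
    if not is_lsid:
        raise ValueError(f"not an LSID: {lsid!r}")
    keys = ("authority", "namespace", "object", "revision")
    # zip stops at the shorter sequence, so "revision" only appears if given.
    return dict(zip(keys, parts[2:]))


def with_format(resource_url, ext):
    """Apply the 'append .ext to the URL of the resource' rule."""
    if ext not in KNOWN_EXTENSIONS:
        raise ValueError(f"extension not in the known set: {ext!r}")
    return f"{resource_url.rstrip('/')}.{ext}"


def purl_record(database, accession):
    """Build a record URL following the single purl.org example (assumed pattern)."""
    return (
        "http://purl.org/commons/record/"
        f"{quote(database, safe='')}/{quote(accession, safe='')}"
    )


if __name__ == "__main__":
    lsid = parse_lsid("urn:lsid:uniprot.org:uniprot:P12345")
    print(lsid["object"])
    print(with_format("http://beta.uniprot.org/uniprot/" + lsid["object"], "rdf"))
    print(purl_record("uniprotkb", lsid["object"]))
```

Note that the LSID namespace ("uniprot") and the purl.org database segment ("uniprotkb") differ in the examples above, so any mapping between the two would have to come from published rewrite rules rather than from string manipulation alone.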
Received on Saturday, 12 May 2007 14:05:48 UTC