- From: Balaji S. Srinivasan <balajis@stanford.edu>
- Date: Sun, 15 Jul 2007 22:32:08 -0700
- To: public-semweb-lifesci <public-semweb-lifesci@w3.org>
> Yes, but how will we handle the case where some set of people make
> statements with the subject being
> http://beta.uniprot.org/entry/P12345 and another set makes
> statements about http://uniprot.org/entry/P12345. They are really
> talking about the same subject, but our semantic web agent won't
> know that. If we had used the PURL, then we wouldn't have a problem.

One solution is to have a "freshen_rdf" script that periodically goes through an RDF file or triplestore, does an HTTP GET on each unique URI, and updates the URI if it has been 301 redirected to a new location. People are probably going to end up doing this periodically anyway, in order to validate that each URI points to a resolvable resource before doing anything nontrivial with the triplestore.

Now, a naive GET on every URI might take some time, but it could be made more efficient by first resolving the namespace declarations at the beginning of the RDF file. For each namespace, such as beta.uniprot.org, you do one GET to see whether any 301 redirects have been set up.

Perhaps the cleanest way to do this is for the EBI people to have metadata at "http://beta.uniprot.org/uniprot/redirect.rdf" (or a similar URI) which contains a set of triples with redirect information. This might be as simple as a rewriting regex. If it's just a regex, then you can apply it to quickly freshen all the URIs from this namespace without having to do an HTTP GET on each of them. Alternatively, that redirect.rdf file might contain a table of "sameAs" mappings which, again, can be used to freshen the URIs in your triplestore.

-- 
Balaji S. Srinivasan, Ph.D.
Stanford University
Lecturer, Depts. of Statistics and Computer Science
318 Campus Drive, Clark Center S251
(650) 380-0695
balajis@stanford.edu
http://jinome.stanford.edu

On Jul 15, 2007, at 9:34 PM, Alan Ruttenberg wrote:
>
> On Jul 15, 2007, at 1:53 PM, Eric Jain wrote:
>> Alan Ruttenberg wrote:
>>> The point of having the PURLs is to ensure that there is a
>>> mechanism for handling three cases that LSIDs were intended to
>>> address (but which can be addressed without the trouble of
>>> introducing a separate resolving mechanism)
>>> 1) To be immune from the "actual URL of the representation"
>>> changing. (e.g. beta.uniprot.org goes out of beta)
>> 1) We'll do a 301 "permanent" redirection, promise.
>
> Yes, but how will we handle the case where some set of people make
> statements with the subject being
> http://beta.uniprot.org/entry/P12345 and another set makes
> statements about http://uniprot.org/entry/P12345. They are really
> talking about the same subject, but our semantic web agent won't
> know that. If we had used the PURL, then we wouldn't have a problem.
>
> Comments to your other points in separate email.
>
> -Alan
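
The "freshen_rdf" idea in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the regex rule, and the shape of the sameAs table are all hypothetical, since no format for redirect.rdf has been defined. It shows the three variants mentioned: a per-URI check for a 301, a namespace-level rewriting regex, and a lookup table of mappings.

```python
import re
import urllib.error
import urllib.request
from urllib.parse import urljoin


class _NoFollow(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response,
        # so the caller can inspect the status code and Location header.
        return None


def permanent_redirect_target(uri, timeout=10.0):
    """Return the Location of a 301 response for `uri`, or None otherwise.

    Network failures other than HTTP error statuses propagate as URLError.
    """
    opener = urllib.request.build_opener(_NoFollow)
    try:
        with opener.open(uri, timeout=timeout):
            return None  # resolved directly, nothing to freshen
    except urllib.error.HTTPError as err:
        location = err.headers.get("Location")
        if err.code == 301 and location:
            return urljoin(uri, location)  # Location may be relative
        return None


def freshen_with_lookup(uris, lookup=permanent_redirect_target):
    """Naive variant: one lookup per unique URI, cached across repeats."""
    cache = {}
    fresh = []
    for uri in uris:
        if uri not in cache:
            cache[uri] = lookup(uri) or uri
        fresh.append(cache[uri])
    return fresh


def freshen_with_regex(uris, pattern, replacement):
    """Namespace variant: apply one rewriting regex, no HTTP requests."""
    rule = re.compile(pattern)
    return [rule.sub(replacement, uri, count=1) for uri in uris]


def freshen_with_table(uris, same_as):
    """Table variant: `same_as` maps old URIs to their current equivalents."""
    return [same_as.get(uri, uri) for uri in uris]


if __name__ == "__main__":
    stale = ["http://beta.uniprot.org/entry/P12345"]
    # Hypothetical rule, standing in for whatever redirect.rdf would publish.
    print(freshen_with_regex(stale, r"^http://beta\.uniprot\.org/",
                             "http://uniprot.org/"))
```

The regex and table variants touch no network at all, which is the efficiency argument made above: one fetch of the namespace-level metadata replaces a GET per URI. The per-URI variant remains useful as a fallback for namespaces that publish no such metadata.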
Received on Monday, 16 July 2007 14:04:55 UTC