- From: Booth, David (HP Software - Boston) <dbooth@hp.com>
- Date: Fri, 2 Feb 2007 01:47:32 -0500
- To: "Jonathan Rees" <jonathan.rees@gmail.com>, "public-semweb-lifesci" <public-semweb-lifesci@w3.org>
- Cc: "Susie Stephens" <susie.stephens@oracle.com>
Re: http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Documents?action=AttachFile&do=get&target=getting-information.txt

My overall comment: Yes!  I believe a URI resolution ontology could
significantly help address these problems, while still permitting URIs
to be based on the http scheme, thus facilitating bootstrapping and
minimizing barriers to adoption.  Some specific comments follow.

> URI Resolution: Finding Information About a Resource
> Jonathan Rees, Alan Ruttenberg, Matthias Samwald
> . . .
>
> Problem statement
> . . .

Nice problem description!

> What is the received wisdom?
>
> - Don't mint non-URL URI's. (TimBL)
>   [good as far as it goes, but we may not be in a
>   position to choose]

Meaning what?  Others might ignore this advice?  I think we should
still advise what we think is best.

> - Mint URL's whose hostname specifies a long-lived server that will
>   maintain the resource at the given URL in perpetuity.  (Publishers,
>   libraries, and universities are in good positions to do this.)
>   [good as far as it goes, but the user may not be in control, or may
>   find quality name management to be beyond his/her grasp]

Good, but of course the long-lived server could also host a pointer to
the resource, and perhaps some other metadata about it, rather than
the resource itself.

> - Use a web cache such as Apache or Squid, and a proxy configuration
>   on the client, to deliver the correct content when a URL is
>   presented that can't or shouldn't be used directly.  (Dan Connolly)
>   [this is a possible solution... see below]

Nice idea!  This sounds functionally equivalent to a special-purpose
protocol resolver, except that the resolving smarts are factored out
of the client/agent that is requesting the data, which seems to me
like a distinct advantage.

> - Use LSID's.  LSID resolvers are very similar to web caches in that
>   an intermediate server is deployed to map URIs.
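To make the mapping step concrete, here is a minimal sketch, in
Python, of the URI rewriting that such an intermediate (whether a
Squid-style proxy or an LSID resolver) would perform.  The prefix
table and resolver URL below are invented purely for illustration:

```python
# Hypothetical sketch: an intermediate maps URIs it recognizes onto a
# resolving server's URL and passes everything else through unchanged.
# The prefix and resolver template are made-up examples, not real services.

PREFIX_TABLE = {
    # recognized URI prefix   -> template for the resolving server
    "http://lsid.example.org/": "http://resolver.example.org/authority/{rest}",
}

def rewrite(uri: str) -> str:
    """Return the URL the intermediate would actually fetch for this URI."""
    for prefix, template in PREFIX_TABLE.items():
        if uri.startswith(prefix):
            return template.format(rest=uri[len(prefix):])
    return uri  # no rule matched: fall through to plain HTTP

print(rewrite("http://lsid.example.org/ncbi/gi/12345"))
# -> http://resolver.example.org/authority/ncbi/gi/12345
print(rewrite("http://example.org/ordinary/page"))
# -> http://example.org/ordinary/page (passed through unchanged)
```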
>   [requires maintenance of an LSID resolver; not all problematic
>   URI's are LSID's]

I think you should separate the question of using LSIDs (as
identifiers) from the question of using LSID resolution (as a
protocol).  There is no need to use LSIDs (as identifiers) in order to
use LSID resolution.  As I have described in "Converting New URI
Schemes or URN Sub-Schemes to HTTP"
http://dbooth.org/2006/urn2http/ , you can instead use specialized
http prefixes that are resolved using LSID resolvers by agents that
know about LSID resolution, and resolved using HTTP by other agents as
a fallback.  This allows good old HTTP to act as a best-attempt
bootstrapping mechanism for locating basic metadata about the resource
(and potentially about LSID resolution).

> - If the type of the representation is unusable, use content
>   negotiation and/or GRDDL to get the right type of resource.
>   [can Alan say more about why he dislikes content negotiation?]

I admit that I am much more drawn to the explicitness of GRDDL than
the invisible hokey-pokey of content negotiation.  However, I'm not
sure that I understand the intent of this item.  If you are merely
talking about equivalent data/metadata being served using different
media types, then content negotiation seems fine.  But if you are
talking about the representation being unusable because it contains
different information than what you need (e.g., you need more
metadata), then GRDDL sounds more appropriate.

> . . .
> - To relate a non-information-resource to information about it,
>   mint URI's of the form http://example.org/foo#bar to name the
>   resource, with the convention that the URI http://example.org/foo
>   will name an information resource that describes it.
>   [obscure hack, probably too late to take hold, e.g.
>   ontology http://xmlns.com/foaf/0.1/ doesn't use #]

I'm surprised to see this characterized as an "obscure hack", since I
thought it was accepted practice in the RDF world.

> What would a good solution be like?
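For what it's worth, the # convention discussed above is mechanically
trivial for an agent to apply.  A minimal sketch, using only the
Python standard library (nothing is assumed beyond the convention
itself):

```python
from urllib.parse import urldefrag

def describing_document(uri: str) -> str:
    """Drop the fragment: by the convention being discussed, the
    resulting URI names an information resource that describes the
    original (non-information) resource."""
    document, _fragment = urldefrag(uri)
    return document

print(describing_document("http://example.org/foo#bar"))
# -> http://example.org/foo
```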
>
> Observation: We need information in order to find information.
> . . .

Good.

> Proposal: A URI resolution ontology.

Yes!

> . . .
> . Retrieval methods: direct; URI transformation; SPARQL; web
>   service

Yes, also: rules for associating specialized http URI prefixes with
special protocol resolvers, such as LSID resolvers, as described in
http://dbooth.org/2006/urn2http/#multiple-owners .

> . . .
> - Disadvantage: you need an OWL engine to interpret resolution
>   information represented in this way, and not all applications have
>   an OWL engine.  [so why not get one and link it in?]

Perhaps a proxy such as Squid could do this automatically.  So if a
client/agent discovers a new URI and does an HTTP GET on it (through
the proxy) to learn about the associated resource, then the response
could include some OWL resolution information intended for the proxy.
Of course, the proxy would have to be able to recognize when it should
intercept this information as opposed to merely passing it through to
the client/agent.  I suppose this might be done using HTTP headers,
but perhaps there are other ways it could be done that would require
less server configuration.  Would there be safe ways for the proxy to
intercept this information if it is in the body of the response?
Sniffing the body seems risky.

David Booth
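P.S.  To make the resolution-ontology idea slightly more concrete,
here is a rough sketch of how an agent might dispatch on the retrieval
methods listed above.  The record keys ("method", "template",
"endpoint") are invented for illustration only; the real ontology
would define the vocabulary:

```python
from urllib.parse import quote

def resolve(uri, description):
    """Given a resolution description for a URI (keys are hypothetical),
    return the URL an agent should dereference to obtain information
    about the resource."""
    method = description["method"]
    if method == "direct":
        return uri  # plain HTTP GET on the URI itself
    if method == "uri-transformation":
        return description["template"].format(uri=uri)
    if method == "sparql":
        # the agent would send DESCRIBE <uri> to the SPARQL endpoint
        query = "DESCRIBE <" + uri + ">"
        return description["endpoint"] + "?query=" + quote(query)
    raise ValueError("unknown retrieval method: " + method)

desc = {"method": "uri-transformation",
        "template": "http://resolver.example.org/describe?uri={uri}"}
print(resolve("http://example.org/term/42", desc))
# -> http://resolver.example.org/describe?uri=http://example.org/term/42
```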
Received on Friday, 2 February 2007 06:48:53 UTC