- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Mon, 27 May 2002 11:42:39 +0300
- To: ext Graham Klyne <GK@ninebynine.org>, Mark Baker <distobj@acm.org>
- CC: "Sean B. Palmer" <sean@mysterylights.com>, www talk <www-talk@w3.org>
On 2002-05-23 20:46, "ext Graham Klyne" <GK@ninebynine.org> wrote:

> At 12:58 PM 5/23/02 -0400, Mark Baker wrote:
>> On Thu, May 23, 2002 at 04:47:00PM +0100, Graham Klyne wrote:
>>> At 03:43 PM 5/23/02 +0100, Sean B. Palmer wrote:
>>>> The RDF Core WG would certainly want the SW and WWW to be
>>>> interoperable, and yet after repeated debates spurred mainly by Aaron,
>>>> fragIDs in RDF haven't been deprecated. That speaks volumes.
>>>
>>> As one who used to think that fragids in RDF were broken...
>>>
>>> I've been thinking about this point some, and I'm coming round to a view
>>> that fragid's are not only OK with RDF, but their use is to be preferred
>>> for many RDF resources, and that the SW/WWW integration can work just
>>> fine. I've not yet had time to sit down and straighten my thoughts ... too
>>> many other things to do!
>>
>> Process foul! 8-) You can't do that. We need reasons, damnit!
>
> Er, you're right. This will be very sketchy:
>
> 1. The interpretation of a fragment identifier depends on the MIME type of
> the representation it's applied to.
>
> 2. URIs without fragment identifiers are generally presumed to map to some
> resource for which a Web representation (or several) can be retrieved.
>
> 3. RDF uses URI-references to denote things that aren't necessarily
> web-retrievable.
>
> I think so far is pretty standard stuff.
>
> The difficulty with someurl#frag in RDF arises when you say that this is
> interpreted by:
> (a) dereferencing 'someurl'.
> (b) interpreting #frag according to what you get back.
> This doesn't work well for RDF, because different MIME types can be
> returned, with different interpretations of the fragment identifier, where
> RDF requires that a URI ref have just one denotation under any given
> interpretation.
>
> So my approach for interpreting someurl#frag (and this is largely inspired
> by comments from TimBL and Pat Hayes, though any errors are of course all
> mine) is this:
>
> (A) *assume* that 'someurl' indicates a resource which has an RDF
> representation. (If it's not dereferencable as such on the web, so be it,
> but I must assume its notional existence)
>
> (B) when used in an rdf document, 'someurl#frag' means the thing that is
> indicated, according to the rules of application/rdf+xml mime type as a
> "fragment" or "view" of the RDF document at 'someurl'. If the document
> doesn't exist, or can't be retrieved, then exactly what that view may be is
> somewhat undetermined, but that doesn't stop us from using RDF to say
> things about it.
>
> (C) the RDF interpretation of a fragment identifier allows it to indicate a
> thing that is entirely external to the document, or even to the "shared
> information space" known as the Web. That is, it can be an abstract idea,
> like my cat or DanC's car.
>
> (D) So any RDF document acts as an intermediary between web retrieval
> documents (itself, at least, and also any other web-retrievable URIs that
> it may use, including schema and references to other RDF documents) and
> some set of abstract or non-Web entities that it may describe.
>
> That's it. I think it's consistent with all the conventional web axioms,
> but it also provides an handling of URIrefs and their denotation that is
> consistent with the RDF model theory and usage. The "stretch", if there is
> one, is that it somewhat extends the idea of a "fragment" or "view" beyond
> the conventional idea that it's a physical part of a containing document.
>
> If you accept this, then it becomes natural to take a view that URIs
> without fragment identifiers _should_ be reserved for indicating
> web-retrievable resources (when used in RDF), which is something TimBL has
> promoted.
> This goes against quite a lot of actual RDF usage (mine
> included) so I don't think we can be too strict about that, but it seems a
> reasonable principle to aim for.
>
> It also suggests a possible answer to the question about the web and
> URIs. It is sometimes claimed that to be on the web means to have a
> URI. So are people and cats and dogs and cars "on the web"? If I clarify
> the definition of "on the web" to not include things that have URI
> references, then the answer to that question can be "no". But using RDF,
> we are still free to talk about these things without actually having to
> claim that they are "on the web", by using URI-references rather than "1st
> class" URIs.

All in all I can accept this point of view as reasonable and workable,
with two exceptions or caveats (and I appreciate that your comments
were offered off-the-cuff and quickly -- so feel free not to respond
if any of the following is off the mark from your actual views):

1. I wouldn't presume to require, for every uriref someuri#frag that is
used to denote a resource in RDF, that someuri resolve to a
representation of an RDF instance. The real requirement is simply that
it consistently resolve to an instance of the same MIME type, such that
the fragment identifier has a consistent interpretation in all cases.
Yes, that's more difficult to determine/ensure, but that's really what
the true requirement distills down to, I think.

2. I'm not comfortable with the very last comment, which seems to
suggest that "1st class" URIs would not be used to denote things which
are not "on the web". Whether you have foo://bar#cat or foo://bar/cat
in no way determines whether the thing is "on the web" and a
representation of it is obtainable.
This is perhaps the primary point of friction between the needs of
"traditional" web applications, which are concerned with stuff that is
web accessible, and newer semantic web applications which, in addition
to being concerned with stuff that is web accessible, are also
concerned with a lot of stuff that is not web accessible, either
because it's not digital, or because it is abstract.

The question about whether a thing is "on the web" (has an accessible
representation) or not "on the web", and whether that distinction can
be determined from the URI or URIref itself is, I think, pivotal, and
one that needs more attention and hopefully some resolution in the not
so distant future.

The present web architecture, insofar as I can see, does not provide
a clear and consistent answer to this. A 404 error seems the closest
we can get, but that doesn't really tell us whether the resource is
not "on the web" versus "on the web" but not presently accessible.

There seem to be two approaches to making this distinction explicit:

1. On a per-instance basis, by defining in some manner metadata about
the resource denoted by the URIref which clarifies whether it is
web accessible or not.

2. On a per-class basis, by defining for the URI scheme or URI class
whether instances of that scheme or class denote resources which
are or are not on the web (e.g. [1], [2]).

Both have advantages. The former in terms of flexibility. The latter
in terms of economy.

Rather than trying to make it an either-or choice which is unlikely to
be resolved by any amount of discussion or debate, perhaps we should
provide for both.

In conjunction with specific URI schemes or classes which provide as
part of their semantics whether the resources they denote are or are
not "on the web", we could also define a new set of HTTP response
codes, e.g.
6xx, which indicate "the resource denoted by the URI attempted to be
dereferenced is not web-accessible", and where the particular codes
indicate the nature of the actual response, which could be various
degrees and/or types of metadata known about the resource, e.g.

   600  No further information available about resource
   601  Summary of information known about resource (RDF encoded)
   602  Listing of servers hosting information about resource (RDF encoded)
   ...

etc.

Thus, a 4xx response truly means that the resource is known or
presumed to be web accessible, and the server failed to provide a
representation for it -- whereas a 6xx response makes it clear that
one cannot obtain any representation for the resource in question (at
least insofar as the particular server is concerned).

In addition to the above, add an HTTP method such as "INFO" which,
even for web-accessible resources, would force a 6xx response from the
server, enabling one to obtain knowledge about any arbitrary resource
whether it was web accessible or not. Of course, the "HEAD" method
theoretically could be used, but it (a) would still perhaps confuse
web accessible versus non accessible resources and (b) would not
provide for the richness of RDF for capturing the knowledge associated
with a resource.

The benefit of this particular approach is that, based on specific
classes and schemes of URIs, or based on per-instance knowledge, a
server can respond usefully to attempts to dereference a URI which
denotes a resource which is not web-accessible, and then notify a
client accurately of the nature of the resource, providing useful
information to the client about the resource or where such information
could be obtained.

Thus, whether each URI is qualified individually as to its nature of
accessibility (which would be required for e.g.
http: URIs denoting non web accessible resources) or whether qualified
by URI scheme or URI class [1], [2] would be up to the creator of the
URI and boils down to a simple matter of flexibility versus economy.
The web architecture itself would remain agnostic about it, but still
provide that critical distinction regarding accessibility required for
the next generation of semantic web agents.

Cheers,

Patrick

[1] http://ietf.org/internet-drafts/draft-pstickler-voc-01.txt
[2] http://ietf.org/internet-drafts/draft-pstickler-uri-taxonomy-00.txt

--
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
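
The 4xx/6xx distinction and the "INFO" method proposed in this message
can be sketched roughly in Python. This is only an illustration under
stated assumptions: the 6xx codes and INFO are the proposal made above
and are not part of HTTP, and the names choose_status, describe and the
boolean parameters are hypothetical, introduced here for the example.

```python
# Sketch of the proposal only: 6xx codes and the INFO method are NOT part of HTTP.
PROPOSED_6XX = {
    600: "No further information available about resource",
    601: "Summary of information known about resource (RDF encoded)",
    602: "Listing of servers hosting information about resource (RDF encoded)",
}


def choose_status(method: str, on_web: bool, has_representation: bool,
                  has_summary: bool) -> int:
    """Pick a status code for a request (hypothetical helper)."""
    # INFO forces a 6xx answer even for web-accessible resources;
    # any request for a resource that is not on the web also gets 6xx.
    if method == "INFO" or not on_web:
        return 601 if has_summary else 600
    # On the web: 200 if a representation can be served, else an ordinary 4xx.
    return 200 if has_representation else 404


def describe(code: int) -> str:
    """Explain what a client could conclude from a status code."""
    if 600 <= code < 700:
        return "not web-accessible: " + PROPOSED_6XX.get(code, "unspecified")
    if 400 <= code < 500:
        return "presumed web-accessible, but no representation was provided"
    if 200 <= code < 300:
        return "web-accessible, representation returned"
    return "outside the scope of this sketch"
```

With this sketch, a GET for a resource that is not on the web but has
known metadata yields 601, while a GET for a missing web document still
yields 404, so a client can tell the two cases apart.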
Received on Monday, 27 May 2002 04:38:52 UTC