Re: HTTP, URIrefs and resources not "on the web" from Patrick Stickler on 2002-05-30 (www-talk@w3.org from May to June 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Thu, 30 May 2002 09:29:11 +0300
To: ext Graham Klyne <GK@ninebynine.org>
CC: www talk <www-talk@w3.org>
Message-ID: <B91BA167.15B21%patrick.stickler@nokia.com>

On 2002-05-29 13:42, "ext Graham Klyne" <GK@NineByNine.org> wrote:

> (Patrick, this response is delayed because I was waiting for permission to
> post some off-list messages from Pat Hayes to WWW-Archive)
> 
> At 11:42 AM 5/27/02 +0300, Patrick Stickler wrote:
>> On 2002-05-23 20:46, "ext Graham Klyne" <GK@ninebynine.org> wrote:
> 
> [...]
> 
>>> Er, you're right.  This will be very sketchy:
>>> 
>>> 1. The interpretation of a fragment identifier depends on the MIME type of
>>> the representation it's applied to.
>>> 
>>> 2. URIs without fragment identifiers are generally presumed to map to some
>>> resource for which a Web representation (or several) can be retrieved.
>>> 
>>> 3. RDF uses URI-references to denote things that aren't necessarily
>>> web-retrievable.
>>> 
>>> I think so far is pretty standard stuff.
>>> 
>>> The difficulty with someurl#frag in RDF arises when you say that this is
>>> interpreted by:
>>> (a) dereferencing 'someurl'.
>>> (b) interpreting #frag according to what you get back.
>>> This doesn't work well for RDF, because different MIME types can be
>>> returned, with different interpretations of the fragment identifier, where
>>> RDF requires that a URI ref have just one denotation under any given
>>> interpretation.
>>> 
>>> So my approach for interpreting someurl#frag (and this is largely inspired
>>> by comments from TimBL and Pat Hayes, though any errors are of course all
>>> mine) is this:
>>> 
>>> (A) *assume* that 'someurl' indicates a resource which has an RDF
>>> representation.  (If it's not dereferencable as such on the web, so be it,
>>> but I must assume its notional existence)
>>> 
>>> (B) when used in an rdf document, 'someurl#frag' means the thing that is
>>> indicated, according to the rules of application/rdf+xml mime type as a
>>> "fragment" or "view" of the RDF document at 'someurl'.  If the document
>>> doesn't exist, or can't be retrieved, then exactly what that view may be is
>>> somewhat undetermined, but that doesn't stop us from using RDF to say
>>> things about it.
>>> 
>>> (C) the RDF interpretation of a fragment identifier allows it to indicate a
>>> thing that is entirely external to the document, or even to the "shared
>>> information space" known as the Web.  That is, it can be an abstract idea,
>>> like my cat or DanC's car.
>>> 
>>> (D) So any RDF document acts as an intermediary between web retrieval
>>> documents (itself, at least, and also any other web-retrievable URIs that
>>> it may use, including schema and references to other RDF documents) and
>>> some set of abstract or non-Web entities that it may describe.
>>> 
>>> That's it.  I think it's consistent with all the conventional web axioms,
>>> but it also provides a handling of URIrefs and their denotation that is
>>> consistent with the RDF model theory and usage.  The "stretch", if there is
>>> one, is that it somewhat extends the idea of a "fragment" or "view" beyond
>>> the conventional idea that it's a physical part of a containing document.
>>> 
>>> If you accept this, then it becomes natural to take a view that URIs
>>> without fragment identifiers _should_ be reserved for indicating
>>> web-retrievable resources (when used in RDF), which is something TimBL has
>>> promoted.  This goes against quite a lot of actual RDF usage (mine
>>> included) so I don't think we can be too strict about that, but it seems a
>>> reasonable principle to aim for.
>>> 
>>> It also suggests a possible answer to the question about the web and
>>> URIs.  It is sometimes claimed that to be on the web means to have a
>>> URI.  So are people and cats and dogs and cars "on the web"?  If I clarify
>>> the definition of "on the web" to not include things that have URI
>>> references, then the answer to that question can be "no".  But using RDF,
>>> we are still free to talk about these things without actually having to
>>> claim that they are "on the web", by using URI-references rather than "1st
>>> class" URIs.
>> 
>> All in all I can accept this point of view as reasonable and workable,
>> with two exceptions or caveats (and I appreciate that your comments
>> were offered off-the-cuff and quickly -- so feel free not to respond
>> if any of the following is off the mark from your actual views):
> 
> Thanks!
> 
>> 1. I wouldn't presume to require every uriref someuri#frag
>> that is used to denote a resource in RDF to require that
>> someuri resolve to a representation of an RDF instance.
> 
> Thus far, I agree.  That's what I tried to say.
> 
>>  The
>> real requirement is simply that it consistently resolve to
>> an instance of the same MIME type such that the fragment
>> identifier has a consistent interpretation in all cases.
>> Yes, that's more difficult to determine/ensure, but that's
>> really what the true requirement distills down to, I think.
> 
> My point here was that when used within RDF, the #fragid would be presumed
> to be interpreted as if with respect to the RDF MIME type -- I think that's
> needed for consistency within RDF.  If there's also an
> application/mydatatype with its own interpretation of fragments, there's no
> reason that a particular use of RDF shouldn't align its use of fragments
> accordingly.
> 
> Example:  an HTML document http://www.example.org/doc may contain chapters
> written by different authors.  An RDF document can still make statements like:
> 
>  <http://www.example.org/doc#chap1> dc:author "First author" .
>  <http://www.example.org/doc#chap2> dc:author "Second author" .
> 
> etc.  This usage presumes the (possible or notional) existence of an RDF
> document representing the same resource, in which statements about the
> fragments are made accordingly; e.g.
> 
>  <http://www.example.org/doc#chap1> ex:isPartOf
> <http://www.example.org/doc> .
>  <http://www.example.org/doc#chap2> ex:isPartOf
> <http://www.example.org/doc> .
> 
> (for some appropriate interpretation of ex:isPartOf).
> 
> See also some comments by Pat Hayes (from a private exchange, posted to
> www-archive with permission):
> - http://lists.w3.org/Archives/Public/www-archive/2002May/0018.html
> - http://lists.w3.org/Archives/Public/www-archive/2002May/0019.html
> - http://lists.w3.org/Archives/Public/www-archive/2002May/0020.html
> and also:
> - http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Feb/0494.html

Right. I'll have a look. Thanks.

> 
>> 2. I'm not comfortable with the very last comment, which seems to suggest
>> that "1st class" URIs would not be used to denote things which are not
>> "on the web". Whether you have foo://bar#cat or foo://bar/cat in no
>> way determines whether the thing is "on the web" and a representation
>> of it is obtainable. This is perhaps the primary point of friction
>> between the needs of "traditional" web applications which are concerned
>> with stuff that is web accessible, and newer semantic web applications
>> which, in addition to being concerned with stuff that is web accessible,
>> are also concerned with a lot of stuff that is not web accessible, either
>> because it's not digital, or because it is abstract.
> 
> My comments about "on the web" were definitely half-baked.
> 
> Yes, current usage does rather go against this.  There is an issue to be
> squared here:  if a URI <foo://bar/cat> is used to describe an abstract
> concept, and subsequently a document is put on the web at that URL, how are
> these related?  What does the URL refer to?

Well, that is just a delayed error similar to using a mailto: URI to
denote both a mailbox and a person.

It is simply an error for a URI to denote two things, and if a URI
denotes a non-web-accessible resource, then it should be considered
an error if that URI actually resolves to a representation of some
resource. (This also applies to namespace URIs, but I'm willing to
"look the other way" in their case given their special status, and
allow them to resolve to RDF or RDF-compatible knowledge.)

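To make the rule concrete, here is a minimal sketch in Python. All of
the names (DenotationRegistry, declare, check_resolution) are
hypothetical illustrations, not part of any spec or library. It records
what each URI is declared to denote and flags both errors described
above: a second, conflicting denotation, and a representation coming
back for a URI declared as non-web-accessible.

```python
class DenotationError(Exception):
    """Raised when a URI is made to denote more than one thing."""


class DenotationRegistry:
    """Hypothetical bookkeeping: one URI, one denotation."""

    def __init__(self):
        # Maps uri -> (description, web_accessible)
        self._denotations = {}

    def declare(self, uri, description, web_accessible):
        """Record what a URI denotes; reject a conflicting redeclaration."""
        existing = self._denotations.get(uri)
        if existing is not None and existing != (description, web_accessible):
            raise DenotationError(f"{uri} already denotes {existing[0]!r}")
        self._denotations[uri] = (description, web_accessible)

    def check_resolution(self, uri, got_representation):
        """Return True if observed behaviour matches the declaration."""
        _description, web_accessible = self._denotations[uri]
        # A URI declared as non-web-accessible must not yield a representation.
        return web_accessible or not got_representation
```

With foo://bar/cat declared as an abstract, non-web-accessible thing,
check_resolution("foo://bar/cat", True) reports an inconsistency, and
declaring the same URI again as a document raises DenotationError.
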
> I'm afraid I just don't buy your proposals about URI taxonomies or
> additional mechanisms here.  (By which, I mean that I don't accept them as
> universal proposals:  I have no argument with their use as a convenient
> mechanism by you or any other developers.  But I think it must also be
> allowable to strip any URI down to the minimalist purpose of identifying a
> resource, without any extra baggage or assumption about what is identified
> based on the form of URI.)

I agree that the taxonomies and specific URI schemes need not be mandatory,
and I stated so in my earlier posting. It is OK if the URI is fully opaque
itself with regard to the nature of the resource, whether it is or is not
on the web -- but if a URI is dereferencable via HTTP, then it should be
known by the HTTP server in question, in some manner, whether the resource
is or is not web-accessible, and if not accessible, there should be a clear,
unambiguous response to the client that it attempted to GET a non-accessible
resource -- ideally returning metadata about the resource, for the benefit
of the client. And likewise, there should be official HTTP methods to GET
and PUT metadata about a resource "hosted by" a given server (non-web
accessible resources don't really "reside" on a server) as part of the
fundamental web architecture.

Hence my proposed set of 6xx response codes and one or more new methods
specific to metadata. I actually like Andy Seaborne's GET-META and PUT-META
which could be extended to include also POST-META, to provide for
retrieval, addition, and replacement of resource-specific metadata
respectively. This fits nicely IMO with the REST architecture and provides
for interaction not only with resources but with their metadata in a
comparable manner.
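
As a rough sketch of how a server might behave under this proposal,
consider the toy dispatcher below. It is not real HTTP and not a
definitive design: the status value 600 is an assumption (only a 6xx
range is proposed here, with no numbers assigned), MetaServer and its
handle method are hypothetical names, and the example URIs are made up.
PUT-META is taken to add and POST-META to replace, following the order
given above; anyone thinking in terms of ordinary PUT/POST conventions
may prefer to swap them.

```python
# Hypothetical code from the proposed 6xx range; no specific number is assigned.
NOT_WEB_ACCESSIBLE = 600


class MetaServer:
    """Toy in-memory server that knows representations and metadata."""

    def __init__(self):
        self.representations = {}  # uri -> representation
        self.metadata = {}         # uri -> dict of metadata

    def handle(self, method, uri, body=None):
        """Dispatch one request and return a (status, payload) pair."""
        if method == "GET":
            if uri in self.representations:
                return 200, self.representations[uri]
            if uri in self.metadata:
                # Known resource with no representation: say so clearly
                # and hand back its metadata for the client's benefit.
                return NOT_WEB_ACCESSIBLE, dict(self.metadata[uri])
            return 404, None
        if method == "GET-META":
            if uri in self.metadata:
                return 200, dict(self.metadata[uri])
            return 404, None
        if method == "PUT-META":
            # Assumed semantics: add to whatever metadata is already held.
            self.metadata.setdefault(uri, {}).update(body or {})
            return 200, None
        if method == "POST-META":
            # Assumed semantics: replace the metadata wholesale.
            self.metadata[uri] = dict(body or {})
            return 200, None
        return 405, None
```

A plain GET on a URI that the server only holds metadata for then yields
the 6xx-style status plus that metadata, rather than a 404 or a
misleading 200.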

If such functionality existed as part of the core web architecture, then
I would be far more motivated to consider the use of http: URLs for
non-web resources as acceptable. Given the present definition of HTTP,
however, I consider such usage to be misleading and very bad practice.

Cheers,

Patrick


--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Thursday, 30 May 2002 02:25:55 UTC