LC Comments, 2.5 from Bijan Parsia on 2004-03-05 (public-webarch-comments@w3.org from March 2004)

From: Bijan Parsia <bparsia@isr.umd.edu>
Date: Fri, 5 Mar 2004 23:49:42 +0100
To: public-webarch-comments@w3.org
Message-Id: <64062668-6EF7-11D8-BF98-0003939E0B44@isr.umd.edu>

"""It is tempting to guess the nature of a resource by inspection of a 
URI that identifies it. However, the Web is designed so that agents 
communicate resource state through representations, not identifiers. In 
general, one cannot determine the Internet Media Type of 
representations of a resource by inspecting a URI for that resource. 
For example, the ".html" at the end of "http://example.com/page.html" 
provides no guarantee that representations of the identified resource 
will be served with the Internet Media Type "text/html". The HTTP 
protocol does not constrain the Internet Media Type based on the path 
component of the URI; the server is free to return a representation in 
PNG or any other data format for that URI."""

The first sentence talks about inferring the *nature* of a *resource* by 
URI inspection (i.e., inferring that <http://ex.org/#BijanThePerson> 
rdf:type Person. from the URI alone). But the third sentence through 
the rest of the paragraph talks about inferring the media type of the 
*representation* of the (state of the) resource. If you mean to 
discourage both practices, some serious reworking is in order.
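
To make the second of those two readings concrete, here is a minimal 
Python sketch of the gap between a media type guessed from a URI suffix 
and the one a server actually declares. The suffix table, the function 
names, and the "image/png" response header are all assumptions made up 
for illustration; nothing here comes from the draft itself.

```python
from email.message import Message
from urllib.parse import urlparse
import posixpath

# Hypothetical suffix table: exactly the kind of guess the quoted
# paragraph warns against. Not an authoritative mapping.
SUFFIX_GUESSES = {".html": "text/html", ".png": "image/png"}


def guess_from_uri(uri):
    """Guess a media type from the URI path suffix, or None if no guess."""
    path = urlparse(uri).path
    _, ext = posixpath.splitext(path)
    return SUFFIX_GUESSES.get(ext.lower())


def type_from_headers(content_type_header):
    """Return the media type declared in a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


uri = "http://example.com/page.html"
served_header = "image/png"  # assumed response header, for illustration

guessed = guess_from_uri(uri)
declared = type_from_headers(served_header)

if guessed != declared:
    print(f"URI suggests {guessed}, but the server declared {declared}")
```

Note that this only touches the media type of a representation. It says 
nothing about the nature of the resource: a URI such as 
<http://ex.org/#BijanThePerson> has no suffix to guess from at all, and 
no header states that it identifies a Person.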

"""Resource state may evolve over time. Requiring resource owners to 
change URIs to reflect resource state would lead to a significant 
number of broken links. For robustness, Web architecture promotes 
independence between an identifier and the identified resource."""

I just wonder how this is different from:

"""Resources may come and go over time. Requiring resource owners to 
abandon URIs to reflect resource non-existence would lead to a 
significant number of broken links. For robustness, Web architecture 
promotes independence between an identifier and the identified 
resource."""

Of course, you might say that abandoning URIs isn't what's required, 
but rather maintaining legacy state. But then you've either changed the 
resource (to something "representing" the nonexistent resource), or 
you return representations reflecting the state of a nonexistent 
resource, of which there isn't any.

(Note that I'm not talking about imaginary entities, but ones that have 
ceased to exist.)

The logic of avoiding broken links suggests that temporal URL ambiguity 
might be useful for Web robustness (which might not be the same as 
correctness).


"""Good practice: URI opacity

Agents making use of URIs MUST NOT attempt to infer properties of the 
referenced resource except as licensed by relevant specifications."""

This says nothing about not inferring properties of the retrieved 
representations.

Cheers,
Bijan Parsia.

Received on Friday, 5 March 2004 17:49:43 UTC