[metadataInURI-31] Review of 2006-10-16 draft

Noah Mendelsohn (noah_mendelsohn@us.ibm.com) wrote to www-tag:

> I am pleased to announce the availability of a new draft of the 
> finding: "The use of Metadata in URIs" [...]

I am pleased to read the draft. Thank you, Noah.

> Clearly review of the recent changes is in order before we 
> publish, but there is a good chance that comments on other aspects of
> the finding will be queued for consideration should we later wish to
> republish.

I chide myself for arriving late to this party, apologize to the TAG in 
general and to Noah in particular, and offer a review in spite of 
circumstances.

> In short, I think it's about time to ship this.

I don’t agree. I raise the following issues as indicators of the need 
for and scope of revision.

• “The authority who creates a URI is responsible for assuring that it 
is associated with the intended resource, and that operations targeted 
to the URI manipulate or return the appropriate data.”
→ What is an authority? It cannot be an “authority” construct of RFC
3986, which construct is a character string, but the risk of confusion
looms. I could easily create a URI of the form 
“http://lists.w3.org/<nonsense>”. Would the act of creation make me an
authority? Would the act of creation obligate me to configure a mapping
from the URI to resource representations on the HTTP server at port 80
of lists.w3.org? Consider that I could have shown the form of the
hypothetical URI as a URI, replacing “<nonsense>” with “nonsense”. Such 
a URI would prompt W3C’s archival system to publish, using the nonsense 
URI, a hyperlink from the HTTP server at port 80 of lists.w3.org. In 
that case, which authority or authorities created the URI? Do the 
authorities truly incur an obligation to manage the URI? How does the 
widespread use and publication of example URIs affect the emerging
consensus? Some of the example URIs have an “authority” construct (per 
RFC 3986) which is a reserved DNS name under RFC 2606 (BCP 32). Some of 
the example URIs have an “authority” construct (per RFC 3986) which is 
syntactically a DNS name but which does not identify any DNS domain. 
Some of the example URIs have an “authority” construct (per RFC 3986)
which, inadvertently, is the DNS name of a domain which is operational 
in the Internet. So far I have been thinking of “http” URIs. Now 
consider URIs in the “isbn” URN namespace. What authority creates those 
URIs? Do such authorities incur the obligation to handle “operations 
targeted to the URI” in order to “return the appropriate data”?
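
To make the distinction above concrete, here is a minimal sketch in Python. It assumes that the standard library's urllib.parse is an adequate stand-in for an RFC 3986 parser, and the helper names (authority_of, is_reserved_name) and the ISBN value are hypothetical, chosen only for illustration. It shows that the “authority” construct is merely a substring of the URI, that a URN in the “isbn” namespace has no such substring at all, and that only some example URIs use a name reserved by RFC 2606.

```python
from urllib.parse import urlsplit

# Names reserved by RFC 2606 (BCP 32): four top-level domains
# and three second-level domains.
RESERVED_TLDS = {"test", "example", "invalid", "localhost"}
RESERVED_SLDS = {"example.com", "example.net", "example.org"}


def authority_of(uri: str) -> str:
    """Return the "authority" component as a plain string ('' if absent).

    Assumption: urlsplit().netloc is treated here as the RFC 3986
    authority construct. It is a character string and nothing more.
    """
    return urlsplit(uri).netloc


def is_reserved_name(host: str) -> bool:
    """Report whether a DNS name falls under a name reserved by RFC 2606."""
    labels = host.lower().rstrip(".").split(".")
    return labels[-1] in RESERVED_TLDS or ".".join(labels[-2:]) in RESERVED_SLDS


if __name__ == "__main__":
    # The ISBN below is a made-up placeholder, not a reference to a real book.
    for uri in ("http://lists.w3.org/nonsense",
                "http://www.example.org/nonsense",
                "urn:isbn:0451450523"):
        host = authority_of(uri)
        print(uri, repr(host), is_reserved_name(host) if host else None)
```

Nothing in the output says who created any of these URIs or who, if anyone, is obligated to serve them; that is the gap in the draft's use of “authority”.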

• “the MIME media type that is likely to be returned by an HTTP GET”
→ Internet media types are not, in practice or per specification, bound
to MIME. RFC 4288 clarifies this matter. Use “Internet media type”
instead of “MIME media type”. As for phrasing and intent in general, try 
“the Internet media type likely to appear in response to an HTTP ‘GET’ 
request” or “the Internet media type of a representation of the resource”.

• “Constraint: Web software MUST NOT depend on the correctness of 
metadata inferred from a URI, except when the encoding of such metadata 
is documented by applicable standards and specifications.”
“Such standards and specifications include pertinent Web and Internet 
RFCs and Recommendations such as [URI], as well as documentation 
provided by the URI assignment authority.”
→ I caution against blessing the documentation that URI‐assignment 
authorities provide. Consider the hypothetical Slob Net, a 
URI‐assignment authority. Suppose that Slob Net claims, “Slob Net URLs 
ending in .xml return XML”. Suppose that a programmer outside of Slob 
Net takes Slob Net’s documentation at its word. Suppose that the naïve 
programmer sends an HTTP “GET” request which has Request-URI 
“/SlobNet.xml” and a “Host” header field which identifies a host under 
Slob Net’s control. Suppose that the response to the HTTP “GET” request 
has Status-Code “500”, a header field “Content-Type: text/html”, and 
entity-body “<html><title>Error</title><body>Error <img 
src="/frownyface.gif"></body></html>”. The naïve programmer’s software 
will ignore the declaration of media type, presume the use of XML, and 
stop at what appears to be a syntactical error. Suppose that, after 
encountering, in a Slob Net publication, a hyperlink whose target URI 
has the query “query=configuration.xml” and path “search”, the naïve 
programmer’s software sends an HTTP “GET” request which has Request-URI 
“/search?query=configuration.xml” and a “Host” header field which 
identifies a host under Slob Net’s control. Suppose that the response to 
the HTTP “GET” request has Status-Code “200”, a header field 
“Content-Type: text/html”, and entity-body 
“<html><title>Results</title><body><ol><li><a 
href="/miscellany/configuration.xml">configuration.xml</a></ol></body></html>”. 
The naïve programmer’s software will, again, ignore the declaration of 
media type, presume the use of XML, and stop at what appears to be a 
syntactical error.
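
The two Slob Net exchanges above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the set of XML media types is deliberately small, and email.message.Message from the standard library is used merely as a convenient way to read the media type out of a “Content-Type” field value. The sketch contrasts the naïve programmer's inference from the URI with a check of the status code and the declared media type.

```python
from email.message import Message

# Assumption: a small, non-exhaustive set of XML media types for illustration.
XML_TYPES = {"application/xml", "text/xml"}


def naive_guess_from_uri(request_uri: str) -> bool:
    """The naive rule: trust documentation that '.xml' URIs return XML."""
    return request_uri.endswith(".xml")


def declared_media_type(content_type: str) -> str:
    """Extract the lowercased media type from a Content-Type field value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_type()


def should_parse_as_xml(status: int, content_type: str) -> bool:
    """Decide from the response itself, ignoring the shape of the URI."""
    if not 200 <= status < 300:
        return False  # an error response is not the representation sought
    media_type = declared_media_type(content_type)
    return media_type in XML_TYPES or media_type.endswith("+xml")


if __name__ == "__main__":
    # (Request-URI, Status-Code, Content-Type) from the two scenarios above.
    exchanges = [
        ("/SlobNet.xml", 500, "text/html"),
        ("/search?query=configuration.xml", 200, "text/html"),
    ]
    for request_uri, status, content_type in exchanges:
        print(request_uri,
              "naive:", naive_guess_from_uri(request_uri),
              "declared:", should_parse_as_xml(status, content_type))
```

In both exchanges the naïve rule answers “XML” while the response declares “text/html”, which is the failure the draft's wording would appear to sanction.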

• I did not have the time to treat other issues.

Received on Thursday, 26 October 2006 00:10:54 UTC