Content negotiation flamewar (was: Re: "Hash URIs" and content negotiation) from Richard Cyganiak on 2006-11-13 (semantic-web@w3.org from November 2006)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Mon, 13 Nov 2006 11:33:26 +0100
To: Alan Ruttenberg <alanruttenberg@gmail.com>
Cc: Karl Dubost <karl@w3.org>, Semantic Web <semantic-web@w3.org>
Message-Id: <F89884CE-DA3A-47EB-B593-27975940A593@cyganiak.de>

On 13 Nov 2006, at 06:03, Alan Ruttenberg wrote:
> Why can't our agents retrieve RDF (and RDF only) when they need  
> discovery information like this.

Because the Web is designed to be data format agnostic. The only  
reason why we can contemplate evolving the HTML-based WWW into an RDF- 
based Semantic Web today is because the Web was designed to be usable  
with any kind of data format in the first place.

> Here is another proposal. Conneg is vastly simplified: For uri U,  
> the only option is to retrieve RDF instead of the resource. That  
> rdf would itself be given a name(URI) distinct from U. That rdf  
> returned neither describes U, nor what U is about (so if U names an  
> rdf document, the conneged RDF is not the same as the RDF resource  
> U names). Rather it is RDF that describes the set of documents that  
> the server considers relevant to serve in place of U. These can  
> either be rdf:about the same subject (such as translations in  
> different human languages), or same as the document, where sameas  
> means there is a lossless, unambiguous, machine implementable  
> translation between the two documents (same size gif->png ok
> gif->jpeg quality 8 not ok). The RDF would use some standard set of
> relations to describe the relationship of the proposed documents to
> U, and various properties of these documents (like their file formats,
> languages, etc).  Based on this information, the agent can decide  
> whether the original document, or some other, is relevant to  
> retrieve for the task at hand. In a browser, I would expect the  
> browser to show me a different URI in the address bar if an  
> alternate document was chosen.

Let's take this a bit further.

First, the proposal requires two HTTP requests to retrieve a  
representation. I think we can do better. The client could indicate  
in its request that it prefers a representation with certain  
features, and the server would send along the best match if available.  
This should be optional of course. The server should be allowed to  
leave the choice entirely to the client, and the client should be  
able to indicate that it only wants the format inventory and no  
matching representation.

Second, in the vast majority of cases, a client will be interested  
only in *one* representation, and the other URIs and details of the  
other representations in the response will just be noise. So, to keep  
bandwidth down and to keep client-side processing simple, the server  
should tailor the answer so it just contains

- the URI where to find the best-matching representation, and
- indications about how the other available representations differ  
from that one.

The client could then vary its request to ask for the location of  
other representations.

Third, the proposal requires all Web clients to support RDF. That's a  
heavy burden, because at the moment HTTP is such a simple protocol,  
and RDF parsers are fairly complex. So let's encode the format  
inventory in a simpler way. How about HTTP headers? They are dead  
easy and have been around since forever.

Oh, but all of this is already available in HTTP/1.1, and implemented  
in today's HTTP clients.
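
To make the mapping concrete, here is a minimal sketch of server-driven negotiation as I read it from HTTP/1.1: the client states its preferences in an Accept header, and the server answers with the best match, names its location in Content-Location, and flags with Vary that other representations exist. The function names, the variant table and the example URIs are assumptions made up for illustration, and the matching is simplified (media type parameters other than q are ignored, and the variant table is assumed to be non-empty).

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # HTTP/1.1 default when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((parts[0].lower(), q))
    return prefs


def quality(media_type, prefs):
    """Return the q-value of the most specific range matching media_type."""
    main = media_type.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in prefs:
        if media_range == media_type:
            specificity = 2
        elif media_range == main + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(accept, variants):
    """Pick a variant for the given Accept header.

    variants is a hypothetical mapping of media type -> URI.
    Returns (status, response_headers).
    """
    prefs = parse_accept(accept) if accept else [("*/*", 1.0)]
    scored = [(quality(media_type, prefs), media_type) for media_type in variants]
    q, best = max(scored, key=lambda pair: pair[0])  # ties: first variant wins
    if q <= 0:
        return 406, {"Vary": "Accept"}
    return 200, {
        "Content-Type": best,
        "Content-Location": variants[best],
        "Vary": "Accept",
    }


# Hypothetical variants of one resource
VARIANTS = {"text/html": "/doc.html", "application/rdf+xml": "/doc.rdf"}
```

With this table, negotiate("application/rdf+xml, text/html;q=0.5", VARIANTS) selects /doc.rdf in a single round trip, and a client that accepts neither format gets a 406.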

In summary, the proposal doesn't add new capabilities to the Web,  
requires all HTTP clients to support the complex RDF syntax,  
increases the minimum number of requests to retrieve a representation  
from one to two, increases bandwidth usage, and removes the format- 
neutrality of the Web infrastructure. I'm afraid it won't be popular.

Yours,
Richard

Received on Monday, 13 November 2006 10:33:48 UTC