
Value of content negotiation? [was RE: content negotiation anti-principle]

From: Jeremy Dunck <ralinon@hotmail.com>
Date: Tue, 07 Jan 2003 19:29:59 -0600
To: simonstl@simonstl.com, www-tag@w3.org
Message-ID: <BAY1-F66luLCTaogxLD00005368@hotmail.com>

>From: "Simon St.Laurent" <simonstl@simonstl.com>
<snip>
>I'm not nearly as concerned with the "what do I bookmark after a
>retrieval" question as I am with the "how do I as an author reliably
>reference content" question.  Right now reliability is simply not
>enabled unless you control both the point where the reference is made
>and the thing to which reference is made.  Content-negotiation handling
>could be far more robust in markup vocabularies including those created
>by the W3C, but this has clearly NOT been a priority.
>
>It doesn't seem like it should have to be that way, but the
>(non-)intersections of [HTTP capabilities & URI philosophy] and [common
>markup and browser practices] suggest that perhaps the world is more
>comfortable overall with a single pathway from identifier to
>representation.

I don't think that the lack of common usage is due to lack of value.  It's a 
question of bang for the buck.  Frankly, I think that tools have never been 
built to really leverage the features offered, and the tools that do exist 
still make negotiation too expensive to use.  We don't conclude that, because 
people don't commonly fly to work, there's no interest in flying.  We know 
it'd be great if we could, but the cost of doing so is too high.

I'm not sure that this has much to do with negligence on the W3C's part, 
though.  That is, I am not sure that the W3C is responsible for the lack of 
standards and tool support.  Given finite resources, something's got to 
give, and the large number of unimplemented (or partially implemented) RFCs 
on the topic may indicate that this is what's giving.

I think one issue (which hides demand) is that you can't tell whether a URI 
is negotiated on the server, and that a requestor can't tell what a 
particular URI is meant to represent.

Specifically, does http://www.example.com/aDoc locate the document (that is, 
body of knowledge) itself, or does it locate the HTML representation of it?  
Is there a URL available, http://www.example.com/aDoc.html, which -does- 
locate, specifically, the HTML representation of it?
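To make the ambiguity concrete, here's a minimal sketch (all paths, names, 
and the representation table are hypothetical, and q-values are ignored for 
brevity) of a server where the abstract URL is negotiated while 
extension-bearing URLs name one fixed representation:

```python
# Hypothetical table: concrete representations of one conceptual resource.
REPRESENTATIONS = {
    "/aDoc.html": "text/html",
    "/aDoc.txt": "text/plain",
}

def resolve(path, accept):
    """Return (served_path, content_type) for a requested path."""
    if path in REPRESENTATIONS:
        # Extension-bearing URL: one fixed representation, no negotiation.
        return path, REPRESENTATIONS[path]
    if path == "/aDoc":
        # Abstract URL: walk the client's preferences in listed order
        # (q-values ignored in this sketch).
        accepted = [t.split(";")[0].strip() for t in accept.split(",")]
        for want in accepted:
            for concrete, ctype in REPRESENTATIONS.items():
                if want in (ctype, "*/*"):
                    return concrete, ctype
    return None, None

print(resolve("/aDoc", "text/plain, */*"))  # -> ('/aDoc.txt', 'text/plain')
print(resolve("/aDoc.html", "text/plain"))  # -> ('/aDoc.html', 'text/html')
```

Note that the second call returns HTML even though the client asked for 
text/plain -- that's the point of an un-negotiated URL.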

Even if it were readily known that for a particular document, there exists a 
URL for a particular representation of it, would it be generally appropriate 
for a linker to link to the un-negotiated representation?  I don't think it 
would, but there are certainly situations that would warrant it.  For 
example, "Your page, http://www.example.com/aDoc/1-1-2003.en.html returns 
500, but http://www.example.com/aDoc/1-1-2003 seems to work OK, given this 
HTTP request:....."

(That's a contrived example, but hopefully you see my point.)

We'd probably like to hide the negotiation plumbing from Joe User anyway.

I'm not proposing that there should be standardized URLs (breaking opacity), 
but I -do- think there's value in making it visible that a server has 
negotiated for you, and also in making client-side negotiation a real-world 
choice.
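For what it's worth, HTTP/1.1 already has some machinery for making 
negotiation visible: Content-Location can name the concrete variant chosen, 
and Vary can name the request headers the choice depended on.  A sketch of a 
helper (hypothetical, not any real server's API) that emits them:

```python
def negotiated_headers(chosen_path, varied_on=("Accept",)):
    """Response headers that make a negotiated response self-describing."""
    return {
        # The concrete, un-negotiated URL a linker could reference directly.
        "Content-Location": chosen_path,
        # The request headers the choice depended on, so caches (and
        # curious clients) know negotiation happened.
        "Vary": ", ".join(varied_on),
    }

headers = negotiated_headers("/aDoc/1-1-2003.en.html")
print(headers["Content-Location"])  # -> /aDoc/1-1-2003.en.html
print(headers["Vary"])              # -> Accept
```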

The shades of grey are subtler (and, IMO, more important) when you start 
talking about XML documents.  MIME types are really insufficient to 
describe multiple-namespaced documents.  At one point, MIME offered pretty 
fine-grained control over what a UA could handle, but the grain is getting 
coarser now.

Sure, you could start creating all kinds of MIME types to emulate the 
underlying requirements of the representation (as all application/*+xml 
MIMEs do, I think), but MIME registration is (rightfully) high-ceremony, and 
I think this sort of registration pollutes MIME's usefulness as 
"an open-ended framework ...[that] can accommodate additional object types, 
character sets, and access methods without any changes to the basic 
protocol." [MIME Type Registration]
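To illustrate the granularity problem: a single compound document served as, 
say, application/xhtml+xml can mix several vocabularies that the MIME type 
alone never reveals.  A sketch using Python's stdlib XML parser (the 
document contents are made up):

```python
import xml.etree.ElementTree as ET

# A made-up compound document: XHTML carrying SVG and MathML islands.
DOC = """<page xmlns="http://www.w3.org/1999/xhtml"
              xmlns:svg="http://www.w3.org/2000/svg"
              xmlns:m="http://www.w3.org/1998/Math/MathML">
  <svg:svg/><m:math/>
</page>"""

def namespaces_used(xml_text):
    """Collect every namespace URI that appears on an element."""
    return sorted({el.tag.split("}")[0].lstrip("{")
                   for el in ET.fromstring(xml_text).iter()
                   if el.tag.startswith("{")})

for ns in namespaces_used(DOC):
    print(ns)
```

One Accept'able media type, three vocabularies -- and no way for a server to 
negotiate on the client's support for each.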

Given Accept: text/html, the server can't know for sure whether the browser 
supports CSS, JS, the DOM, etc.  All kinds of hacks (such as User-Agent 
string sniffing) attempt to address this, but generally do it poorly.
For negotiation to be useful, fine-grained control over acceptable 
alternates is needed, and IMO the headers in common use today aren't fine 
enough.
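For comparison, here is roughly everything the Accept header does convey: an 
ordered preference list with q-values.  A sketch of the parsing (simplified; 
it ignores media-type parameters other than q):

```python
def parse_accept(header):
    """Return media types from an Accept header, best preference first."""
    prefs = []
    for part in header.split(","):
        pieces = part.strip().split(";")
        mtype, q = pieces[0].strip(), 1.0  # q defaults to 1.0
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        prefs.append((q, mtype))
    # Stable sort: equal q-values keep their listed order.
    prefs.sort(key=lambda p: -p[0])
    return [mtype for _, mtype in prefs]

print(parse_accept("text/html;q=0.9, application/xhtml+xml, */*;q=0.1"))
# -> ['application/xhtml+xml', 'text/html', '*/*']
```

A ranked list of media types is all the server gets -- nothing about 
scripting, styling, or DOM support.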

It might be nice for negotiation to return a (perhaps poorly performing) 
DOM1 script if the request doesn't indicate DOM2 support.  Taking it to an 
extreme, given our imperfect implementations, it might be nice if 
negotiation knew whether the getElementsByTagName method was supported.

Of course, at some point, the whole endeavor just isn't worth the headache.

I feel that for a particular conceptual resource, there may well be multiple 
representations, but I also feel that each of those representations might 
deserve its own URL.

I have another bone to pick that I feel is directly related-- that URLs in 
common web servers are not (or are not completely) abstracted from file 
paths.  That is, in IIS, directories are the finest granularity of URL 
creation-- once a directory is exposed as a URL hierarchy, the files in 
that dir (and only those files) are valid URLs.  Apache provides slightly 
better abstraction, but is still tightly bound to the file system.  CGI 
provides nearly complete abstraction, but other warts in that approach have 
(IMO) limited its acceptance.

Given that URLs are closely mapped to files in a directory (and can't be 
conjured up on request), negotiation plus a URL for each possible 
representation quickly becomes unworkable.
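The alternative is to dispatch URLs to handlers rather than to files, so a 
URL can be conjured on request.  A sketch of such a routing table (the 
routes and handlers are hypothetical):

```python
# Hypothetical routing table: URLs map to handlers, not to files on disk.
ROUTES = {}

def route(pattern):
    """Register a handler for a URL path -- no directory need exist."""
    def register(handler):
        ROUTES[pattern] = handler
        return handler
    return register

@route("/aDoc")
def a_doc(accept):
    # Negotiation lives in code; each representation could just as
    # easily be registered under its own URL.
    return "<html>...</html>" if "text/html" in accept else "plain text"

def dispatch(path, accept="*/*"):
    handler = ROUTES.get(path)
    return handler(accept) if handler else "404 Not Found"

print(dispatch("/aDoc", "text/html"))  # -> <html>...</html>
print(dispatch("/nowhere"))            # -> 404 Not Found
```

With URLs decoupled from the file system this way, minting one URL per 
representation is a one-line registration, not a directory layout problem.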

But it doesn't have to be that way.  :)

It'd be nice if we could get some reading on the market's opinion of the 
usefulness of content negotiation.  I'd like to change the tools available, 
but I don't know if my time would be well spent in doing so.

I've seen some examples of hack-ish content negotiation (for example, using 
a script to alter text/css content based on UA string), so I know there 
would be -some- demand for it.

Comments?

  Thanks,
    Jeremy Dunck

[MIME Type Registration]
http://www.ietf.org/rfc/rfc2048


Received on Tuesday, 7 January 2003 20:30:36 GMT
