Re: Question about the On Linking Alternative Representations TAG Finding from Richard Cyganiak on 2008-08-07 (www-tag@w3.org from August 2008)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Thu, 7 Aug 2008 19:29:53 +0100
To: "Sebastien Lambla" <seb@serialseb.com>
Cc: "T.V Raman" <raman@google.com>, <john.kemp@nokia.com>, <www-tag@w3.org>, <kidehen@openlinksw.com>, <tthibodeau@openlinksw.com>
Message-Id: <792C2050-F5CE-4C32-9BA0-91F2D3D557A7@cyganiak.de>
Sebastien,

I'll try to explain below. Short summary: 303 redirects are about  
creating URIs for “things described inside documents”. Content  
negotiation is about having the same document in different formats.  
The 303 approach used in the Linked Data community combines the 303  
redirects and content negotiation in a somewhat sloppy but
mostly-harmless way.

On 7 Aug 2008, at 16:04, Sebastien Lambla wrote:
> By content negotiation I took it to cover both 2616 negotiation and  
> the process by which httpRange-14 hints at redirecting to an IR from  
> a resource that may or may not be an IR.
>
> Maybe I misunderstood the common practice of 303ing from resources  
> that are not documents as being supported by conneg, but maybe my  
> understanding has been flawed by listening to too many people's  
> opinion on httpRange-14 (as it seems everyone has one these days).
>
> Any enlightenment would be greatly appreciated.

The practice of 303-redirecting, as used by parts of the RDF  
community, is motivated by the desire to assign URIs not just (like on  
the WWW) to documents, but also to the things that are *described in  
the documents*, such as people, cities, products and so on.

There was a lot of fighting about the best way to assign those URIs in  
a way that doesn't jeopardize the existing Web. In the end, some  
influential people insisted on a rule that has become the axiom now  
known as the “httpRange-14 decision”:

    If a resource has a representation, then it is a document. If it
    doesn't have a representation, it could be anything.

From this axiom comes the requirement to have URIs that do not  
resolve to representations. In traditional, WWW-style Web  
architecture, that would be a weird idea; it's all about exchanging  
representations. But RDF people want it.

So, we proposed two approaches to fulfill this requirement: the  
practice of using hash URIs (if </foaf.rdf> talks about a person, then  
that person could have the URI </foaf.rdf#me>); and 303 redirects (if  
</about/Berlin> is a document that talks about a city, then that city  
could have the URI </resource/Berlin> and 303-redirect to the document).

Let's ignore the hash URIs for a moment. The key to the 303 approach  
is this: It allows us to have resources that do not have a  
representation, but still continue the HTTP conversation to get to a  
document that describes the resource. So the idea is:

     some_resource
        |
        +--303--> description_of_some_resource

Now, <description_of_some_resource> could be just an RDF document; as  
you see, in its basic form, the 303 approach doesn't involve any  
content negotiation or different formats.
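
A minimal sketch of this basic 303 flow, using the Berlin example from
above. The DESCRIPTIONS table, the respond() function and the
placeholder body are assumptions made for illustration; they are not
taken from any particular server.

```python
# Hypothetical mapping from "thing" URIs to the documents describing them.
DESCRIPTIONS = {
    "/resource/Berlin": "/about/Berlin",
}


def respond(path):
    """Return (status, headers, body) for a GET request on path."""
    if path in DESCRIPTIONS:
        # The thing itself has no representation, so no 200 here:
        # 303-redirect to the document that describes it.
        return 303, {"Location": DESCRIPTIONS[path]}, b""
    if path in DESCRIPTIONS.values():
        # The description is an ordinary Web document, served with 200.
        # The body is a placeholder, not real RDF.
        return 200, {"Content-Type": "application/rdf+xml"}, b"(RDF here)"
    return 404, {}, b""
```

Note that no Accept header is consulted anywhere: in this basic form
there is one description in one format.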

But then, <description_of_some_resource> is a perfectly normal Web  
document, and as such it can be available in different formats,  
languages, and so on. A fairly common scenario is to have an HTML  
variant for Web browsers and an RDF variant for data browsers. If we  
follow the advice from Raman's TAG Finding, we would simply make  
<description_of_some_resource> a generic resource; and the two  
variants might be called <description_of_some_resource.rdf> and  
<description_of_some_resource.html>. If <description_of_some_resource>  
is requested, the appropriate variant is 200-returned to the client:

     some_resource
        |
        +--303--> description_of_some_resource
                     |
                     +--Content-Location--> description_of_some_resource.{html|rdf}

That's the clean and proper way of combining the 303 approach with  
content negotiation!
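
The second hop of that diagram could be sketched roughly as follows:
the generic resource answers with 200 and names the chosen variant in
Content-Location, with no further redirect. The function names, the
text/html default for */* and the simplified Accept parsing (no
type/* wildcards, no media type parameters other than q) are all
assumptions of this sketch.

```python
# Variants of the generic resource, keyed by media type.
VARIANTS = {
    "text/html": "/description_of_some_resource.html",
    "application/rdf+xml": "/description_of_some_resource.rdf",
}


def parse_accept(header):
    """Return [(media_type, q)] ordered by descending q (simplified)."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        prefs.append((parts[0], q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(prefs, key=lambda p: -p[1])


def negotiate(accept):
    """Pick the media type to serve, or None if nothing is acceptable."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type in VARIANTS:
            return media_type
        if media_type == "*/*":
            return "text/html"  # assumed default variant
    return None


def get_description(accept="*/*"):
    """Answer a GET on the generic resource: (status, headers)."""
    media_type = negotiate(accept)
    if media_type is None:
        return 406, {}
    return 200, {
        "Content-Type": media_type,
        "Content-Location": VARIANTS[media_type],
        "Vary": "Accept",
    }
```

For example, get_description("application/rdf+xml") yields a 200 whose
Content-Location points at the .rdf variant, while an Accept header of
text/plain yields 406 in this sketch.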

Now, for a bunch of mostly historical reasons, people often omit the  
<description_of_some_resource> resource, and rather 303-redirect  
directly from the <some_resource> URI to  
<description_of_some_resource.rdf> or  
<description_of_some_resource.html>. So, they do not set up a generic  
resource, but rather they create two different descriptions for the  
different kinds of browsers.

     some_resource
        |
        +--303--> description_of_some_resource.{html|rdf}

In the context of the 303 approach, this is mostly harmless, because  
there is a redirect involved anyway, so this solution doesn't cause  
additional redirects and even looks a bit simpler than the previous  
one. It is now widely deployed by Linked Data publishers. The  
practical consequences of this simplification are small, I think, and  
therefore it's not really worth insisting on the “proper way” above.  
(Or maybe it is?)
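
For comparison, the shortcut might look like the sketch below, where
the Accept header is consulted at the 303 step itself. The function
name and the crude substring test (which ignores q values) are
assumptions chosen for brevity, not a recommendation.

```python
def redirect_for(accept):
    """303 straight from <some_resource> to a format-specific description."""
    # Crude preference check: an assumption of this sketch, not full
    # HTTP content negotiation.
    wants_rdf = "application/rdf+xml" in accept
    suffix = ".rdf" if wants_rdf else ".html"
    return 303, {
        "Location": "/description_of_some_resource" + suffix,
        "Vary": "Accept",
    }
```

There is still exactly one redirect per lookup, the same as in the
two-step arrangement; what disappears is the generic
<description_of_some_resource> URI.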

Unfortunate side effect: Many people got introduced to content  
negotiation through this 303 practice. Now they assume that content  
negotiation is always done with a redirect. Wrong! That would be bad  
practice outside of the specific context of 303 redirects.

Finally, let me re-iterate that I prefer hash URIs over 303 URIs. They  
are easier to understand (“name something that is described inside the  
document by appending #something”), they are easier to implement, they  
do not require a redirect, they do not cause endless discussions about  
angels on pinheads etc etc...
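
A small illustration of why hash URIs need no redirect: the client
strips the fragment before making the request, so the server only ever
sees the document URI. The example.org host is an assumption; the
path and fragment follow the </foaf.rdf#me> example above.

```python
from urllib.parse import urldefrag

# URI of the person (the thing), not of the document.
thing_uri = "http://example.org/foaf.rdf#me"

# The part before "#" is what is actually requested over HTTP; the
# fragment never reaches the server.
document_uri, fragment = urldefrag(thing_uri)
```

Here document_uri is the plain document URI, which can be served with
an ordinary 200, and fragment is "me".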

Best,
Richard




>
>
> Sebastien
>
>
>
>
> --------------------------------------------------
> From: "Richard Cyganiak" <richard@cyganiak.de>
> Sent: Thursday, August 07, 2008 3:34 PM
> To: "Sebastien Lambla" <seb@serialseb.com>
> Cc: "T.V Raman" <raman@google.com>; <john.kemp@nokia.com>; <www-tag@w3.org>;
> <kidehen@openlinksw.com>; <tthibodeau@openlinksw.com>
> Subject: Re: Question about the On Linking Alternative  
> Representations TAG Finding
>
>>
>> Sebastien,
>>
>> Just a side note.
>>
>> Content negotiation as defined in the HTTP spec does *not* involve  
>> redirects.
>>
>> Content negotiation works by serving the appropriate variant
>> directly at the request URI, along with an optional
>> Content-Location header that gives a URI for the specific selected variant.
>>
>> There is a recent bad meme floating around, about implementing  
>> content negotiation by redirecting from one URI to another. This is  
>> not a good way of implementing content negotiation. In web  
>> applications, response time is key. Therefore, redirects should be  
>> avoided if possible.  Actual content negotiation implementations,  
>> such as mod_negotiation in  Apache, use the redirect-less approach  
>> described in the HTTP spec.
>>
>> (I think the meme is an unfortunate result of people getting  
>> confused  by the 303 redirects in the httpRange-14 debate. Again,  
>> content  negotiation does *not* require redirects and should be  
>> done without  when possible. The 303 approach uses redirects for a  
>> different reason.)
>>
>> Richard
>>
>>
>>
>> On 7 Aug 2008, at 14:39, Sebastien Lambla wrote:
>>
>>> So to get in context, if a generic resource redirects to a  
>>> variation with its own URL that will return a 200, hence an IR,  
>>> one argues  that conneg should still be possible from the IR to  
>>> another IR  related to the original generic resource?
>>>
>>> I argue that one should only allow conneg within the new scope   
>>> allowed by the IR, so that Conneg on /genericresource with accept:  
>>> application/html+xml will redirect to /genericresource.html, but  
>>> if requested as text/plain should fail. /genericresource.html  
>>> should however be able to conneg on other variables such as the  
>>> language  and may redirect to /genericresource(en).html
>>>
>>> In that definition there is no real generic resource, there is a   
>>> chain of resources that are more and more specialized, reducing  
>>> at  each step the range you can conneg against.
>>>
>>> Seb
>>>
>>>
>>>
>>> --------------------------------------------------
>>> From: "T.V Raman" <raman@google.com>
>>> Sent: Thursday, August 07, 2008 2:20 PM
>>> To: <john.kemp@nokia.com>
>>> Cc: <raman@google.com>; <richard@cyganiak.de>;
>>> <seb@serialseb.com>; <www-tag@w3.org>;
>>> <kidehen@openlinksw.com>; <tthibodeau@openlinksw.com>
>>> Subject: Re: Question about the On Linking Alternative   
>>> Representations TAG Finding
>>>
>>>> Correct, that is why I carefully separated out user-agents that
>>>> send accept=*/* from other types of agents. When a user-agent
>>>> sends out an explicit list of mime-types that it will accept for
>>>> content negotiation I think the client and server should do full
>>>> content negotiation as was originally intended by HTTP's content
>>>> negotiation scheme.
>>>>
>>>> John Kemp (Nokia-S&S/Williamstown) writes:
>>>> > ext T.V Raman wrote:
>>>> >
>>>> > [...]
>>>> >
>>>> > > Returning to your final question, where the user-agent does
>>>> > > content-negotiation, indicates a preference for one type, but
>>>> > > asks by URI for the other, I would say respect the URI. I dont
>>>> > > claim this to be *correct* in any sense, other than that I would
>>>> > > break the tie this way. Reasoning: The client, by asking for a
>>>> > > URI that directly resolves to a given representation has
>>>> > > essentially bypassed content-negotiation.
>>>> >
>>>> > I think your interpretation is OK. But other servers may wish to respect
>>>> > the HTTP Accept header sent in the request, rather than (or in addition
>>>> > to) parsing the URI. This is server-driven negotiation, and the server
>>>> > is attempting to meet the needs of its client. If the server feels
>>>> > unable to adequately determine what the client wants, it may return an
>>>> > HTTP 303 or 406 status code and allow the client to make a choice itself.
>>>> >
>>>> > All of that is in the HTTP 1.1 specification. Anything other than HTTP
>>>> > would presumably define a similar mechanism.
>>>> >
>>>> > I believe it makes sense to recommend that HTTP 1.1 content negotiation
>>>> > via the HTTP Accept: header is the preferred mechanism for "breaking the
>>>> > tie". If the user-agent can set the Accept header value to something
>>>> > more specific than */* then it is already likely capable of setting the
>>>> > _correct_ value for this header to get the content type it is asking for.
>>>> >
>>>> > Regards,
>>>> >
>>>> > - johnk
>>>>
>>>> -- 
>>>> Best Regards,
>>>> --raman
>>>>
>>>> Title:  Research Scientist
>>>> Email:  raman@google.com
>>>> WWW:    http://emacspeak.sf.net/raman/
>>>> Google: tv+raman
>>>> GTalk:  raman@google.com, tv.raman.tv@gmail.com
>>>> PGP:    http://emacspeak.sf.net/raman/raman-almaden.asc
>>>>
>>
Received on Thursday, 7 August 2008 18:30:36 UTC