Re: Content-Negotiation in check referer requests from Etienne Miret on 2008-03-28 (www-validator@w3.org from March 2008)

From: Etienne Miret <etienne.miret@ens-lyon.fr>
Date: Fri, 28 Mar 2008 11:00:44 +0100
To: www-validator@w3.org
Message-ID: <47ECC1CC.3010805@ens-lyon.fr>

Frank Ellermann wrote:
 > As author of pages offering such links I expect that all users
 > with a client supporting "referer" at all get precisely the
 > same validation results.
Why do you want them to see the same validation result, given that they 
don’t see the same document?

On my own websites, most pages are available in both HTML and XHTML and 
content-negotiation is used to decide which format to serve. Most of 
them have a check?uri=referer link, the HTML ones with a "Valid HTML" 
icon and the XHTML ones with a "Valid XHTML" icon. I must say that 
clicking a "Valid XHTML" link and getting a "This page is valid HTML 
4.01" validation result is rather... odd. The opposite is just as odd.

So, I’d like to submit two patches related to this issue.

The first one is:
<http://perso.ens-lyon.fr/etienne.miret/2008/03/28/negotiate-referer.diff>
It makes use of the already available "accept", "accept-language" and 
"accept-charset" parameters and populates them with the values provided 
by the client *in case of a referer request*. It also makes sure 
those values are kept across revalidation. This makes the URIs very 
long. Sorry.
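
As a rough illustration of the idea (not the patch itself), the following 
Python sketch copies the client's negotiation headers into query 
parameters of a check URI. The parameter names "uri", "accept", 
"accept-language" and "accept-charset" come from this message; the 
function name, the base URI and the header dictionary are assumptions 
made for the example.

```python
from urllib.parse import urlencode

# Header name -> validator query parameter (parameter names as in the message).
NEGOTIATION_HEADERS = {
    "Accept": "accept",
    "Accept-Language": "accept-language",
    "Accept-Charset": "accept-charset",
}


def build_check_uri(base, referer, client_headers):
    """Build a check URI that carries the client's negotiation headers.

    `base`, `referer` and `client_headers` are hypothetical inputs: the
    validator's check URI, the page being validated, and a dict of the
    headers the browser sent with the referer request.
    """
    params = [("uri", referer)]
    for header, param in NEGOTIATION_HEADERS.items():
        value = client_headers.get(header)
        if value:  # only pass on headers the client actually sent
            params.append((param, value))
    # urlencode percent-escapes the values, which is why the URIs get long.
    return base + "?" + urlencode(params)
```

Reusing the same URI for revalidation is what keeps the values stable 
across requests in this sketch.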

The headers sent by the client are copied verbatim, which means that the 
validator will send Accept and Accept-Charset headers with types and 
charsets it doesn’t support. This is the desired behaviour, since a 
check/referer link on a - say - PDF document should trigger an error 
"This document type cannot be validated" even if an HTML/XHTML variant is 
available.

No Accept-Encoding header is sent because, if the validator gets an 
encoding it doesn’t know about, it tries to validate the encoded 
document. This is different from charset and content-type, where the 
validator will display an appropriate error message whenever it gets one 
it doesn’t know about.
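
The forwarding rule described above (copy Accept, Accept-Language and 
Accept-Charset verbatim, never forward Accept-Encoding) can be sketched 
in Python as follows. This is only an illustration under assumed names 
(`build_fetch_request`, `client_headers`), not the validator's actual 
code.

```python
import urllib.request

# Headers forwarded verbatim to the server hosting the document.
# Accept-Encoding is deliberately absent from this list.
FORWARDED = ("Accept", "Accept-Language", "Accept-Charset")


def build_fetch_request(url, client_headers):
    """Prepare the request used to fetch the document to validate.

    `client_headers` is a hypothetical dict of the headers the browser
    sent; values are passed through unchanged, even when they name
    types or charsets that cannot be validated.
    """
    request = urllib.request.Request(url)
    for name in FORWARDED:
        value = client_headers.get(name)
        if value:
            request.add_header(name, value)
    # Note: when the request is actually sent, Python's http.client adds
    # "Accept-Encoding: identity" by itself, so no encoded body is asked for.
    return request
```

With a client that sent "Accept: application/pdf", the server would be 
free to answer with the PDF variant, and the "cannot be validated" error 
would then be the expected outcome.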

Besides Accept-Encoding, a server may do content-negotiation on any 
HTTP header, notably User-Agent, and even on other information (like the 
IP address). Hence, there is no guarantee, and there cannot be any, that 
the validated document is the one the user was actually viewing, although 
my patch will help.

The second patch is:
<http://perso.ens-lyon.fr/etienne.miret/2008/03/28/negotiate-all.diff>
This one brings full content-negotiation support to the validator, 
providing options for setting the Accept, Accept-Language and 
Accept-Charset headers on the home page (and on validation results if 
verbose output is selected). However, it won’t send any Accept* headers 
by default.

Note that this patch includes the first one.

Comments on either of these patches are welcome.

Regards,

-- 
Etienne Miret
Ne m'envoyez pas de fichier Word SVP, je ne peux pas les lire !
Don't send me Word attachments please, I can't read them!
http://perso.ens-lyon.fr/etienne.miret/Netiquette/no_MS_Office

Received on Friday, 28 March 2008 10:01:24 UTC