
Re: Content-Negotiation in check referer requests

From: Etienne Miret <etienne.miret@ens-lyon.fr>
Date: Sun, 15 Jun 2008 10:05:13 +0200
Message-ID: <4854CD39.4040105@ens-lyon.fr>
To: Olivier Thereaux <ot@w3.org>
Cc: www-validator@w3.org

Hello Olivier,

I promised you I’d send those patches two months ago. Sorry I didn’t 
keep my word.

 > Do you think you could make a reduced patch that would only set accept
 > and accept-language in the case of referer validation?
Here it is:
http://www.w3.org/Bugs/Public/attachment.cgi?id=556

This being said, why don’t you want to forward Accept-Charset? First, 
if the validator receives a character set it doesn’t know about, it 
will display an appropriate error message. Second, I have a real case 
where this behavior caused an issue:

Most pages on my website are available with two variants:
  * HTML, ISO 8859-15
  * XHTML, UTF-8
The Accept* headers sent by Firefox 3 (at least by the nightly build I 
made the test with) are:
  * Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  * Accept-Language: en-us,en;q=0.5
  * Accept-Encoding: gzip,deflate
  * Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Since text/html and application/xhtml+xml are sent with the same quality 
factor, my server looks at Accept-Charset in order to decide which 
variant to send. Obviously it knows that ISO 8859-1 and ISO 8859-15 are 
basically the same charsets, so it sends the HTML variant.
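
To make the tie visible, here is a minimal Python sketch that parses the 
quality factors out of the two headers quoted above. The parse_qvalues 
helper is hypothetical and written only for this illustration; it is not 
part of the validator or of Apache, and it does not model Apache's actual 
negotiation algorithm.

```python
def parse_qvalues(header):
    """Parse an Accept-style header into a {value: q} dict.

    A missing q parameter defaults to 1.0. This is a simplified
    parser for illustration only.
    """
    result = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue  # skip empty items such as a trailing comma
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        result[parts[0].lower()] = q
    return result


# Header values copied from the Firefox 3 request quoted above.
accept = parse_qvalues(
    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
)
accept_charset = parse_qvalues("ISO-8859-1,utf-8;q=0.7,*;q=0.7")

# Both media types carry the same quality factor, so Accept alone
# cannot separate the HTML variant from the XHTML one.
print(accept["text/html"], accept["application/xhtml+xml"])

# Accept-Charset does express a preference between the two charsets.
print(accept_charset["iso-8859-1"], accept_charset["utf-8"])
```

Because Accept ranks both media types equally, Accept-Charset is the only 
header of the two that distinguishes the variants in this scenario.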

Now, when I tested the above patch, the validator would forward only 
Accept and Accept-Language, which in turn would make Apache send the 
XHTML variant (for obscure reasons). Hence, hitting a « valid HTML » 
button made me see a « this page is valid XHTML » result.

So, in case I convinced you, here is a patch which forwards all three 
headers, Accept, Accept-Language and Accept-Charset:
http://www.w3.org/Bugs/Public/attachment.cgi?id=556
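
The idea of forwarding a fixed set of headers can be sketched in a few 
lines of Python. This is only an illustration under my own assumptions, 
not the code of the patch: the names FORWARDED and headers_to_forward 
are hypothetical, and the incoming request is modelled as a plain dict.

```python
# Hypothetical whitelist of negotiation headers to pass through.
FORWARDED = ("Accept", "Accept-Language", "Accept-Charset")


def headers_to_forward(incoming, names=FORWARDED):
    """Return the subset of incoming headers listed in names.

    Header names are matched case-insensitively; headers absent from
    the incoming request are simply left out.
    """
    lowered = {key.lower(): value for key, value in incoming.items()}
    return {
        name: lowered[name.lower()]
        for name in names
        if name.lower() in lowered
    }


# Header values copied from the Firefox 3 request quoted earlier.
browser_request = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-us,en;q=0.5",
    "Accept-Encoding": "gzip,deflate",
    "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7",
}

# Forwarding all three headers keeps the charset preference.
outgoing = headers_to_forward(browser_request)

# Forwarding only two of them drops it, which is the case where the
# server may pick a different variant than the browser received.
reduced = headers_to_forward(browser_request, names=("Accept", "Accept-Language"))
```

Accept-Encoding is deliberately left out of the whitelist in this sketch, 
since only the three headers named above are under discussion.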

> I got confused by the fact you used the
> http_accept_language param for the templates, while the CGI uses the
> accept_language param, etc. Would it be better to stay consistent here,
> or was there a rationale behind the naming?
I am almost sure there was a rationale, but since I can’t remember it, 
I changed this in the two aforementioned patches. The naming is now 
more consistent.


-- 
Etienne Miret
Ne m'envoyez pas de fichier Word SVP, je ne peux pas les lire !
Don't send me Word attachments please, I can't read them!
http://perso.ens-lyon.fr/etienne.miret/Netiquette/no_MS_Office
Received on Sunday, 15 June 2008 08:05:55 GMT
