Re: Content Sniffing impact on HTTPbis - #155

On Fri, 5 Jun 2009, Bjoern Hoehrmann wrote:
> 
> I see no justification for having a special algorithm for the charset 
> parameter; you extract the parameter just like any other. I also don't 
> know of any implementation that processes the header value like that; if 
> you have
> 
>   text/plain;whatever="charset=iso-8859-2";charset=iso-8859-3
> 
> Then the result of your algorithm is iso-8859-2", while the correct
> behavior yields iso-8859-3, which is also what IE6, FF 3.x, Opera 9, and 
> various non-browser applications use. The same goes for a simpler:
> 
>   text/plain;whatever="charset";charset=iso-8859-3
> 
> Where your algorithm returns nothing, and implementations implement the 
> correct behavior, which yields iso-8859-3. There also appears to be no 
> need to process escape sequences within quoted strings incorrectly, for 
> instance Opera 9 seems to implement that properly, so does my own code.

My testing at the time this was written disagrees with the results of your 
testing. I believe this was primarily intended for charset extraction for 
<script> nodes, if that matters. However, if your results can be confirmed 
then that would certainly be good news.
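
For anyone wanting to reproduce the comparison, here is a rough Python
sketch of the two behaviors under discussion. It is an illustration only:
substring_charset is a simplified stand-in for a "search for charset"
approach, not the actual algorithm text, and param_charset is a minimal
parameter parser, not any browser's implementation. Both function names
are made up for this example.

```python
def substring_charset(value):
    """Simplified stand-in: look for the first 'charset' substring."""
    idx = value.lower().find("charset")
    if idx == -1:
        return None
    pos = idx + len("charset")
    while pos < len(value) and value[pos] in " \t":
        pos += 1
    if pos >= len(value) or value[pos] != "=":
        return None  # first 'charset' is not followed by '='
    pos += 1
    end = value.find(";", pos)
    return value[pos:end if end != -1 else len(value)].strip()


def param_charset(value):
    """Parse parameters one by one, honoring quoted strings and escapes."""
    params = {}
    n = len(value)
    pos = value.find(";")
    while pos != -1 and pos < n:
        pos += 1  # step past the ';' that starts this parameter
        semi = value.find(";", pos)
        eq = value.find("=", pos)
        if eq == -1:
            break
        if semi != -1 and semi < eq:
            pos = semi  # parameter without '=', skip it
            continue
        name = value[pos:eq].strip().lower()
        pos = eq + 1
        if pos < n and value[pos] == '"':
            pos += 1
            chars = []
            while pos < n and value[pos] != '"':
                if value[pos] == "\\" and pos + 1 < n:
                    pos += 1  # backslash escapes the next character
                chars.append(value[pos])
                pos += 1
            pos += 1  # step past the closing quote
            val = "".join(chars)
            pos = value.find(";", pos)
        else:
            end = value.find(";", pos)
            val = value[pos:end if end != -1 else n].strip()
            pos = end
        params.setdefault(name, val)  # first occurrence wins (an assumption)
    return params.get("charset")


a = 'text/plain;whatever="charset=iso-8859-2";charset=iso-8859-3'
b = 'text/plain;whatever="charset";charset=iso-8859-3'
for header in (a, b):
    print(repr(substring_charset(header)), repr(param_charset(header)))
```

With the two headers quoted above, the substring approach gives
iso-8859-2" and nothing respectively, while the parameter parser gives
iso-8859-3 in both cases.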

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Friday, 5 June 2009 19:06:31 UTC