Re: How browsers display IRI's with mixed encodings

On 7/21/2011 5:29 PM, Jungshik Shin (신정식, 申政湜) wrote:
> I think your html page declared its encoding to be in ISO-8859-1. Then,
> it's not an mixed encoding because xEF xBC xA1 is a perfectly fine
> ISO-8859-1 sequence.

Right, I used "mixed encodings" deliberately, as a misnomer of sorts. I
thought I had alluded to that by mentioning that the test reference
included "bytes representing UTF-8" within an ISO-8859-1 encoded document.
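
To make the ambiguity concrete, here is a small Python sketch (not part of
the original test page) showing that the same three bytes decode cleanly
under both encodings. The variable names are only for illustration.

```python
# The byte sequence discussed above: 0xEF 0xBC 0xA1.
raw = bytes([0xEF, 0xBC, 0xA1])

# Every byte value is a valid ISO-8859-1 character, so this never fails.
# Result: three characters, U+00EF, U+00BC, U+00A1.
as_latin1 = raw.decode("iso-8859-1")

# The same bytes also form one well-formed UTF-8 sequence.
# Result: one character, U+FF21 (FULLWIDTH LATIN CAPITAL LETTER A).
as_utf8 = raw.decode("utf-8")

print([f"U+{ord(c):04X}" for c in as_latin1])
print([f"U+{ord(c):04X}" for c in as_utf8])
```

So the document is not mis-encoded as far as a decoder can tell; the
bytes are simply readable in two ways.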

> The above is Chrome's internal representation of the URL in question
> (aside from the spec+ host part). When displaying the URL in the
> omnibox,  the path part is always interpreted as UTF-8. The query part
> is tested for 'UTF8ness' (after unescaping). If it *can* be interpreted
> as UTF-8, it's converted to characters. Otherwise, it remains %-escaped
> in the display.

That was the point of the test, which I may have failed to describe
clearly: to test the display of the path and query parts when they
contain unescaped 'UTF8ness'.
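
For anyone following along, the display rule Jungshik describes could be
sketched roughly as below. This is only my reading of his description,
not Chrome's actual code: the function names display_path and
display_query are made up, the use of a replacement character for
invalid path bytes is an assumption (he did not say what happens there),
and a real browser would presumably keep reserved characters such as
%26 escaped, which this sketch does not attempt.

```python
from urllib.parse import unquote_to_bytes


def display_path(path: str) -> str:
    """Path part: always interpreted as UTF-8 for display.

    Assumption: bytes that are not valid UTF-8 are shown as U+FFFD.
    """
    return unquote_to_bytes(path).decode("utf-8", errors="replace")


def display_query(query: str) -> str:
    """Query part: unescape, then test whether it can be read as UTF-8.

    If it can, show the decoded characters; otherwise leave the
    original %-escaped text untouched.
    """
    raw = unquote_to_bytes(query)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return query


# %EF%BC%A1 is valid UTF-8, so it is shown as a character.
print(display_query("q=%EF%BC%A1"))
# %E9 alone is not valid UTF-8, so it stays escaped.
print(display_query("q=%E9"))
```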

Thanks for the feedback,
-Chris

Received on Friday, 22 July 2011 02:30:30 UTC