Re: uri templates: NFKC or NFC from Chris Weber on 2011-07-14 (uri@w3.org from July 2011)

From: Chris Weber <chris@lookout.net>
Date: Thu, 14 Jul 2011 16:45:24 -0700
To: uri@w3.org
Message-ID: <4E1F7F94.5010805@lookout.net>

On 7/14/2011 4:31 PM, Roy T. Fielding wrote:
> The URI Templates draft currently requires use of the NFKC for
> normalization of Unicode strings.  I've never understood why
> that is, considering that IRI does not require it and
> browsers appear to use NFC (if anything).  Also, it should only
> apply to the expansions -- the literal parts don't need to be
> normalized.
>
> Should I change it to NFC?
>
> ....Roy
>

From my recent test results, Safari was the only browser applying NFC
to the path, query, and fragment parts of an IRI.  Chrome applied NFC to
the fragment part only, and the others did not apply NFC anywhere.
Needless to say, this results in an interop problem.  An overview of the
results is up at:

https://spreadsheets.google.com/spreadsheet/ccc?key=0AifoWoA0trUndEZSTlRRNnd5MzE3N3RYOVlIVFFMREE&hl=en_US#gid=5

And raw results including the test case fragments are up at:

https://spreadsheets.google.com/spreadsheet/ccc?key=0AifoWoA0trUndEZSTlRRNnd5MzE3N3RYOVlIVFFMREE&hl=en_US#gid=3

I was testing browsers in HTML Quirks mode using a UTF-8 charset 
declaration set by the HTTP Content-Type header.  My observations were 
based on a) the way an anchor href was parsed in the DOM, and b) the way 
the HTTP GET request was sent on the wire.
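
As an illustration of why the choice of normalization form is visible on
the wire, a minimal Python sketch follows.  It is not the test harness
used for the results above; the helper name wire_form and the sample
strings are assumptions chosen for illustration.  It percent-encodes a
path segment as UTF-8, with and without normalizing first.

```python
import unicodedata
from typing import Optional
from urllib.parse import quote


def wire_form(segment: str, form: Optional[str] = None) -> str:
    """Percent-encode a segment as UTF-8, optionally normalizing first.

    `form` is a Unicode normalization form name such as "NFC" or "NFKC";
    None means the segment is encoded exactly as given.
    (Hypothetical helper, for illustration only.)
    """
    if form is not None:
        segment = unicodedata.normalize(form, segment)
    return quote(segment, safe="")


# "e" followed by U+0301 COMBINING ACUTE ACCENT (decomposed input)
decomposed = "cafe\u0301"
# U+FB01 LATIN SMALL LIGATURE FI (a compatibility character)
ligature = "\ufb01le"

print(wire_form(decomposed))         # cafe%CC%81   (no normalization)
print(wire_form(decomposed, "NFC"))  # caf%C3%A9    (composed to U+00E9)
print(wire_form(ligature, "NFC"))    # %EF%AC%81le  (NFC keeps the ligature)
print(wire_form(ligature, "NFKC"))   # file         (NFKC folds it to "fi")
```

Two user agents that disagree on whether to normalize send different
octets for the same href, and NFKC additionally rewrites compatibility
characters that NFC leaves alone.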

Best regards,
Chris

Received on Saturday, 16 July 2011 11:33:28 UTC