Re: uri templates: NFKC or NFC

From: Chris Weber <chris@lookout.net>
Date: Thu, 14 Jul 2011 16:45:24 -0700
Message-ID: <4E1F7F94.5010805@lookout.net>
To: uri@w3.org

On 7/14/2011 4:31 PM, Roy T. Fielding wrote:
> The URI Templates draft currently requires use of the NFKC for
> normalization of Unicode strings.  I've never understood why
> that is, considering that IRI does not require it and
> browsers appear to use NFC (if anything).  Also, it should only
> apply to the expansions -- the literal parts don't need to be
> normalized.
>
> Should I change it to NFC?
>
> ....Roy
>
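
For readers unfamiliar with the two forms, here is a minimal sketch of how NFC and NFKC differ, using Python's standard unicodedata module. The sample strings are illustrative assumptions and are not taken from the URI Templates draft or from the test cases.

```python
import unicodedata

# Illustrative inputs (assumptions, not from the draft or the tests):
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
ligature = "\ufb01"      # LATIN SMALL LIGATURE FI, a compatibility character

# Both forms compose canonically equivalent sequences.
assert unicodedata.normalize("NFC", decomposed) == "\u00e9"
assert unicodedata.normalize("NFKC", decomposed) == "\u00e9"

# Only NFKC also folds compatibility characters, which changes the text.
assert unicodedata.normalize("NFC", ligature) == "\ufb01"
assert unicodedata.normalize("NFKC", ligature) == "fi"
```

NFKC therefore rewrites more of the input than NFC does, which is one reason the choice matters for template expansions.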

From my recent test results, Safari was the only browser applying NFC 
to the path, query, and fragment parts of an IRI.  Chrome applied NFC to 
the fragment part only, and the others did not apply NFC anywhere.  
Needless to say, this results in an interop problem.  An overview of the 
results is up at:

https://spreadsheets.google.com/spreadsheet/ccc?key=0AifoWoA0trUndEZSTlRRNnd5MzE3N3RYOVlIVFFMREE&hl=en_US#gid=5

And raw results including the test case fragments are up at:

https://spreadsheets.google.com/spreadsheet/ccc?key=0AifoWoA0trUndEZSTlRRNnd5MzE3N3RYOVlIVFFMREE&hl=en_US#gid=3

I was testing browsers in HTML Quirks mode using a UTF-8 charset 
declaration set by the HTTP Content-Type header.  My observations were 
based on a) the way an anchor href was parsed in the DOM, and b) the way 
the HTTP GET request was sent on the wire.
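
To illustrate why the on-the-wire form differs, here is a rough sketch in Python of how the same path percent-encodes with and without NFC.  The helper wire_path and the sample path are hypothetical and only approximate what a user agent does; they are not the harness used for the tests above.

```python
import unicodedata
from urllib.parse import quote

def wire_path(path, form=None):
    """Percent-encode a path as UTF-8, optionally normalizing it first.

    Hypothetical helper: `form` is None (no normalization) or a
    normalization form name such as "NFC".
    """
    if form is not None:
        path = unicodedata.normalize(form, path)
    return quote(path, safe="/")

# Sample path (an assumption): "e" followed by a combining acute accent.
raw = "/cafe\u0301"

no_nfc = wire_path(raw)           # the combining mark is encoded on its own
with_nfc = wire_path(raw, "NFC")  # the composed character is encoded instead

print(no_nfc)    # /cafe%CC%81
print(with_nfc)  # /caf%C3%A9
```

A server that compares the request path byte for byte would treat these as two different resources, which is the interop problem in practice.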

Best regards,
Chris
Received on Saturday, 16 July 2011 11:33:28 GMT