RE: HTML5 and Unicode Normalization Form C

Koji Ishii, Sun, 29 May 2011 22:10:29 -0400:
> It looks like all Leif cares is URL.

All? As in "nothing more than"? 

On the contrary, to look squarely at URLs could mean that one loses 
compatibility. (E.g. if example.org/fönt is encoded in a way that is 
incompatible with the way your keyboard etc. work.)
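
To make the example.org/fönt case concrete, here is a minimal sketch in 
Python (my choice for illustration; nothing in the thread depends on it). 
It assumes the path is percent-encoded as UTF-8, and the variable names 
are hypothetical. It only shows that the NFC and NFD spellings of the same 
visible text yield different encoded paths:

```python
import unicodedata
from urllib.parse import quote

# The same visible text "/fönt", written with an escape so that the
# source file's own normalization cannot affect the demonstration.
path = "/f\u00f6nt"

# NFC keeps the precomposed character U+00F6.
path_nfc = unicodedata.normalize("NFC", path)
# NFD splits it into "o" followed by U+0308 (combining diaeresis).
path_nfd = unicodedata.normalize("NFD", path)

# Percent-encode as UTF-8 (an assumption; quote() defaults to UTF-8).
encoded_nfc = quote(path_nfc)
encoded_nfd = quote(path_nfd)

print(encoded_nfc)  # /f%C3%B6nt
print(encoded_nfd)  # /fo%CC%88nt
```

A server that compares the encoded bytes literally would treat these as 
two different resources, even though both look the same on screen.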

> Shouldn't it be covered in 
> URL/IRI spec rather than in HTML 5 spec? I haven't read in depth but 
> RFC 3987[1] mentions normalizations in IRI.

HTML5 does try to define its own subset of that. See: 
http://www.w3.org/html/wg/href/draft (And HTML5 itself.)

> I think it'd make sense for HTML5 spec and validator to follow 
> URL/IRI spec for attributes that contain URL/IRI.

Do you expect text editors to encode content of attributes differently 
from content of other parts of the text file?

> Whether to apply NFC/NFD to whole contents or not seems to be a 
> little separate issue to me.

This thread started on www-validator@ and did not speak about "whole 
contents" or not - it only dealt with the fact that the HTML5 validator 
issued an error for non-NFC content. I have also seen that same error, 
and I thought - then - that it was based on HTML5.

However, it has to be said that it was only after Andreas Prilop 
pointed out that the HTML5 validator issues the same error message 
inside as well as outside attributes, that I understood that it - in 
contrast to what I thought - was not a restriction that was 
particularly related to links.

As it has turned out, however, it was an error of the HTML5 validator 
to show an error for non-NFC content. But *that* only increases the 
importance of offering helpful recommendations w.r.t. links.

> [1] http://www.ietf.org/rfc/rfc3987.txt

-- 
Leif Halvard Silli

Received on Monday, 30 May 2011 02:58:10 UTC