RE: HTML5 and Unicode Normalization Form C

Koji Ishii, Mon, 30 May 2011 04:21:45 -0400:
>> Koji Ishii, Sun, 29 May 2011 22:10:29 -0400:
>>> It looks like all Leif cares about is URLs.
>> 
>> All? As in "nothing more than"?
> 
> Ah...I apologize  [ snip ]

No problem. And it is true that my main focus is on linking.

>>> I think it'd make sense for HTML5 spec and validator to follow
>>> URL/IRI spec for attributes that contain URL/IRI.
>> 
>> Do you expect text editors to encode content of attributes differently
>> from content of other parts of the text file?
> 
> Yes for validators. URL/IRI has syntax like encoding using "%", so 
> validation of attribute values using its data type makes sense to me. 
> If it wasn't the goal of the HTML5 validator, or if I'm asking too 
> much, I'm sorry for that.

HTML5 supports IRIs, which: [1] "Allows native representation of 
Unicode in resources without % escaping". Or put differently: [2] "the 
desired Web address is stored in a document link or typed into the 
client's address bar using the relevant native characters".
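To illustrate why normalization matters once such native characters end up in a link, here is a minimal Python sketch. It is my own illustration, not something taken from [1] or [2], and the file name "café.html" is a made-up example. It shows that the NFC and NFD spellings of the same visible text percent-encode to different byte sequences, so a link written in one form need not match a resource stored under the other.

```python
import unicodedata
from urllib.parse import quote

# Hypothetical file name: "é" as one precomposed code point (NFC).
nfc_name = unicodedata.normalize("NFC", "caf\u00e9.html")
# The same visible text as "e" + combining acute accent (NFD).
nfd_name = unicodedata.normalize("NFD", nfc_name)

# quote() percent-encodes the UTF-8 bytes of each string.
nfc_url = quote(nfc_name)
nfd_url = quote(nfd_name)

print(nfc_url)  # caf%C3%A9.html
print(nfd_url)  # cafe%CC%81.html

# The two forms look identical on screen but differ on the wire.
print(nfc_url == nfd_url)  # False
```

Normalizing both sides to the same form before comparing (here with unicodedata.normalize) makes the two names equal again; whether that is safe for all content is the separate question discussed below.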

> But you're right that it could be a hard requirement for editors. If 
> we take it seriously, I guess we have to wait for Unicode to fix the NFC 
> problems (I heard the effort is going on) or ask web 
> browsers/servers to normalize on the fly. All options we have today 
> have trade-offs, and I just wanted you to be aware that 
> normalizing whole contents today can harm some scripts.

Which scripts could such a thing harm?

>>> Whether to apply NFC/NFD to whole contents or not seems to be a
>>> little separate issue to me.
>> 
>> This thread started on www-validator@ and did not speak about "whole
>> contents" or not - it only dealt with the fact that the HTML5 validator
>> issued an error for non-NFC content. I have also seen that same error,
>> and I thought - then - that it was based on HTML5.
>> 
>> However, it has to be said that it was only after Andreas Prilop
>> pointed out that the HTML5 validator issues the same error message
>> inside as well as outside attributes, that I understood that it - in
>> contrast to what I thought - was not a restriction that was
>> particularly related to links.
>> 
>> As it has turned out, however, it was an error of the HTML5 validator
>> to show an error for non-NFC content. But *that* only increases the
>> importance of offering helpful recommendations w.r.t. links.
> 
> Thank you for the explanation of the background I wasn't aware of.

I should have pointed it out when I CC-ed this list. Sorry.

> I 
> agree that links have the problems you raised, and NFC can solve them. 
> All I want you to understand is that applying NFC to displayable 
> contents has some different problems, so I think what we said does not 
> contradict each other, and I wanted to find a solution that can make 
> both of us happy.

Agreed!

[1] http://download.microsoft.com/download/a/6/0/a60decbd-9044-42f1-b9c5-1c90c7a5a8ce/a6.pdf
[2] http://www.w3.org/International/articles/idn-and-iri/#idnoverview
-- 
Leif H Silli

Received on Monday, 30 May 2011 14:16:19 UTC