[Bug 12839] @id: Define how Unicode normalization affects the 'unique identifier' status

http://www.w3.org/Bugs/Public/show_bug.cgi?id=12839

--- Comment #12 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> 2011-06-04 00:28:28 UTC ---
(In reply to comment #11)

I'm guessing that you are pointing out that the UTF-16 representations in the
DOM can be invalid per the UTF-16 definition, in much the same way that a UTF-8
file can be invalid.
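
To make "invalid per the UTF-16 definition" concrete, here is a minimal Python
sketch. It is only an illustration, not anything taken from the HTML5 text: the
identifier value and the variable names are made up for the example. It shows a
string holding an unpaired surrogate, which a strict UTF-16 codec rejects, next
to a byte sequence that a strict UTF-8 codec rejects.

```python
# Hypothetical identifier value: "id" followed by an unpaired high surrogate.
# Python allows such a string in memory, much as a DOM string can hold one.
lone = "id\ud800"

# A strict UTF-16 encoder refuses the unpaired surrogate.
try:
    lone.encode("utf-16-le")
    strict_utf16_ok = True
except UnicodeEncodeError:
    strict_utf16_ok = False

# The code units can still be written out if the codec is told to pass
# surrogates through unchanged.
raw = lone.encode("utf-16-le", "surrogatepass")

# The UTF-8 counterpart: 0xFF can never appear in well-formed UTF-8.
try:
    b"\xff".decode("utf-8")
    strict_utf8_ok = True
except UnicodeDecodeError:
    strict_utf8_ok = False

# The malformed value is still a different string from the plain "id".
distinct_from_plain = lone != "id"
```

Under these assumptions the value with the stray surrogate compares as
different from the plain "id", even though it is not well-formed UTF-16.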

A parser must probably be prepared to handle "UTF-16 artefacts". But if you
are not satisfied with what HTML5 offers in that regard, then that belongs in
another bug, IMO. Also, it sounds as if you want to change HTML5's wording not
only in the @id definition but also in the DOMString definitions etc.

The @id attribute section is also present in the 'HTML5 edition for Web
authors', and authors do not need to learn that an identifier will work even
if it contains artefacts that do not belong in the encoding.

What authors need to be made aware of are gotchas: that lowercase and uppercase
are not treated as equal, and whether decomposed characters are treated as equal
to composed characters.
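
A short Python sketch of the two gotchas, assuming identifiers are compared
code point for code point with no case folding and no normalization (that
assumption is exactly what this bug asks the spec to state). The helper name
same_id and the sample values are hypothetical.

```python
import unicodedata

def same_id(a: str, b: str) -> bool:
    """Hypothetical comparison: exact code point match, nothing else."""
    return a == b

# Gotcha 1: case is significant.
case_differs = not same_id("Intro", "intro")

# Gotcha 2: composed vs. decomposed forms of the same visible text.
composed = "caf\u00e9"      # "e with acute" as one code point, U+00E9
decomposed = "cafe\u0301"   # "e" followed by combining acute, U+0301
forms_differ = not same_id(composed, decomposed)

# Only after an explicit normalization step (NFC here) do they match.
equal_after_nfc = same_id(
    unicodedata.normalize("NFC", composed),
    unicodedata.normalize("NFC", decomposed),
)
```

With a plain comparison, both pairs count as different identifiers. The
composed and decomposed forms only match if something normalizes them first.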

(Though, if it is possible to explain to authors that "artefacts" can also
cause an identifier to be treated as unique, then that is probably OK.)


Received on Saturday, 4 June 2011 00:28:30 UTC