- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Mon, 30 May 2011 03:43:00 +0200
- To: "Phillips, Addison" <addison@lab126.com>
- Cc: www-validator@w3.org, www-international@w3.org, public-i18n-core@w3.org
Phillips, Addison, Sun, 29 May 2011 13:54:34 -0700:

>> As for using non-NFC outside attributes, then I don't know if
>> there are issues which can justify a warning. But according
>> to Unicode technical report 15, then the "W3C Character Model
>> for the World Wide Web [ snip ] and other W3C Specifications
>> (such as XML 1.0 5th Edition) recommend using Normalization
>> Form C for all content." [4]
[...]
> The normative bits of Charmod-Norm live at [1]. Items C300 and C301
> use the RFC 2119 keyword "SHOULD" in requiring that content and
> specifications be fully-normalized or include-normalized.
[...]
> It would be unreasonable, in my opinion, to treat HTML5 as a *new*
> format, so I think any expectations for adding a normalization
> requirement to HTML are unrealistic.

However, HTML5 warns against using encodings other than UTF-8, because
doing so can cause "unexpected results" in form submissions and links.
It would seem in tune with this spirit to let HTML5 and validators, if
possible, point out how to eliminate the problems that can cause
unexpected results even with UTF-8, no?

Btw, it is unclear from HTML5 whether two @id attributes that differ
only in their normalization form are to be considered unique. All HTML5
says is that @id attributes must be unique; it does not say what
actually makes them unique. [1]

Related to the uniqueness:

* On the Mac, when a file is served by the preinstalled Apache2,
  normalized link values (provided they are not cool IRIs with
  decomposed letters) do target files with non-normalized file names.
  How come? Is it because Apache performs a normalization of the HTTP
  request?

* Inside a document, however, browsers treat composed and decomposed
  identifiers as distinct identifiers (with the exception of Safari on
  Windows [2]).

[...]
> The I18N Core WG has recently agreed
> to work on normalization guidelines again. There is (and has ever
> been) little enthusiasm for working on the Character Model, but
> having read the normalization document again this weekend, I suspect
> that Charmod-Norm will probably have to be replaced, rather than just
> worked around.

Good to hear you are looking at it!

> [1] http://www.w3.org/TR/charmod-norm/#sec-NormalizationApplication

[1] http://dev.w3.org/html5/spec/elements.html#the-id-attribute
[2] http://lists.w3.org/Archives/Public/www-validator/2011May/0052
-- 
Leif H Silli
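
To make the @id question above concrete, here is a minimal Python sketch of the kind of check a validator could perform. It is only an illustration under stated assumptions: the helper name ids_colliding_under_nfc and the sample id values are hypothetical, and nothing here is taken from HTML5 or from any existing validator. It uses only unicodedata.normalize from the standard library.

```python
import unicodedata

# Two spellings of the same visible id (sample values are hypothetical).
composed = "caf\u00e9"      # "é" as one precomposed code point (NFC form)
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT


def ids_colliding_under_nfc(ids):
    """Return groups of ids that differ as code-point strings
    but become identical once normalized to NFC."""
    groups = {}
    for value in ids:
        key = unicodedata.normalize("NFC", value)
        groups.setdefault(key, set()).add(value)
    # Keep only the groups that contain more than one distinct spelling.
    return {key: members for key, members in groups.items() if len(members) > 1}


# A plain code-point comparison sees two different ids ...
distinct_as_code_points = composed != decomposed
# ... while an NFC-aware comparison reports them as a collision.
collisions = ids_colliding_under_nfc([composed, decomposed, "main"])
```

A plain code-point comparison corresponds to the behaviour described for most browsers, where the two ids are distinct. A validator that wanted to warn about such pairs could report every group returned by the helper; whether such pairs should count as duplicates is exactly what HTML5 leaves open.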
Received on Monday, 30 May 2011 01:43:29 UTC