- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Fri, 03 May 2013 17:42:05 +0300
- To: Anon SU <anonymous84327@gmail.com>
- CC: www-validator@w3.org
2013-05-03 2:04, Anon SU wrote:

> I'm getting the following error: *Document uses the Unicode Private Use
> Area(s), which should not be used in publicly exchanged documents.
> (Charmod C073)*

It is a warning, not an error message. A minimal document that triggers the warning is

<!DOCTYPE html>
<title></title>

As far as I can see, there is nothing in the HTML5 CR or in the WHATWG Living HTML document that justifies the warning. I cannot find any statement about the allowed set of characters in HTML serialization. For XHTML serialization, generic XML rules apply, and they do not disallow Private Use characters (on the contrary, the explicit rule for allowed characters allows them, and there is no recommendation against them in XML either, as far as data characters are concerned).

> Why shouldn't Unicode PUA be used? What's wrong with them??

Apparently "Charmod" in the message refers to "Character Model for the World Wide Web 1.0: Fundamentals", http://www.w3.org/TR/charmod/ which is a W3C Recommendation and contains clause 4.5 about Private Use code points. There, item C073 says: "Publicly interchanged content SHOULD NOT use codepoints in the private use area."

This is fairly natural on the basis of the very concept of Private Use: private use code points are meaningless outside the scope of a private agreement, and different agreements may have different definitions for them. However, "Charmod" is about the WWW, and HTML5 is not limited to the WWW. So the warning should be read as relating to possible use of a document on the WWW or in other public interchange.

> I'm using font-based icons from IcoMoon ( http://icomoon.io/app/ ).

Well, they shouldn't use Private Use code points.

Checking what the validator http://validator.w3.org/nu/ says about some code points, I made the following observations:

Code points U+0005, U+000B, U+000E, U+007F, U+0086, U+FDD0, U+FFFE are reported as "forbidden", in error messages. I cannot find a justification for this in HTML5 CR for HTML serialization. (I can see many reasons why they *should* be avoided and perhaps even be made forbidden, but that's a different issue.) For XHTML serialization, the report is partly correct, but U+007F, U+0086, U+FDD0 are not forbidden in XML, just discouraged.

Yucca
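The XML side of the observations above can be checked mechanically. The following Python sketch is an illustration only, not the validator's implementation, and all function names in it are made up for this example. It assumes the Char production and the "discouraged" note of XML 1.0 (Fifth Edition), section 2.2, and the three Private Use ranges defined by Unicode.

```python
# Illustrative sketch only; the names below are hypothetical, not a real API.

# Private Use ranges defined by Unicode: the BMP area plus planes 15 and 16.
PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
]


def is_private_use(cp: int) -> bool:
    """True if the code point lies in a Unicode Private Use range."""
    return any(lo <= cp <= hi for lo, hi in PRIVATE_USE_RANGES)


def is_xml10_char(cp: int) -> bool:
    """XML 1.0 Char production (assumed from section 2.2):
    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml10_discouraged(cp: int) -> bool:
    """Characters that XML 1.0 allows but asks authors to avoid:
    some control characters and the noncharacters that are still in Char.
    """
    if 0x7F <= cp <= 0x84 or 0x86 <= cp <= 0x9F or 0xFDD0 <= cp <= 0xFDEF:
        return True
    # Last two code points of planes 1..16 (U+FFFE/U+FFFF are not in Char at all).
    return cp >= 0x1FFFE and (cp & 0xFFFF) >= 0xFFFE


def classify(cp: int) -> str:
    """Describe the status of a code point as XML 1.0 character data."""
    if not is_xml10_char(cp):
        return "forbidden in XML 1.0"
    if is_xml10_discouraged(cp):
        return "allowed but discouraged in XML 1.0"
    if is_private_use(cp):
        return "allowed in XML 1.0 (Private Use)"
    return "allowed in XML 1.0"


def find_private_use(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every Private Use character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_private_use(ord(ch))
    ]


if __name__ == "__main__":
    for cp in (0x0005, 0x000B, 0x000E, 0x007F, 0x0086, 0xFDD0, 0xFFFE, 0xE000):
        print(f"U+{cp:04X}: {classify(cp)}")
```

Under these assumptions, U+0005, U+000B, U+000E and U+FFFE come out as forbidden in XML 1.0, while U+007F, U+0086 and U+FDD0 come out as allowed but discouraged, and a Private Use code point such as U+E000 is plainly allowed. The sketch says nothing about HTML serialization, for which the rules are a separate question.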
Received on Friday, 3 May 2013 14:42:32 UTC