- From: Chris Lilley <chris@w3.org>
- Date: Mon, 27 Mar 2006 18:17:12 +0200
- To: "www-i18n-comments@w3.org" <www-i18n-comments@w3.org>
Hello www-i18n-comments,

In Character Model for the World Wide Web 1.0: Fundamentals we read:

http://www.w3.org/TR/2005/REC-charmod-20050215/#C043
C043 [S] The number of different ways to escape a character SHOULD be minimized (ideally to one). A well-known counter-example is that for historical reasons, both HTML and XML have redundant decimal (&#ddddd;) and hexadecimal (&#xhhhh;) character escapes.

Yes. Given that XML does, as noted, have both of them, we find that

http://www.w3.org/TR/2005/REC-charmod-20050215/#C048
C048 [I] [C] Content SHOULD use the hexadecimal form of character escapes rather than the decimal form when there are both.
NOTE: The hexadecimal form is preferred because character encoding standards (in particular Unicode) usually list character numbers as hexadecimal, making lookup easier.

to be overly strong.

It's certainly sound advice for hand authors, and a content creation tool might well be coded up to choose hex rather than decimal escapes, since it makes no particular difference which to use. Requiring all content to use hex NCRs, though, seems rather strong.

Saying that software which emits XML does not conform because it allows decimal NCRs to be generated is also overly strong. The requirement is fair enough for NCRs that are machine generated, but if the author put them in, then software has no real business changing them.

Using hex slightly increases readability (though not as much as using the actual character does), but so does a two-character indent or other forms of pretty printing.

--
Chris Lilley mailto:chris@w3.org
Chair, W3C SVG Working Group
W3C Graphics Activity Lead
Co-Chair, W3C Hypertext CG
Received on Monday, 27 March 2006 16:17:12 UTC