- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Thu, 11 Feb 2010 16:36:20 +0100
- To: Richard Ishida <ishida@w3.org>
- Cc: www-international@w3.org
Upon rereading what CSS21 says about escaping, I think the paragraph on CSS escaping makes some simplifications that perhaps do not simplify much:

]] CSS. The escape mechanism for representing characters in CSS is a backslash followed by a hexadecimal number representing the Unicode code point value. Note that these escapes are terminated by a space, rather than a semi-colon. [[

Firstly, using a single (white)space character as the termination character is something that I think many find confusing in itself. I think this is indirectly confirmed by the Charmod document <http://www.w3.org/TR/2005/REC-charmod-20050215/>, which says:

]] C044 [S] Escape syntax should require either explicit end delimiters or a fixed number of characters in each character escape. Escape syntaxes where the end is determined by any character outside the set of characters admissible in the character escape itself should be avoided. These character escapes are not clear visually, and can cause an editor to insert spurious line-breaks when word-wrapping on spaces. Forms like SPREAD's &UABCD; [SPREAD] or XML's &#xhhhh;, where the character escape is explicitly terminated by a semicolon, are much better. [[

The Charmod document doesn't discuss (white)space as a termination character, but it seems evident that (white)space can be unclear visually: it is difficult to tell a terminating space from a "normal" space. CSS21 itself notes this when it specifies how Unicode escapes may be terminated:

]] If a character in the range [0-9a-fA-F] follows the hexadecimal number, the end of the number needs to be made clear. There are two ways to do that:

1. with a space (or other white space character): "\26 B" ("&B"). In this case, user agents should treat a "CR/LF" pair (U+000D/U+000A) as a single white space character.
2. by providing exactly 6 hexadecimal digits: "\000026B" ("&B")

In fact, these two methods may be combined. Only one white space character is ignored after a hexadecimal escape. Note that this means that a "real" space after the escape sequence must itself either be escaped or doubled. [[

Rather than trying to make CSS escape termination analogous to HTML NCRs by recommending a termination character, it seems simpler to me to recommend that authors provide exactly 6 hexadecimal digits, as then one does not need to use the confusing whitespace terminator.

-- 
leif halvard silli
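As an illustration of the two termination methods in the CSS21 text quoted above, here is a minimal Python sketch. It is not taken from any specification or library: the names `decode_css_hex_escapes` and `escape_six_digits` are hypothetical, and substituting U+FFFD for zero or out-of-range code points is an assumption made only so the sketch never raises an error.

```python
import re

# A backslash, 1 to 6 hexadecimal digits, then at most one optional white
# space character (a CR/LF pair counts as one), following the quoted CSS21 text.
_CSS_HEX_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\r\n\f])?")


def _replace(match: re.Match) -> str:
    code_point = int(match.group(1), 16)
    # Assumption: map zero and out-of-range values to U+FFFD.
    if code_point == 0 or code_point > 0x10FFFF:
        return "\ufffd"
    return chr(code_point)


def decode_css_hex_escapes(text: str) -> str:
    """Replace CSS hexadecimal escapes with the characters they denote."""
    return _CSS_HEX_ESCAPE.sub(_replace, text)


def escape_six_digits(ch: str) -> str:
    """Write one character as a fixed-width, six-digit CSS escape."""
    return "\\{:06X}".format(ord(ch))


if __name__ == "__main__":
    print(decode_css_hex_escapes(r"\26 B"))      # space-terminated form
    print(decode_css_hex_escapes(r"\000026B"))   # six-digit form
    print(escape_six_digits("&") + "B")          # fixed-width output
```

With the six-digit form, the character after the escape can never be absorbed into the hexadecimal number, whereas the unterminated `\26B` is read as the single code point U+026B rather than as "&B".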
Received on Thursday, 11 February 2010 15:36:56 UTC