- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Thu, 11 Feb 2010 16:36:20 +0100
- To: Richard Ishida <ishida@w3.org>
- Cc: www-international@w3.org
Upon rereading what CSS21 says about escaping, I think the entire
paragraph on CSS escaping makes some simplifications that perhaps do
not simplify much:
]] CSS. The escape mechanism for representing characters in CSS is a
backslash followed by a hexadecimal number representing the Unicode
code point value. Note that these escapes are terminated by a space,
rather than a semi-colon. [[
Firstly, using a single (white)space character as a termination
character is something that I think many find confusing in itself. And
I think that this is indirectly confirmed by the Charmod document
<http://www.w3.org/TR/2005/REC-charmod-20050215/>, which says:
]] C044 [S] Escape syntax should require either explicit end
delimiters or a fixed number of characters in each character escape.
Escape syntaxes where the end is determined by any character outside
the set of characters admissible in the character escape itself should
be avoided.
These character escapes are not clear visually, and can cause an editor
to insert spurious line-breaks when word-wrapping on spaces. Forms like
SPREAD's &UABCD; [SPREAD] or XML's &#xhhhh;, where the character escape
is explicitly terminated by a semicolon, are much better. [[
The Charmod document doesn't discuss (white)space as a termination
character, but it seems evident that (white)space could be unclear
visually: it is difficult to distinguish a terminating space from a
"normal" space, something which CSS21 notes when it specifies how
Unicode escapes may be terminated:
]]
If a character in the range [0-9a-fA-F] follows the hexadecimal
number, the end of the number needs to be made clear. There are
two ways to do that:
1. with a space (or other white space character): "\26 B"
("&B"). In this case, user agents should treat a "CR/LF" pair
(U+000D/U+000A) as a single white space character.
2. by providing exactly 6 hexadecimal digits: "\000026B" ("&B")
In fact, these two methods may be combined. Only one white
space character is ignored after a hexadecimal escape. Note
that this means that a "real" space after the escape sequence
must itself either be escaped or doubled.
[[
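To make the quoted rules concrete, here is a minimal Python sketch of how
such escapes could be decoded. The function name decode_css_escapes is my
own invention, it handles only hexadecimal escapes (not escapes like "\{"),
and substituting U+FFFD for out-of-range numbers is a choice made for this
sketch, so treat it as an illustration rather than a conforming parser.

```python
import re

# A backslash, 1 to 6 hexadecimal digits, then optionally ONE white space
# character. A CR/LF pair is tried first so it counts as a single one.
_HEX_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})(\r\n|[ \t\r\n\f])?")


def decode_css_escapes(text: str) -> str:
    """Replace hexadecimal CSS escapes in text (hypothetical helper)."""

    def _replace(match: "re.Match[str]") -> str:
        code_point = int(match.group(1), 16)
        if code_point > 0x10FFFF:
            # Choice made for this sketch: use the replacement character.
            return "\ufffd"
        return chr(code_point)

    return _HEX_ESCAPE.sub(_replace, text)


print(decode_css_escapes(r"\26 B"))     # space terminates the escape
print(decode_css_escapes(r"\000026B"))  # six digits, no terminator needed
print(decode_css_escapes(r"\26B"))      # no terminator: read as U+026B
print(decode_css_escapes(r"\26  B"))    # doubled space keeps one real space
```

The third call shows the pitfall: without a terminator, the "B" is absorbed
into the number and the result is a single character, U+026B, not "&B".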
Rather than trying to make CSS escape termination analogous to HTML
NCRs by recommending a termination character, it seems simpler to me
to recommend that authors provide exactly 6 hexadecimal digits, as
then one does not need to use the confusing whitespace terminator.
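
As an illustration of the six-digit recommendation, a small Python sketch
follows. The helper name css_escape_fixed is hypothetical and not part of
any library; it simply zero-pads the code point to six hexadecimal digits
so that whatever character follows cannot be read as part of the number.

```python
def css_escape_fixed(ch: str) -> str:
    """Return a CSS escape for one character using exactly six hex digits.

    Hypothetical helper for illustration only.
    """
    if len(ch) != 1:
        raise ValueError("expected a single character")
    # A backslash followed by the code point, zero-padded to six digits.
    return "\\{:06X}".format(ord(ch))


# "&" followed directly by "B", with no white space terminator needed.
print(css_escape_fixed("&") + "B")
```

Because the escape always has the maximum length, no terminating space is
needed, and any space after it is an ordinary space only if it is doubled or
escaped, as CSS21 describes.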
--
leif halvard silli
Received on Thursday, 11 February 2010 15:36:56 UTC