- From: Zack Weinberg <zweinberg@mozilla.com>
- Date: Tue, 23 Feb 2010 16:36:18 -0800
- To: W3C Emailing list for WWW Style <www-style@w3.org>, fantasai <fantasai.lists@inkedblade.net>
- Message-ID: <20100223163618.435dd1cb@trurl>
I wanted to provide a concrete proposal to deal with the backslash
issues I raised in
http://lists.w3.org/Archives/Public/www-style/2010Feb/0150.html and
http://lists.w3.org/Archives/Public/www-style/2010Feb/0210.html but
was not able to do so in a way that made sense, without rewriting the
whole section. So here is a rewrite of the whole section. :)
I *believe* that the only normative changes are to clarify the behavior
of \-newline not within a string, and \-EOF in any context. However, I
may have made errors. Please let me know if you find any.
The attached diff is not especially readable, so I have also appended the
text that should entirely replace the third bullet point of section 4.1.3.
zw
--- new text ---
<li><p>Backslash (\) characters are not significant inside
<a href="#comments">comments</a>. Elsewhere, they
introduce <span class="index-def" title="backslash
escapes"><a name="escaped-characters"><dfn>character
escapes</dfn></a></span>.</p>
<p>Some character escapes have the effect of inserting a
character into the style sheet, in place of the escape.
Whenever this happens, the inserted character is treated as
either part of an identifier, or part of a string, even if it
normally would have some special meaning. See the examples
below.</p>
<p>If a backslash is immediately followed by the end of the
style sheet, it is a normal character, not an escape.</p>
<ol>
<li><p>Within <a href="#strings">strings</a>, a backslash
followed by a newline is ignored; i.e., the string continues
on the next line, but with neither the backslash nor the
newline included in the string's value. Outside strings, a
backslash followed by a newline is not an escape; the
backslash is a normal punctuation character.</p></li>
<li><p>A backslash followed by one to six hexadecimal digits,
[0-9a-fA-F], inserts the ISO 10646
(<a href="refs.html#ref-ISO10646" rel="biblioentry"
class="noxref"><span class="normref">[ISO10646]</span></a>)
character with that number into the style sheet.</p>
<p>One (and only one) white space character is ignored after a
hexadecimal escape of any length. This rule allows authors to
write hexadecimal escapes that are immediately followed by
characters from the set [0-9a-fA-F], without ambiguity. For
instance, <samp>"\26 B"</samp>,
<samp>"\000026B"</samp>, and <samp>"\000026 B"</samp> are
all equivalent to <samp>"&amp;B"</samp>.
However, <samp>"\26B"</samp> is equivalent
to <samp>"ɫ"</samp> (a string containing the single
character U+026B).</p>
<p>If a hexadecimal escape would insert the character with
code point U+0000, the behavior is undefined. Hexadecimal
escapes that denote code points outside the range allowed by
Unicode (e.g., "\110000", which is above the current limit of
U+10FFFF) may be treated as inserting the "replacement
character" (U+FFFD). If such characters are to be displayed,
the UA should show a visible symbol, such as a "missing
character" glyph (cf. <a href="fonts.html#algorithm">15.2,</a>
point 5).</p></li>
<li><p>A backslash followed by any other character (neither a
hexadecimal digit nor a newline) simply removes that
character's special meaning. For instance, <samp>"\""</samp>
is a string consisting of one double quote, <samp>a\:b</samp>
is an identifier consisting of the three characters
<samp>a:b</samp>, and <samp>"te\nt"</samp> is exactly the
same string as <samp>"tent"</samp>. Characters inserted by
hexadecimal escapes lose their special meaning in the same
way: <samp>\7B</samp> is not punctuation, even though
<samp>{</samp> is, and <samp>\32</samp> is allowed at the
start of an identifier, even though <samp>2</samp> is
not.</p></li>
</ol>
<p class="note">Style sheet preprocessors are free to convert
escape sequences to the equivalent characters, or vice versa, as
long as they do not change the style sheet's meaning. For
instance, "\61 b" may be rewritten as "ab"; "a\3a b" may be
rewritten as "a\:b" or vice versa, but not "a:b".</p>
</li>
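
To make the three rules above easier to check against each other, here is a rough Python sketch of how the body of a string (quotes already removed) could be decoded. It is not part of the proposed text. The function name unescape_css_string is made up for illustration. Mapping U+0000 to U+FFFD is one arbitrary choice for behavior the text leaves undefined, and treating CR LF as a single newline or white space character is an assumption taken from the existing CSS 2.1 grammar rather than from the wording above.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"
WHITESPACE = " \t\n\r\f"
NEWLINES = "\n\r\f"
REPLACEMENT = "\ufffd"


def unescape_css_string(body: str) -> str:
    """Decode backslash escapes in the body of a CSS string.

    Hypothetical helper: `body` is the text between the quotes.
    """
    out = []
    i = 0
    n = len(body)
    while i < n:
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue

        # Backslash immediately followed by end of input:
        # a normal character, not an escape.
        if i + 1 == n:
            out.append("\\")
            i += 1
            continue

        nxt = body[i + 1]

        # Rule 1: inside a string, backslash-newline is dropped entirely.
        if nxt in NEWLINES:
            i += 2
            # Assumption: CR LF counts as one newline.
            if nxt == "\r" and i < n and body[i] == "\n":
                i += 1
            continue

        # Rule 2: one to six hexadecimal digits name a code point.
        if nxt in HEX_DIGITS:
            j = i + 1
            while j < n and j < i + 7 and body[j] in HEX_DIGITS:
                j += 1
            code = int(body[i + 1:j], 16)
            # U+0000 is undefined and out-of-range values "may be"
            # replaced; this sketch maps both to U+FFFD (an assumption).
            out.append(chr(code) if 0 < code <= 0x10FFFF else REPLACEMENT)
            # One (and only one) white space character is ignored.
            if j < n and body[j] in WHITESPACE:
                if body[j] == "\r" and j + 1 < n and body[j + 1] == "\n":
                    j += 1
                j += 1
            i = j
            continue

        # Rule 3: any other character is taken literally.
        out.append(nxt)
        i += 2
    return "".join(out)
```

For the examples in the text, unescape_css_string(r"\26 B") and unescape_css_string(r"\000026B") both give "&B", while unescape_css_string(r"\26B") gives the single character U+026B. Identifiers would need the same escape handling but a different treatment of backslash-newline, which this sketch does not cover.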
Attachments
- text/x-patch attachment: backslash-proposal.diff
Received on Wednesday, 24 February 2010 00:36:53 UTC