- From: Richard Ishida <ishida@w3.org>
- Date: Tue, 2 Dec 2003 14:53:57 -0000
- To: "'Etan Wexler (by way of Martin Duerst <duerst@w3.org>)'" <etanwexler@comcast.net>, <www-international@w3.org>, <w3c-css-wg@w3.org>, <w3c-i18n-ig@w3.org>
Etan,
Many thanks for this clear exposé.
I wonder whether CSS can introduce a change to CSS2.1 at this stage to
clarify that the BOM - particularly any UTF-8 signature - should not be
considered part of the following text.
Comment from CSS WG welcome.
RI
============
Richard Ishida
W3C
contact info: http://www.w3.org/People/Ishida/
http://www.w3.org/International/
http://www.w3.org/International/geo/
W3C Internationalization FAQs
http://www.w3.org/International/questions.html
RSS feed: http://www.w3.org/International/questions.rss
> -----Original Message-----
> From: www-international-request@w3.org
> [mailto:www-international-request@w3.org] On Behalf Of Etan
> Wexler (by way of Martin Duerst <duerst@w3.org>)
> Sent: 29 November 2003 14:15
> To: www-international@w3.org
> Subject: Re: UTF-8 signature / BOM in CSS
>
> Richard Ishida wrote to <mailto:www-international@w3.org> on
> 27 November
> 2003 in "New test page: UTF-8 signature / BOM"
> (<mid:004c01c3b4fe$6e7414e0$6601a8c0@w3c40upc3ma3j2>):
>
> > [ Note that I've also seen the first line or so of external CSS style
> > sheets fail if a utf-8 signature is present. If I can remember how to
> > replicate the failure, I'll write another test file to cover that. ]
>
> The effect that you observe with Cascading Style Sheets is
> not a failure
> according to the CSS2 Recommendation. In short, the byte
> order mark (U+FEFF
> zero width no-break space) counts as an identifier component.
>
> CSS level 2 specifies that any character from U+00A1 to
> U+FFFFFF can appear
> bare in an identifier or starting an identifier [CAC2]. Level
> 2.1 (a work
> in progress as of 27 November 2003) has the same allowance
> [CAC21]. Suppose
> I have a single-ruleset style sheet:
>
> td { padding: 1ex; }
>
> Now suppose that my CSS editor prepends a BOM to the style
> sheet. According
> to specification, the effect should be the same as if the
> style sheet were:
>
> \FEFFtd { padding: 1ex; }
>
> In other words, the CSS engine has a selector that matches
> against any
> element whose element-type name is the sequence
>
> U+FEFF, U+0074, U+0064.
>
> The selector must not match against "td" elements.
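
The reading above can be checked with a small sketch. It is not a CSS
tokenizer: is_ident_char and first_identifier are hypothetical helpers, and
the identifier rule is simplified to ASCII letters, digits, hyphen, or any
code point at or above U+00A1. It only shows that a UTF-8 signature survives
a plain UTF-8 decode as U+FEFF and then joins the first identifier.

```python
import codecs


def is_ident_char(ch: str) -> bool:
    """Simplified identifier test (assumption, not the full CSS2 grammar):
    ASCII letters, digits, hyphen, or any code point >= U+00A1."""
    return (ch.isascii() and (ch.isalnum() or ch == "-")) or ord(ch) >= 0xA1


def first_identifier(text: str) -> str:
    """Return the leading run of identifier characters in text."""
    out = []
    for ch in text:
        if not is_ident_char(ch):
            break
        out.append(ch)
    return "".join(out)


sheet = "td { padding: 1ex; }"

# Simulate an editor that prepends a UTF-8 signature (bytes EF BB BF).
raw = codecs.BOM_UTF8 + sheet.encode("utf-8")

# The plain "utf-8" codec keeps the signature as the character U+FEFF.
decoded = raw.decode("utf-8")

selector = first_identifier(decoded)
print([f"U+{ord(c):04X}" for c in selector])
# prints ['U+FEFF', 'U+0074', 'U+0064']
```

Under these assumptions the first identifier is three characters long, so it
cannot equal the two-character element-type name "td".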
>
> The syntax module in level 3 (a work in progress as of 27
> November 2003)
> [SYN3] is adapting to the times by allowing an initial U+FEFF as an
> encoding signature rather than as an identifier character:
>
> "A byte order mark (BOM), as described in section 2.7 of
> [UNICODE310], that
> begins the sequence of characters should not be considered,
> for purposes of
> applying the grammar below, as a part of the style sheet."
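
One way to read the quoted wording is as a pre-processing step that runs
before the grammar is applied. A minimal sketch, assuming the style sheet has
already been decoded to a string; strip_leading_bom is a hypothetical name,
not anything defined by the draft:

```python
BOM = "\ufeff"  # U+FEFF, zero width no-break space / byte order mark


def strip_leading_bom(text: str) -> str:
    """Drop a single U+FEFF at the very start of the decoded style sheet.

    Only the initial character is treated as a signature; a U+FEFF that
    appears anywhere else is left untouched.
    """
    return text[1:] if text.startswith(BOM) else text


with_bom = BOM + "td { padding: 1ex; }"
print(strip_leading_bom(with_bom))  # prints td { padding: 1ex; }

# For UTF-8 input specifically, Python's "utf-8-sig" codec removes the
# signature while decoding, which has the same effect for this case.
print(b"\xef\xbb\xbftd { padding: 1ex; }".decode("utf-8-sig"))
# prints td { padding: 1ex; }
```

Stripping only the first character keeps the change narrow: the signature
case is handled, and nothing else about identifier characters is altered.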
>
> CSS level 1 didn't allow U+FEFF to appear in style sheets
> (although its
> representation through numeric escapes was permitted) [SYN1]. This is
> mostly a historical footnote; CSS level 1, although officially a
> Recommendation, has the effective status of a superseded Candidate
> Recommendation.
>
> [CAC2]
> Bert Bos; Håkon Wium Lie; Chris Lilley; Ian Jacobs.
> "Characters and case", section 4.1.3 of CSS level 2
> specification. W3C Recommendation. 12 May 1998.
> <http://www.w3.org/TR/REC-CSS2/syndata.html#q4>.
>
> [CAC21]
> Bert Bos; Tantek Çelik; Ian Hickson; Håkon Wium Lie.
> "Characters and case", section 4.1.3 of CSS level 2.1 specification. W3C
> Working Draft. 15 September 2003.
> <http://www.w3.org/TR/2003/WD-CSS21-20030915/syndata.html#q6>.
>
> [SYN3]
> L. David Baron, editor.
> "CSS style sheet representation", section 3 of CSS3 syntax module. W3C
> Working Draft. 13 August 2003.
> <http://www.w3.org/TR/2003/WD-css3-syntax-20030813/#css-style>.
>
> [SYN1]
> Håkon Wium Lie; Bert Bos.
> "CSS1 grammar", Appendix B of revised CSS1 specification.
> W3C Recommendation. 11 January 1999.
> <http://www.w3.org/TR/REC-CSS1#appendix-b>.
Received on Tuesday, 2 December 2003 09:53:59 UTC