- From: Richard Ishida <ishida@w3.org>
- Date: Tue, 2 Dec 2003 14:53:57 -0000
- To: "'Etan Wexler (by way of Martin Duerst <duerst@w3.org>)'" <etanwexler@comcast.net>, <www-international@w3.org>, <w3c-css-wg@w3.org>, <w3c-i18n-ig@w3.org>
Etan,

Many thanks for this clear exposé.

I wonder whether CSS can introduce a change to CSS2.1 at this stage to clarify that the BOM - particularly any UTF-8 signature - should not be considered part of the following text.

Comment from CSS WG welcome.

RI

============
Richard Ishida
W3C

contact info:
http://www.w3.org/People/Ishida/

http://www.w3.org/International/
http://www.w3.org/International/geo/

W3C Internationalization FAQs
http://www.w3.org/International/questions.html
RSS feed: http://www.w3.org/International/questions.rss

> -----Original Message-----
> From: www-international-request@w3.org
> [mailto:www-international-request@w3.org] On Behalf Of Etan
> Wexler (by way of Martin Duerst <duerst@w3.org>)
> Sent: 29 November 2003 14:15
> To: www-international@w3.org
> Subject: Re: UTF-8 signature / BOM in CSS
>
> Richard Ishida wrote to <mailto:www-international@w3.org> on
> 27 November 2003 in "New test page: UTF-8 signature / BOM"
> (<mid:004c01c3b4fe$6e7414e0$6601a8c0@w3c40upc3ma3j2>):
>
> >[ Note that I've also seen the first line or so of external CSS style
> >sheets fail if a utf-8 signature is present. If I can remember how to
> >replicate the failure, I'll write another test file to cover that. ]
>
> The effect that you observe with Cascading Style Sheets is not a
> failure according to the CSS2 Recommendation. In short, the byte order
> mark (U+FEFF zero width no-break space) counts as an identifier
> component.
>
> CSS level 2 specifies that any character from U+00A1 to U+FFFFFF can
> appear bare in an identifier or starting an identifier [CAC2]. Level
> 2.1 (a work in progress as of 27 November 2003) has the same allowance
> [CAC21]. Suppose I have a single-ruleset style sheet:
>
>     td { padding: 1ex; }
>
> Now suppose that my CSS editor prepends a BOM to the style sheet.
> According to specification, the effect should be the same as if the
> style sheet were:
>
>     \FEFFtd { padding: 1ex; }
>
> In other words, the CSS engine has a selector that matches against any
> element whose element-type name is the sequence
>
>     U+FEFF, U+0074, U+0064.
>
> The selector must not match against "td" elements.
>
> The syntax module in level 3 (a work in progress as of 27 November
> 2003) [SYN3] is adapting to the times by allowing an initial U+FEFF as
> an encoding signature rather than as an identifier character:
>
> "A byte order mark (BOM), as described in section 2.7 of [UNICODE310],
> that begins the sequence of characters should not be considered, for
> purposes of applying the grammar below, as a part of the style sheet."
>
> CSS level 1 didn't allow U+FEFF to appear in style sheets (although its
> representation through numeric escapes was permitted) [SYN1]. This is
> mostly a historical footnote; CSS level 1, although officially a
> Recommendation, has the effective status of a superseded Candidate
> Recommendation.
>
> [CAC2]
> Bert Bos; Håkon Wium Lie; Chris Lilley; Ian Jacobs. "Characters and
> case", section 4.1.3 of CSS level 2 specification. W3C Recommendation.
> 12 May 1998. <http://www.w3.org/TR/REC-CSS2/syndata.html#q4>.
>
> [CAC21]
> Bert Bos; Tantek Çelik; Ian Hickson; Håkon Wium Lie. "Characters and
> case", section 4.1.3 of CSS level 2.1 specification. W3C Working Draft.
> 15 September 2003.
> <http://www.w3.org/TR/2003/WD-CSS21-20030915/syndata.html#q6>.
>
> [SYN3]
> L. David Baron, editor. "CSS style sheet representation", section 3 of
> CSS3 syntax module. W3C Working Draft. 13 August 2003.
> <http://www.w3.org/TR/2003/WD-css3-syntax-20030813/#css-style>.
>
> [SYN1]
> Håkon Wium Lie; Bert Bos. "CSS1 grammar", Appendix B of revised CSS1
> specification. W3C Recommendation. 11 January 1999.
> <http://www.w3.org/TR/REC-CSS1#appendix-b>.
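
For illustration only, here is a minimal Python sketch of the behaviour discussed in the message above. It is not taken from any CSS implementation: the helper names `strip_leading_bom` and `first_selector` are hypothetical, and `first_selector` is a deliberately naive stand-in for a real tokenizer. It shows how a leading U+FEFF ends up inside the first selector when it is treated as ordinary style sheet text, and how discarding a single initial U+FEFF (the approach quoted from the CSS3 syntax draft) restores the intended `td` selector.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark / UTF-8 signature once decoded


def strip_leading_bom(css_text: str) -> str:
    """Drop one leading U+FEFF, if present (hypothetical helper)."""
    return css_text[1:] if css_text.startswith(BOM) else css_text


def first_selector(css_text: str) -> str:
    """Naive stand-in for a tokenizer: the text before the first '{',
    with CSS whitespace trimmed. U+FEFF is not trimmed."""
    return css_text.split("{", 1)[0].strip(" \t\r\n\f")


def code_points(text: str) -> list[str]:
    """Render each character as a U+XXXX label."""
    return [f"U+{ord(ch):04X}" for ch in text]


# A style sheet as an editor might save it: UTF-8 signature, then the rule.
raw = b"\xef\xbb\xbftd { padding: 1ex; }".decode("utf-8")

# BOM kept as part of the text: the selector is U+FEFF, U+0074, U+0064.
print(code_points(first_selector(raw)))

# BOM discarded before parsing: the selector is plain "td".
print(code_points(first_selector(strip_leading_bom(raw))))
```

Decoding the bytes with Python's `utf-8-sig` codec instead of `utf-8` would discard the signature during decoding, which has the same effect as `strip_leading_bom` for UTF-8 input.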
Received on Tuesday, 2 December 2003 09:53:59 UTC