- From: Frank Yung-Fong Tang <franktang@gmail.com>
- Date: Wed, 12 Oct 2005 17:57:31 -0700
- To: Deborah Cawkwell <deborah.cawkwell@bbc.co.uk>
- Cc: franktang@gmail.com, Richard Ishida <ishida@w3.org>, www-international@w3.org, member-i18n-geo@w3.org
- Message-ID: <2e4dfd690510121757n763f5b9er@mail.gmail.com>
The issue in CSS

CSS1
http://www.w3.org/TR/CSS1
"
The following is the tokenizer, written in flex [16]
<http://www.w3.org/TR/CSS1#ref16> notation. Note that this assumes an
8-bit implementation of flex. The tokenizer is case-insensitive (flex
command line option -i).

unicode    \\[0-9a-f]{1,4}
"

CSS2
http://www.w3.org/TR/CSS21/syndata.html#q6
"
Third, backslash escapes allow authors to refer to characters they can't
easily put in a document. In this case, the backslash is followed by at
most six hexadecimal digits (0..9A..F), which stand for the ISO 10646
([ISO10646] <http://www.w3.org/TR/CSS21/refs.html#ref-ISO10646>)
character with that number, which must not be zero. (It is undefined in
CSS 2.1 what happens if a style sheet *does* contain a zero.) If a
character in the range [0-9a-fA-F] follows the hexadecimal number, the
end of the number needs to be made clear. There are two ways to do that:

   1. with a space (or other whitespace character): "\26 B" ("&B"). In
      this case, user agents should treat a "CR/LF" pair (U+000D/U+000A)
      as a single whitespace character.
   2. by providing exactly 6 hexadecimal digits: "\000026B" ("&B")
"
and also

unicode    \\[0-9a-f]{1,6}(\r\n|[ \n\r\t\f])?

The CSS \ escaping is tricky because CSS1 does not treat a following ' '
as a terminator, but CSS2 does, and needs one whenever the escape has
fewer than 6 digits and the next character is a hex digit.

So it becomes very tricky to write
U+4E00 + 'a' + U+0043 + 'b'

1. \4e00 a\43 b
and
2. \004e00a\000043b

both represent the 4 characters U+4E00 + 'a' + U+0043 + 'b' in CSS2,
but
1 represents U+4E00 + ' ' + 'a' + U+0043 + ' ' + 'b' in CSS1
and
2 represents U+004E + '0' + '0' + 'a' + U+0000 + '4' + '3' + 'b' in CSS1
(the CSS1 rule stops after 4 hex digits, so \004e is U+004E, not U+4E00).

And in CSS2, what does
\04e00a\00043b
represent?
Also, what does
\004E00a\43B
represent? (Notice I changed the e to E and the b to B.)

It is tricky, right?

2005/10/12, Deborah Cawkwell <deborah.cawkwell@bbc.co.uk>:
>
>
> Hi Frank
>
> Could you clarify: we're not sure what problem you refer to.
> Possibly:
>
> - if you change encoding of your HTML, you should ensure no knock ons with
>   other files
> or
> - class defined in another language
> or
> - something else?
>
> Many thanks
>
> Deborah
>
>
> -----Original Message-----
> From: www-international-request@w3.org on behalf of Frank Yung-Fong Tang
> Sent: Tue 8/23/2005 20:02
> To: Richard Ishida
> Cc: www-international@w3.org
> Subject: Re: New article for REVIEW: Upgrading from language-specific
> legacy encoding to Unicode encoding
>
>
> I think you should mention not only charset with HTML, but also issue
> with CSS and separate JavaScript file. The issue with \ unicode in CSS
> is quite tricky.
>
> Richard Ishida wrote on 8/23/2005, 1:45 PM:
>
> > Title: Upgrading from language-specific legacy encoding to Unicode
> > encoding
> > http://www.w3.org/International/questions/qa-utf8-upgrade.html
> >
> > Comments are being sought on this article prior to final release.
> > Please send any comments to www-international@w3.org. We expect to
> > publish a final version in one to three weeks.
> >
> > This article provides an answer to the question: What should I
> > consider when upgrading my web pages from legacy encoding to Unicode
> > encoding?
> >
> > ============
> > Richard Ishida
> > W3C
> >
> > contact info:
> > http://www.w3.org/People/Ishida/
> >
> > W3C Internationalization:
> > http://www.w3.org/International/
> >
> > Publication blog:
> > http://people.w3.org/rishida/blog/
>
> http://www.bbc.co.uk/
>
> This e-mail (and any attachments) is confidential and may contain
> personal views which are not the views of the BBC unless specifically
> stated.
> If you have received it in error, please delete it from your system.
> Do not use, copy or disclose the information in any way nor act in
> reliance on it and notify the sender immediately. Please note that the
> BBC monitors e-mails sent or received.
> Further communication will signify your consent to this.
>

--
Frank Yung-Fong Tang
譚永鋒
Îñţérñåţîöñåļîžåţîöñ
FrankTang@gmail.com
Skype: FrankYungFongTang
Yahoo IM: FrankYungFongTan
MSN IM: FrankYungFongTang@hotmail.com
Received on Thursday, 13 October 2005 00:57:56 UTC
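
The escape readings discussed in the message above can be checked
mechanically. What follows is only a minimal sketch in Python, not a CSS
tokenizer: it applies the two "unicode" rules quoted from the CSS1 and
CSS 2.1 grammars as regular expressions and treats every other character
literally. The names CSS1_ESCAPE, CSS21_ESCAPE, decode and show are
made up for this illustration and do not come from either specification.

```python
import re

# CSS1 rule as quoted: backslash + 1 to 4 hex digits; nothing after the
# digits is consumed. Case-insensitive, like flex -i.
CSS1_ESCAPE = re.compile(r"\\([0-9a-f]{1,4})", re.IGNORECASE)

# CSS 2.1 rule as quoted: backslash + 1 to 6 hex digits, then one optional
# whitespace character (a CR/LF pair counts as one) that is swallowed.
CSS21_ESCAPE = re.compile(r"\\([0-9a-f]{1,6})(?:\r\n|[ \n\r\t\f])?",
                          re.IGNORECASE)


def decode(text, pattern):
    """Return the list of code points in `text` after resolving escapes."""
    code_points = []
    pos = 0
    while pos < len(text):
        match = pattern.match(text, pos)
        if match:
            # Greedy match: as many hex digits as the rule allows.
            code_points.append(int(match.group(1), 16))
            pos = match.end()
        else:
            # Any other character stands for itself.
            code_points.append(ord(text[pos]))
            pos += 1
    return code_points


def show(text, pattern):
    """Format the decoded code points as 'U+XXXX U+XXXX ...'."""
    return " ".join(f"U+{cp:04X}" for cp in decode(text, pattern))


if __name__ == "__main__":
    samples = [r"\4e00 a\43 b", r"\004e00a\000043b",
               r"\04e00a\00043b", r"\004E00a\43B"]
    for sample in samples:
        print(sample)
        print("  CSS1:   ", show(sample, CSS1_ESCAPE))
        print("  CSS 2.1:", show(sample, CSS21_ESCAPE))
```

Under these two rules, \04e00a\00043b reads in CSS 2.1 as two characters,
U+4E00A U+043B, because the hex digits 'a' and 'b' are absorbed into the
escapes. \004E00a\43B reads as U+4E00 U+0061 U+043B: the first escape
already has 6 digits, but the trailing B is absorbed into \43.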