- From: Jungshik Shin <jshin@i18nl10n.com>
- Date: Tue, 24 Feb 2004 05:46:27 +0900 (KST)
- To: www-style@w3.org
On Mon, 23 Feb 2004, Chris Lilley wrote:

> On Monday, February 23, 2004, 6:32:06 PM, Boris wrote:
>
> >> people who aren't clued about character encodings are more likely to
> >> serve style sheets that work if treated as windows-1252 than to serve
> >> UTF-8.
>
> BZ> Only in Western Europe.
>
> Only in those parts of Western Europe that don't speak Greek or
> Turkish and don't use Macs.

I didn't know 'Western Europe' is that large ;-)

> >> Also, for HTML browsers tend to default to windows-1252 regardless of the
> >> specs.
>
> BZ> What gave you this idea? Again, only in Western Europe, even if true (which I
> BZ> do not believe it is).
>
> I gather that some browsers treat 8859-1 as CP-1252 to catch the pages
> which are actually CP-1252 but mislabelled as 8859-1.

No, Boris wasn't talking about that (ISO-8859-1 vs Windows-1252).
He meant that Japanese users set the default encoding in their browser
to Shift_JIS or EUC-JP, Korean users set it to EUC-KR, Greek users
set it to ISO-8859-7 (or its Windows codepage extension), etc.

> >> Using this heuristic also in case 3 instead of looking at the linking
> >> document would improve the cacheability of parsed style sheets with
> >> negligible actual breakage.
>
> BZ> Using this instead of looking at the linking document will break
> BZ> Japanese pages that use Shift_JIS and Japanese classnames and
> BZ> don't specify the encoding (lots and lots of those). In fact, such
> BZ> pages were the reason Mozilla added the "look at the linking
> BZ> document" thing, if I recall correctly....
>
> Interesting. Of course, HTML browsers for Japanese speakers are set to
> autodetect among the few encodings used by Japanese language material
> (so they get, for example, 8859-1 pages all wrong) because the HTML

Well, some browsers have a 'universal' encoding detector in addition to
language/script-specific encoding detectors.

> files are typically served without any encoding information, too.

I wonder how typical is typical. Do you have any hard numbers? I don't
think it's that bad. I usually turn off the auto-detection, with the
default encoding set to EUC-KR, and I rarely have to override the
encoding manually. Well, my web browsing is mostly limited to English
and Korean...

> So the CSS file gets set based on the encoding of a document, which
> was set by sniffing the byte stream and looking for characteristic
> patterns and byte frequencies.

Auto-detection is just one of several methods by which the encoding
of a document can be determined. Users can manually set the encoding
of a linking document, and the result is propagated to linked-in documents.

> It would also be nice if the algorithm for XML and the algorithm for
> CSS were identical except for s/encoding declaration/@charset/g
>
> http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-guessing

I can't agree with you more on this.

Jungshik
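
To make the fallback being discussed concrete, here is a minimal Python sketch of how a style sheet's encoding might be chosen. The function name `sheet_encoding`, its parameters, and the exact priority order (HTTP charset, then BOM or @charset, then the linking document's encoding, then a default) are assumptions made for illustration. They are not taken from any browser's source or quoted from a spec.

```python
import re

def sheet_encoding(http_charset, sheet_bytes, linking_doc_encoding,
                   fallback="utf-8"):
    """Hypothetical sketch: pick an encoding for a linked style sheet."""
    # 1. Encoding information supplied by the transport (HTTP charset).
    if http_charset:
        return http_charset
    # 2. In-band information: a UTF-8 byte order mark or an @charset rule
    #    at the very start of the sheet.
    if sheet_bytes.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    m = re.match(rb'@charset "([A-Za-z0-9._:-]+)";', sheet_bytes)
    if m:
        return m.group(1).decode("ascii")
    # 3. The encoding of the linking document, however it was determined
    #    (declared, auto-detected, or set manually by the user).
    if linking_doc_encoding:
        return linking_doc_encoding
    # 4. Last resort (assumed default for this sketch).
    return fallback

# An unlabelled Shift_JIS sheet with a Japanese class name, linked from a
# Shift_JIS page: step 3 is what lets the selector survive decoding.
css = ".見出し { color: red }".encode("shift_jis")
enc = sheet_encoding(None, css, "shift_jis")
selector_ok = css.decode(enc).startswith(".見出し")

# Treating the same bytes as windows-1252 yields a different selector.
garbled = css.decode("cp1252", errors="replace")
```

With step 3 removed and a fixed windows-1252 guess in its place, the class name in `css` would no longer match the one in the linking page, which is the breakage Boris describes.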
Received on Monday, 23 February 2004 15:46:29 UTC