- From: Glenn Maynard <glenn@zewt.org>
- Date: Fri, 9 Dec 2011 21:09:51 -0500
- To: Philip Jägenstedt <philipj@opera.com>
- Cc: public-texttracks@w3.org
- Message-ID: <CABirCh8qjwywiLi8pA9qBfETAi8qPTh5p1wY62Nze2pWsyTKyw@mail.gmail.com>
On Fri, Dec 9, 2011 at 7:45 AM, Philip Jägenstedt <philipj@opera.com> wrote:

>> Firefox apparently defaults to Japanese for UTF-8 CJK when the language
>> isn't specified. It doesn't do any complex heuristics, and it doesn't
>> depend on the user's locale. This seems like the optimal solution--no
>> heuristics (making it more predictable for users), and has none of the
>> locale-specific behavior that plague charsets.
>
> That doesn't sound very good at all for unlabeled simplified or
> traditional Chinese.

If you mean "for existing content without @lang", then that's what I
thought, too. I had assumed that legacy content would prevent browsers
from doing this, since IE uses the locale by default. But since Firefox
has been getting away with this for a long time, it's worth
investigating why this is working out for them. If other browsers can
follow suit, then it's the ideal solution: it's simple, consistent, and
encourages the use of @lang where the default isn't what people want.

> I'm not aware of any locale specific stuff here, but I think that the
> character encoding plays into this somehow, such that content served as GBK
> or Big5 is more likely to be considered to be simplified and traditional
> Chinese respectively.

Right, most browsers do that. I'm only looking at UTF-8 content here,
where there's no language hint implicit in the encoding.

-- 
Glenn Maynard
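The fallback order discussed above can be sketched roughly as follows. This is only an illustration of the behavior as described in this thread, not code from any browser: the function name `resolve_cjk_language`, the hint table, and the language tags are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical table: legacy encodings that imply a language, per the
# thread (GBK -> simplified Chinese, Big5 -> traditional Chinese).
ENCODING_LANGUAGE_HINTS = {
    "gbk": "zh-Hans",
    "big5": "zh-Hant",
}

# Assumed fixed default for unlabeled UTF-8 CJK content, mirroring the
# Firefox behavior described above (no heuristics, no locale lookup).
UTF8_CJK_DEFAULT = "ja"


def resolve_cjk_language(lang: Optional[str], encoding: str) -> str:
    """Pick the language used to render CJK text.

    1. An explicit @lang always wins.
    2. Otherwise, a legacy encoding may carry an implicit hint.
    3. Otherwise (e.g. UTF-8), fall back to one fixed default.
    """
    if lang:
        return lang
    hint = ENCODING_LANGUAGE_HINTS.get(encoding.strip().lower())
    if hint is not None:
        return hint
    return UTF8_CJK_DEFAULT
```

The user's locale never enters into it, so the same unlabeled UTF-8 page resolves the same way for every user, and authors who want Chinese rendering have to set @lang.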
Received on Saturday, 10 December 2011 02:10:23 UTC