- From: Andrew Cunningham <lang.support@gmail.com>
- Date: Tue, 28 Jan 2014 20:52:28 +1100
- To: Henri Sivonen <hsivonen@hsivonen.fi>
- Cc: WWW International <www-international@w3.org>, Asmus Freytag <asmusf@ix.netcom.com>
- Message-ID: <CAGJ7U-VxopW7Lerj40WP_JupO8NbRSeDX7L+0BQ4CnEVUd+oSw@mail.gmail.com>
Henri,

Your point is valid, but there is a flip side: the specs should also
require browser developers to implement correct rendering of those
codepoints in cases where there is full Unicode support for a language.

Passing off the problem as a need to provide guidance to web developers
to use Unicode ignores the fact that browser developers also need to get
it right. You need a complete ecosystem: browsers, web services,
infrastructure and content.

And honestly, it's the web browsers that are the weak link in the UTF-8
cycle.

A.

On 28/01/2014 8:42 PM, "Henri Sivonen" <hsivonen@hsivonen.fi> wrote:

> On Mon, Jan 27, 2014 at 7:44 PM, Asmus Freytag <asmusf@ix.netcom.com> wrote:
> > On 1/27/2014 3:20 AM, Henri Sivonen wrote:
> > Font and keyboard bindings to PUA are in principle possible, but in
> > practice the "fake Latin-1" is, well, more "practical".
> ...
> > If we just accept that this is a usage pattern that just "is" and
> > won't go away and that therefore 8-bit encodings exist (whether
> > correctly declared or not), then we could get back to the topic of
> > what level of support and documentation for non-UTF-8 is appropriate.
>
> I covered this in my first post to this thread, but others wanted to
> focus on an associated remark being "harsh". If you are doing a
> Latin-1 hijack anyway, you should use the label windows-1252 instead
> of trying to be cute with x-user-defined, which works differently in
> different contexts, or some other label that browsers don't even
> recognize and that depends on the browser falling back to windows-1252
> for bogus labels, which isn't true for all browser configurations
> (localizations).
>
> As for whether opining against Latin-1 hijacks is "harsh", sure you
> can always come up with an example of something rare that is not
> covered by Unicode, but at least my recent encounters of this pattern
> have involved scripts that have been in Unicode since Unicode 1.0. I
> think this group should, in the advice it gives to authors, first and
> foremost tell writers of those scripts that are in Unicode to use the
> proper Unicode code points in UTF-8 labeled as UTF-8. We are way past
> the point where it makes sense to devote a lot of advice to practicing
> the Unicode avoidance tricks of the past.
>
> --
> Henri Sivonen
> hsivonen@hsivonen.fi
> https://hsivonen.fi/
Received on Tuesday, 28 January 2014 09:52:55 UTC