Re: Encoding: Referring people to a list of labels from Andrew Cunningham on 2014-01-28 (www-international@w3.org from January to March 2014)

From: Andrew Cunningham <lang.support@gmail.com>
Date: Tue, 28 Jan 2014 20:52:28 +1100
To: Henri Sivonen <hsivonen@hsivonen.fi>
Cc: WWW International <www-international@w3.org>, Asmus Freytag <asmusf@ix.netcom.com>
Message-ID: <CAGJ7U-VxopW7Lerj40WP_JupO8NbRSeDX7L+0BQ4CnEVUd+oSw@mail.gmail.com>

Henri,

Your point is valid, but there is a flip side: the specs should also
require browser developers to implement correct rendering of those
code points, at least in those cases where there is full Unicode support
for a language.

Passing off the problem as merely a need to give web developers guidance
on using Unicode ignores the fact that browser developers also need to get
it right.

You need a complete ecosystem: browsers, web services, infrastructure,
and content.

And honestly, it's the web browsers that are the weak link in the UTF-8
cycle.

A.
On 28/01/2014 8:42 PM, "Henri Sivonen" <hsivonen@hsivonen.fi> wrote:

> On Mon, Jan 27, 2014 at 7:44 PM, Asmus Freytag <asmusf@ix.netcom.com>
> wrote:
> > On 1/27/2014 3:20 AM, Henri Sivonen wrote:
> > Font and keyboard
> > bindings to PUA are in principle possible, but in practice the "fake
> > Latin-1" is, well, more "practical".
> ...
> > If we just accept that this is a usage pattern that just "is" and won't
> > go away and that therefore 8-bit encodings exist (whether correctly
> > declared or not), then we could get back to the topic of what level of
> > support and documentation for non-UTF-8 is appropriate.
>
> I covered this in my first post to this thread, but others wanted to
> focus on an associated remark being "harsh". If you are doing a
> Latin-1 hijack anyway, you should use the label windows-1252 instead
> of trying to be cute with x-user-defined, which works differently in
> different contexts, or some other label that browsers don't even
> recognize and that depends on the browser falling back to windows-1252
> for bogus labels, which isn't true for all browser configurations
> (localizations).
>
> As for whether opining against Latin-1 hijacks is "harsh", sure you
> can always come up with an example of something rare that is not
> covered by Unicode, but at least my recent encounters of this pattern
> have involved scripts that have been in Unicode since Unicode 1.0. I
> think this group should, in the advice it gives to authors, first and
> foremost tell writers of those scripts that are in Unicode to use the
> proper Unicode code points in UTF-8 labeled as UTF-8. We are way past
> the point where it makes sense to devote a lot of advice to practicing the
> Unicode avoidance tricks of the past.
>
> --
> Henri Sivonen
> hsivonen@hsivonen.fi
> https://hsivonen.fi/
>

Received on Tuesday, 28 January 2014 09:52:55 UTC