Re: Encoding: Referring people to a list of labels

On Mon, Jan 27, 2014 at 7:44 PM, Asmus Freytag <asmusf@ix.netcom.com> wrote:
> On 1/27/2014 3:20 AM, Henri Sivonen wrote:
> Font and keyboard
> bindings to PUA are in principle possible, but in practice the "fake
> Latin-1" is, well, more "practical".
...
> If we just accept that this is a usage pattern that just "is" and won't go
> away and that therefore 8-bit encodings exist (whether correctly declared or
> not), then we could get back to the topic of what level of support and
> documentation for non-UTF-8 is appropriate.

I covered this in my first post to this thread, but others wanted to
focus on an associated remark being "harsh". If you are doing a
Latin-1 hijack anyway, you should use the label windows-1252 instead
of trying to be cute with x-user-defined, which works differently in
different contexts, or some other label that browsers don't even
recognize and that depends on the browser falling back to windows-1252
for bogus labels, which isn't true for all browser configurations
(localizations).
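
As a rough illustration of why the label matters, here is a minimal Python
sketch that decodes the same byte under windows-1252 and under
x-user-defined as the Encoding Standard defines it (ASCII bytes map to
themselves, bytes 0x80..0xFF map to U+F780..U+F7FF). The helper name
decode_x_user_defined is hypothetical, and Python's cp1252 codec is only a
stand-in for the windows-1252 label: it differs from the Encoding Standard
for a few unassigned bytes such as 0x81.

```python
def decode_x_user_defined(data: bytes) -> str:
    """Hypothetical helper: decode like the Encoding Standard's
    x-user-defined decoder.

    ASCII bytes map to themselves; bytes 0x80..0xFF map into the
    Private Use Area at U+F780..U+F7FF.
    """
    return "".join(
        chr(b) if b < 0x80 else chr(0xF780 + b - 0x80)
        for b in data
    )


raw = b"\xe9"  # a single non-ASCII byte, as a "fake Latin-1" font would see it

# Python's cp1252 codec stands in for the windows-1252 label here.
as_1252 = raw.decode("cp1252")
as_user_defined = decode_x_user_defined(raw)

print(f"windows-1252:   U+{ord(as_1252):04X}")
print(f"x-user-defined: U+{ord(as_user_defined):04X}")
```

The same byte ends up as an ordinary Latin-1 range code point in one case
and as a Private Use Area code point in the other, so a font hack built
around one mapping would not see the code points it expects under the other.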

As for whether opining against Latin-1 hijacks is "harsh", sure you
can always come up with an example of something rare that is not
covered by Unicode, but at least my recent encounters with this pattern
have involved scripts that have been in Unicode since Unicode 1.0. I
think this group should, in the advice it gives to authors, first and
foremost tell writers of those scripts that are in Unicode to use the
proper Unicode code points in UTF-8 labeled as UTF-8. We are way past
the point where it makes sense to devote a lot of advice to practicing the
Unicode avoidance tricks of the past.

-- 
Henri Sivonen
hsivonen@hsivonen.fi
https://hsivonen.fi/

Received on Tuesday, 28 January 2014 09:42:44 UTC