Re: Encoding Standard (was: RE: Encoding API exceptions) from Anne van Kesteren on 2014-11-10 (www-international@w3.org from October to December 2014)

From: Anne van Kesteren <annevk@annevk.nl>
Date: Mon, 10 Nov 2014 10:26:16 +0100
To: Shawn Steele <Shawn.Steele@microsoft.com>
Cc: "www-international@w3.org" <www-international@w3.org>
Message-ID: <CADnb78io2+U7_WnUw90=o_zMShSUmQBTH0w=vutmqwpLVvERrw@mail.gmail.com>

On Sun, Nov 9, 2014 at 9:44 PM, Shawn Steele <Shawn.Steele@microsoft.com> wrote:
> Generally the content is created with text editors, from data stores, etc., that came from other systems, and not specifically for the web.

I.e. mostly Windows, though some IBM and NEC, and of course gb18030
(except for one double-byte sequence as indicated). Turns out that
browsers on e.g. Mac and Linux felt pressure to support not just the
encodings of the host OS, but also those of Windows. Over time
some cleanup happened, and the Encoding Standard is the result of what
we think is needed to support the web.

> I'm unaware of systems that convert from shift-jis to shift-jis for example.

I'm not sure I follow this example.

> In other words, if the definitions are incompatible with the behavior on the host OS (or wherever the data comes from), then there is likely to be corruption.

On the web, the data can come from anywhere. The host OS is not
relevant, as it can change over time.

> The solution is, of course, to use Unicode.

Quite.

-- 
https://annevankesteren.nl/

Received on Monday, 10 November 2014 09:26:42 UTC