Re: Encoding

From: Glenn Adams <glenn@skynav.com>
Date: Tue, 24 Feb 2015 15:06:39 -0700
Message-ID: <CACQ=j+dEK5MtO2EvOM6eDY60wpW5uUiVMiyDJPTbN8dc7NL9Uw@mail.gmail.com>
To: Shawn Steele <Shawn.Steele@microsoft.com>
Cc: "Phillips, Addison" <addison@lab126.com>, "www-international@w3.org" <www-international@w3.org>
On Tue, Feb 24, 2015 at 2:01 PM, Shawn Steele <Shawn.Steele@microsoft.com> wrote:

> I'm still struggling with the goals of the encoding work.
> https://encoding.spec.whatwg.org/

IMO, the reason for this specification is that the author had little
knowledge of character encoding, and used the exercise of writing a new
document as a way to acquire that knowledge, and, of course, to rewrite the
world of encodings from his own PoV.

I suppose the reason the author would give, however, is that it was
intended to document existing practice or best practice or something in
between. Again, one questions the authority of someone new to the subject
to do something of that sort.

That's just my two cents. Do not interpret my comments as an attack on the
author. I have a lot of respect for him. Just not on this subject.

> Everything except UTF-8 is legacy, which is good, and I get a desire to
> quantify the landscape, however I'm not sure what point is served by
> standardizing the tables.
> Either A) Existing content is already correct per an existing standard (in
> which case a link would suffice), or B) Existing content was encoded using
> slightly different tables.
> In the case of existing content, it probably "works" for whoever's using
> it, though there may be interoperability issues.  To correct that data,
> they need to move to UTF-8.  Adding yet another "perfect" mapping table
> only causes further fragmentation as people may attempt to convert to that.
> For example, HKSCS is rolled up to big-5, however historically there have
> been multiple font-hack PUA and real Unicode code point assignments for
> that space.  Which makes it hard to say that one mapping or another is
> "right" for that space.  It likely depends on actual data, how the
> application uses it, and what its dependencies are.  Worse, I can't even
> reliably detect the quirks of the system where data originated as it may be
> currently hosted on some other platform.
> Currently different vendors/platforms/systems have slightly different
> mappings.  Clearly that isn't desirable, however a "standard" would
> obviously break existing data for at least some of those
> vendors/platforms/systems.
> So, what does the WG expect to happen from this process?
> A) Do they expect users to correct data to the WG standard mappings?
> B) Do they expect applications (or users) to abandon previous behavior to
> the WG standard mappings?
> C) For either of these, what timeframe does the WG expect it to happen in?
> D) Does the WG expect that this problem will be "solved" as a result of
> this work?  (Solved == everything's codified so there is no more confusion.)
> Thanks,
> -Shawn
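
As an illustration of the point in the quoted message that different
vendors/platforms/systems use slightly different mappings, here is a minimal
Python sketch. It is not from the thread. The helper names decode_with_each
and mappings_disagree are hypothetical, and the codec names are simply the
ones Python's standard library happens to ship. It decodes the same bytes
under several labels and reports whether the results differ.

```python
from typing import Dict, List, Optional


def decode_with_each(data: bytes, codecs: List[str]) -> Dict[str, Optional[str]]:
    """Decode the same bytes under each named codec.

    A codec that rejects the bytes is recorded as None rather than raising,
    so that "cannot decode" also counts as a difference between tables.
    """
    results: Dict[str, Optional[str]] = {}
    for name in codecs:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


def mappings_disagree(data: bytes, codecs: List[str]) -> bool:
    """Return True if at least two codecs give different results for data."""
    decoded = decode_with_each(data, codecs)
    return len(set(decoded.values())) > 1


if __name__ == "__main__":
    # Byte 0x80 is a C1 control in ISO-8859-1 but the euro sign in
    # windows-1252, a well-known single-byte example of diverging tables.
    print(decode_with_each(b"\x80", ["latin-1", "cp1252"]))

    # The same check can be pointed at the Big5 family discussed above.
    # Which byte sequences differ depends on the tables each codec uses.
    print(mappings_disagree(b"abc", ["big5", "big5hkscs", "cp950"]))
```

A check like this only shows that two tables disagree on given bytes. It
cannot say which mapping is "right" for data whose originating system is
unknown, which is the difficulty raised in the quoted message.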
Received on Tuesday, 24 February 2015 22:07:26 UTC
