RE: [Encoding] false statement from Larry Masinter on 2014-07-01 (www-international@w3.org from July to September 2014)

From: Larry Masinter <masinter@adobe.com>
Date: Tue, 1 Jul 2014 17:59:33 +0000
To: Joshua Bell <jsbell@google.com>, John Cowan <cowan@mercury.ccil.org>
CC: Anne van Kesteren <annevk@annevk.nl>, Mark Davis ☕️ <mark@macchiato.com>, Asmus Freytag <asmusf@ix.netcom.com>, "www-international@w3.org" <www-international@w3.org>
Message-ID: <029f7d09c42a4f3e8da95bb9e6167624@BL2PR02MB307.namprd02.prod.outlook.com>

If you scope the override to the web, you need to address workflows of interoperability between web and non-web (web-based email and instant messaging clients, for example), where the non-web application really uses the IANA-registered values. I don’t think that’s a world we want to aim for. One Web, One Internet.

I think it’s better to supplant the IANA charset registry by providing something better – better for all.

I don’t think it’s really a feature to turn off the ability to register new charsets completely, even if registration is rare and of limited applicability. (More on that in a separate message.)

The information in this specification should be merged into the IANA charset registry and presented in a form that is at least as useful as this spec, and also at least as useful as the current registry. (A low bar on both counts; we could ask for more.)

Once that integration has been completed (i.e., the IANA charset registry notes all info as conveyed here), then this specification itself will be redundant.

There is some work to be done to modify the IANA charset registry, including getting IETF consensus to make these enhancements, and perhaps even needing changes to IANA’s charter. Substantial work, but something that could be contracted for. Perhaps by W3C, perhaps as part of the transfer of IANA to ICANN.

It may be that in the process we will want to revisit the W3C/WHATWG prioritization that makes browser-to-legacy-nonconforming-content interoperability a higher priority than browser-to-nonweb-application interoperability.

Larry
--
http://larry.masinter.net

From: Joshua Bell [mailto:jsbell@google.com]
Sent: Tuesday, July 01, 2014 10:06 AM
To: John Cowan
Cc: Anne van Kesteren; Mark Davis ☕️; Asmus Freytag; Larry Masinter; www-international@w3.org
Subject: Re: [Encoding] false statement

On Mon, Jun 30, 2014 at 12:02 PM, John Cowan <cowan@mercury.ccil.org> wrote:
Anne van Kesteren scripsit:

> # Historically many encodings had their names and labels (and sometimes
> # references to specifications) defined in the IANA Character Sets
> # registry.  This specification supplants that registry.

You are unsurprisingly[*] continuing to miss the point.  The issue is not
whether you say "supplants" or "makes obsolete", which are effectively
synonymous, but that you clarify the scope of the claim.  Wider concerns
exist than the behavior of a few Web browsers, and it is inappropriate,
to say the least, to use absolute language more fitted to the laws of
physics when describing what they do or should do.

Along the lines of the clarification Henri makes in https://www.w3.org/Bugs/Public/show_bug.cgi?id=23646#c36 it seems that the spec should be explicit that it describes the use of text encodings for the Web platform. The HTML spec itself uses the phrase "This specification defines a big part of the Web platform..." in the introduction.

How about:

>>>
While encodings have been defined by many diverse standards, implementations of the Web platform (i.e. Web browsers) have not always implemented them in the same way, have not always used the same labels, and often differ in dealing with undefined and former proprietary areas of encodings. This specification attempts to fill those gaps so that new Web platform implementations do not have to reverse engineer encoding implementations of the market leaders and existing implementations can converge.

In particular, this specification defines the encodings, their algorithms to go from bytes to code points and back, and their canonical names and identifying labels for the Web platform. This specification also defines an API to expose part of the encoding algorithms to JavaScript for the Web platform.

Historically encodings and their specifications (if any) were kept track of by the IANA Character Sets registry. This specification supplants the use of that registry for the Web platform.
<<<

That repeats "Web platform" what seems like an excessive number of times, but I believe it's important; I have a (poorly maintained) polyfill for the JS API and get frequent requests from non-Web platform users (e.g. the Node.js community) to make changes that are not aligned with the spec, and I have had to clarify the purpose and scope of the polyfill.
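
To make the scoping question concrete, here is a minimal sketch in Python of how a "Web platform" label lookup can differ from what a non-Web application does with the same label. The table is a small illustrative subset, not the spec's full label list, and the names `WEB_LABELS`, `PYTHON_CODECS`, `web_encoding_name` and `web_decode` are hypothetical helpers invented for this example.

```python
import string

# Illustrative subset only: Web platform labels mapped to encoding names.
# The authoritative table is the one in the Encoding specification.
WEB_LABELS = {
    "utf-8": "utf-8",
    "utf8": "utf-8",
    "unicode-1-1-utf-8": "utf-8",
    "windows-1252": "windows-1252",
    "latin1": "windows-1252",
    "iso-8859-1": "windows-1252",
    "us-ascii": "windows-1252",
}

# Assumption: Python codec names standing in for the spec's decoders.
PYTHON_CODECS = {"utf-8": "utf-8", "windows-1252": "cp1252"}

ASCII_WHITESPACE = "\t\n\x0c\r "
ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def web_encoding_name(label):
    """Return the Web platform encoding name for a label, or None if unknown."""
    # Strip ASCII whitespace and lowercase ASCII letters only before matching.
    key = label.strip(ASCII_WHITESPACE).translate(ASCII_LOWER)
    return WEB_LABELS.get(key)


def web_decode(data, label):
    """Decode bytes the way this sketch assumes a browser would for the label."""
    name = web_encoding_name(label)
    if name is None:
        raise LookupError("not a known Web platform label: %r" % label)
    return data.decode(PYTHON_CODECS[name], errors="replace")


# Web platform scope: "iso-8859-1" is a label for windows-1252.
print(repr(web_decode(b"\x80", "iso-8859-1")))  # '€'
# Non-Web scope: Python's own codec treats ISO-8859-1 as a distinct encoding.
print(repr(b"\x80".decode("iso-8859-1")))       # '\x80'
```

The same byte and the same label give two different code points depending on which scope the consumer assumes, which is why the proposed text repeats the scope so often.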

[*] I say it isn't surprising based on a _mot_ of Upton Sinclair's:
"It is difficult to get a man to understand something, when his salary
[or his status] depends upon his not understanding it!"

What, the Internet isn't synonymous with the World Wide Web? What madness is this? :)

--
John Cowan          http://www.ccil.org/~cowan        cowan@ccil.org
Let's face it: software is crap. Feature-laden and bloated, written under
tremendous time-pressure, often by incapable coders, using dangerous
languages and inadequate tools, trying to connect to heaps of broken or
obsolete protocols, implemented equally insufficiently, running on
unpredictable hardware -- we are all more than used to brokenness.
                   --Felix Winkelmann

Received on Tuesday, 1 July 2014 18:00:06 UTC