
Re: TAG Decision on Rescinding the request to the HTML WG to develop a polyglot guide

From: Henri Sivonen <hsivonen@iki.fi>
Date: Tue, 22 Jan 2013 10:55:53 +0200
Message-ID: <CAJQvAuftxLY-P_RXQwJ_La_Yn-OCdgSxQx7WKeKnF5MEYxLqBw@mail.gmail.com>
To: Daniel Glazman <daniel@glazman.org>
Cc: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, Maciej Stachowiak <mjs@apple.com>, Sam Ruby <rubys@intertwingly.net>, Noah Mendelsohn <nrm@arcanedomain.com>, "www-tag@w3.org List" <www-tag@w3.org>, "public-html@w3.org" <public-html@w3.org>, Paul Cotton <Paul.Cotton@microsoft.com>, Anne van Kesteren <annevk@annevk.nl>, Aryeh Gregor <ayg@aryeh.name>, Lachlan Hunt <lachlan.hunt@lachy.id.au>, Ms2ger <ms2ger@gmail.com>
On Tue, Jan 22, 2013 at 10:37 AM, Daniel Glazman <daniel@glazman.org> wrote:
> On 22/01/13 09:16, Henri Sivonen wrote:
>
>>> Since an
>>> editor like mine can edit all flavors of html, it still needs to
>>> output the xml declaration for xhtml
>>
>>
>> Not if you always output XML as UTF-8, which you should, since UTF-16
>> makes no sense for interchange and UTF-8 is the only other encoding
>> guaranteed by XML to be supported.
>
>
> This is a joke, right? My editor does all flavors of html, and an xhtml1
> or 1.1 document saved without the xml decl will choke many software
> environments and

No joke. Since the beginning of time, all XML parsers have been required
to accept UTF-8 and to accept documents without the XML declaration when
the documents are encoded in UTF-8. And they actually do. Therefore, as
far as consumability by XML software goes, there is not and never has
been any reason to encode XML using any encoding other than UTF-8.
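
To illustrate the point, here is a minimal sketch in Python, assuming
the standard library's xml.etree.ElementTree (backed by expat) as a
representative XML parser. The sample markup and variable names are
made up for illustration. It feeds the parser UTF-8 bytes that carry
no XML declaration:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: non-ASCII text, no XML declaration.
source = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>Hyvää päivää</p></body></html>"
)
utf8_bytes = source.encode("utf-8")

# With no declaration and no BOM, the parser must assume UTF-8.
root = ET.fromstring(utf8_bytes)

ns = "{http://www.w3.org/1999/xhtml}"
paragraph = root.find(f"{ns}body/{ns}p")
text = paragraph.text
print(text)
```

The bytes parse without any encoding hint, and the non-ASCII
characters come through intact.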

The only reason why you might want an XML serializer to output something
other than UTF-8 is if you want to enable editing using a legacy text
editor. However, these days, UTF-8-capable text editors are readily
available, so in practice, there is no good reason for an XML
serializer to output an encoding other than UTF-8.

Please don't support outputting encodings other than UTF-8.
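
On the serializer side, a rough sketch of what "always output UTF-8,
no XML declaration" can look like, again assuming Python's
xml.etree.ElementTree purely as an example (the element and its text
are hypothetical, and this is not a claim about how any particular
editor is implemented):

```python
import xml.etree.ElementTree as ET

# Hypothetical element with non-ASCII text content.
p = ET.Element("p")
p.text = "Hyvää päivää"

# Serialize to UTF-8 bytes; for UTF-8 this serializer emits no
# XML declaration by default.
out = ET.tostring(p, encoding="utf-8")

# Round-trip: the declaration-free UTF-8 output parses back unchanged.
reparsed = ET.fromstring(out)
print(out)
print(reparsed.text)
```

The output carries no declaration and still round-trips, which is all
a consumer needs.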

> XML 1.1 accepts all IANA-registered charsets.
>
>   http://www.w3.org/TR/xml11/#NT-EncodingDecl

XML 1.1 flopped. The relevant XML spec is 1.0 4th edition.

Either way, XML processors are required to support UTF-8 and UTF-16.
Support for other encodings is optional. In other words, other
encodings are not guaranteed to work.

> Anyway, this is not the point. The point is making sure an app is able
> to serialize correctly the following:
>
>   - xhtml 1 or 1.1
>   - html5, xml serialization, not polyglot
>   - html5, xml serialization, polyglot
>
> I can make the difference between the first and the two last ones based
> on the doctype and friends. I am unable to make any difference between
> the two last ones.

Don't support polyglot. Problem solved.

That polyglot is eating your development time like this is evidence
that publishing the polyglot guide is not harmless.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Tuesday, 22 January 2013 08:56:23 UTC
