Re: i18n Polyglot Markup/NCRs (7th issue) from Henri Sivonen on 2010-07-19 (public-html@w3.org from July 2010)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Mon, 19 Jul 2010 06:35:02 -0700 (PDT)
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Cc: public-html <public-html@w3.org>, Eliot Graff <eliotgra@microsoft.com>, public-i18n-core@w3.org, Henri Sivonen <hsivonen@iki.fi>
Message-ID: <1695317410.39923.1279546502310.JavaMail.root@cm-mail03.mozilla.org>

Leif wrote:
> First of all, my comment was to Richard, who suggested that polyglot
> markup should "favor" hexadecimal NCRs.

I think neither decimal nor hexadecimal NCRs can be preferred on polyglot grounds, so the publication shouldn't favor one over the other.
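
As a rough illustration only: the sketch below uses Python's standard library (my choice of tooling, not anything from the specs or this thread) to resolve the same character written as a decimal and as a hexadecimal NCR, once with an HTML-style character reference decoder and once with an XML parser. The helper names html_text and xml_text are made up for the example.

```python
import html
import xml.etree.ElementTree as ET

# The same character (U+00E9) written as a decimal and as a hexadecimal NCR.
DECIMAL_NCR = "&#233;"
HEX_NCR = "&#xE9;"


def html_text(ncr: str) -> str:
    """Resolve a character reference the way an HTML consumer would."""
    return html.unescape(ncr)


def xml_text(ncr: str) -> str:
    """Resolve a character reference by parsing it inside an XML element."""
    return ET.fromstring(f"<p>{ncr}</p>").text


# Collect every result; a single distinct value means both notations agree
# in both the HTML-style and the XML code path.
results = {
    html_text(DECIMAL_NCR),
    html_text(HEX_NCR),
    xml_text(DECIMAL_NCR),
    xml_text(HEX_NCR),
}
print(results)
```

Both notations resolve to the same character in both code paths, which is why neither has an edge as far as polyglotness goes.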

> A possible answer to your question is found in Sam's messages [1][2].
> He suggests allowing only UTF-8 as the encoding of polyglot markup.

That goes beyond drawing logical inferences from the specs to determine what's polyglot. Those inferences lead to the conclusion that polyglot documents can be constructed using either UTF-8 or UTF-16.
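
A small sketch of the XML half of that claim, assuming Python's xml.etree.ElementTree (backed by expat) as a stand-in XML processor; it says nothing about HTML parsers, which determine the encoding by their own rules. The same document, with no XML declaration, is serialized as UTF-8 and as UTF-16 with a BOM and parsed from raw bytes. The sample document and the helper name xml_text_from_bytes are invented for the example.

```python
import codecs
import xml.etree.ElementTree as ET

# A tiny document with one non-ASCII character and no XML declaration.
DOC = "<p>\u00e9</p>"

# Python's "utf-16" codec prepends a byte order mark; "utf-8" does not.
utf8_bytes = DOC.encode("utf-8")
utf16_bytes = DOC.encode("utf-16")


def xml_text_from_bytes(data: bytes) -> str:
    """Parse raw bytes and return the text of the root element."""
    return ET.fromstring(data).text


# Confirm the UTF-16 serialization really starts with a BOM.
has_bom = utf16_bytes.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))

print(has_bom)
print(xml_text_from_bytes(utf8_bytes) == xml_text_from_bytes(utf16_bytes))
```

The parser picks up UTF-16 from the BOM and defaults to UTF-8 otherwise, so both byte streams yield the same content.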

There are other reasons to prefer UTF-8 over UTF-16, but polyglotness isn't one of them, so the WG shouldn't pretend that it is.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Monday, 19 July 2010 13:35:37 UTC