RE: i18n Polyglot Markup/NCRs (7th issue)

(personal response)

> > First of all, my comment was to Richard, who suggested that Polyglot
> > markup should "favor" hexadecimal NCRs.
> 
> I think neither decimal nor hexadecimal can be preferred over the
> other on polyglot grounds, so the publication shouldn't prefer one
> over the other.

Polyglot itself must, of course, support both decimal and hex NCRs. The comment was about specific text in the document that used a decimal NCR instead of a hex NCR. It's an editorial comment, but in my opinion it would be best to make the change. If the W3C ignores its own advice when writing documents, why would document authors pay attention to it? Note well: our WG's comment does not say that polyglot should normatively favor one form over the other, only that the examples should use hex instead of decimal (unless decimal is necessary to the example).
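
To illustrate that the two forms are interchangeable, here is a minimal Python sketch that rewrites decimal NCRs as hex NCRs. It is not taken from the polyglot draft; the function name and the sample string are hypothetical, and the pattern deliberately ignores edge cases such as text inside comments or CDATA sections.

```python
import html
import re

# Matches decimal NCRs such as &#233; but not hex forms such as &#xE9;.
DECIMAL_NCR = re.compile(r"&#([0-9]+);")

def decimal_to_hex_ncrs(text: str) -> str:
    """Rewrite each decimal NCR as the equivalent hexadecimal NCR.

    The "x" is emitted in lowercase because XML only accepts "&#x".
    """
    return DECIMAL_NCR.sub(
        lambda m: "&#x{:X};".format(int(m.group(1))), text
    )

# Hypothetical sample mixing a decimal NCR, a hex NCR, and another decimal NCR.
sample = "caf&#233; &#x2014; &#8212;"
converted = decimal_to_hex_ncrs(sample)

# Both spellings decode to the same characters.
assert html.unescape(sample) == html.unescape(converted)
```

Existing hex NCRs pass through untouched, so the function can be run over examples that already mix both forms.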

> 
> > A possible answer to your question is found in Sam's messages [1][2].
> > He suggest only to allow UTF-8 as encoding of polyglot markup.
> 
> That steps outside logical inferences from specs to determine
> what's polyglot. The logical inferences lead to a conclusion that
> polyglot documents can be constructed using UTF-8 and using UTF-16.
> 
> There are other reasons to prefer UTF-8 over UTF-16, but
> polyglotness isn't one of them, so the WG shouldn't pretend that it
> is.
> 

I agree. Polyglot supports both encoding forms, so it really must treat them more or less equally. The choice of which encoding to use, and the reasons to prefer one over the other, lie elsewhere.

Note that having a document use a Unicode encoding, or even a particular Unicode encoding form, does not eliminate the need for NCR support. Even when a document can be fully encoded without resorting to NCRs, there are cases in which the document author may want to use them anyway (for clarity, for compatibility with tools or data sources, etc.).
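
As one example of the tool-compatibility case, the following Python sketch replaces every non-ASCII character with a hex NCR so that the text survives an ASCII-only pipeline. The function name is hypothetical, and the sketch assumes the input is plain character data: it does not escape "&" or "<", which a real serializer would also need to handle.

```python
def to_hex_ncrs(text: str) -> str:
    """Replace each non-ASCII character with a hexadecimal NCR.

    Python strings are sequences of code points, so characters outside
    the Basic Multilingual Plane become a single NCR, not a surrogate pair.
    """
    return "".join(
        ch if ord(ch) < 0x80 else "&#x{:X};".format(ord(ch))
        for ch in text
    )

# ASCII text is returned unchanged; other characters become NCRs.
escaped = to_hex_ncrs("caf\u00e9")
```

The result is identical whether the source document was stored as UTF-8 or UTF-16, since the NCR names the code point and not its encoded bytes.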

Addison

Addison Phillips
Globalization Architect (Lab126)
Chair (W3C I18N, IETF IRI WGs)

Internationalization is not a feature.
It is an architecture.

Received on Monday, 19 July 2010 16:02:19 UTC