- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Fri, 23 Jul 2010 01:33:57 +0300
- To: "Phillips, Addison" <addison@lab126.com>
- Cc: Henri Sivonen <hsivonen@iki.fi>, public-html <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Phillips, Addison, Mon, 19 Jul 2010 12:01:48 -0400:
> (personal response)
>
>>> First of all, my comment was to Richard, who suggested that
>>> Polyglot markup should "favor" hexadecimal NCRs.
>>
>> I think neither decimal nor hexadecimal can be preferred over the
>> other on polyglot grounds, so the publication shouldn't prefer one
>> over the other.
>
> Polyglot itself must, of course, support both decimal and hex NCRs.
> The comment was on specific text in the document that used a decimal
> NCR instead of a hex NCR. It's an editorial comment, but it would be
> best to make the change, in my opinion. If the W3C just ignores its
> own advice in writing documents, why would document authors pay
> attention to it? Note well: our WG's comment is not saying that
> polyglot should favor one form over the other normatively. Only that
> the examples should use hex instead of decimal (unless necessary to
> the example).

It would not be bad to show both a hex and a dec example, would it?
Authors have different preferences w.r.t. NCRs. E.g. I learned the dec
NCR for 'å' long ago, but I have not learned the hex value yet ...

I think the text in question is sufficiently general that both NCR
forms should be mentioned. I would suggest this text, where new text
is _underlined_:

]]
For entities beyond the previous list, _polyglot markup_ uses
_numeric_ character references (NCRs). For example, polyglot markup
uses _&#xA0; (or the decimal NCR equivalent &#160;)_ instead of
&nbsp;.
[[

>>> A possible answer to your question is found in Sam's messages
>>> [1][2].
>>> He suggests allowing only UTF-8 as the encoding of polyglot markup.
>>
>> That steps outside logical inferences from specs to determine
>> what's polyglot. The logical inferences lead to a conclusion that
>> polyglot documents can be constructed using UTF-8 and using UTF-16.
>>
>> There are other reasons to prefer UTF-8 over UTF-16, but
>> polyglotness isn't one of them, so the WG shouldn't pretend that it
>> is.
>
> I agree. Polyglot supports both encoding forms and so it really must
> treat them somewhat equally. The choice of which encoding to use and
> the reasons to prefer one or the other lie elsewhere.

Inferring from HTML5, one must conclude that Polyglot Markup prefers
UTF-8 over UTF-16. See my preceding message.

[…]
-- 
leif halvard silli
Received on Friday, 23 July 2010 14:38:34 UTC