RE: Feedback on Polyglot Markup for review from Phillips, Addison on 2010-07-01 (public-i18n-core@w3.org from July to September 2010)

From: Phillips, Addison <addison@lab126.com>
Date: Thu, 1 Jul 2010 12:14:51 -0700
To: Richard Ishida <ishida@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <C7A5719F1E562149BA9171F58BEE2CA4129EB2F6AE@EX-IAD6-B.ant.amazon.com>

I have added it to next week's agenda.

Addison

Addison Phillips
Globalization Architect (Lab126)
Chair (W3C I18N, IETF IRI WGs)

Internationalization is not a feature.
It is an architecture.


> -----Original Message-----
> From: public-i18n-core-request@w3.org [mailto:public-i18n-core-
> request@w3.org] On Behalf Of Richard Ishida
> Sent: Thursday, July 01, 2010 10:44 AM
> To: public-i18n-core@w3.org
> Subject: Feedback on Polyglot Markup for review
> 
> Folks,
> 
> Any comments on my proposed comments on
> http://www.w3.org/TR/2010/WD-html-polyglot-20100624/ ?
> 
> Addison, can we agenda+ this for next week? Objective: approve
> comments so I can submit them.
> 
> RI
> 
> 
> =================================
> 
> Section 3: Character encoding
> 
> [1] "When polyglot markup uses UTF-16, it should include the BOM
> indicating UTF-16LE or UTF-16BE"
> 
> Should -> must
> 
> 
> [2] "In addition, polyglot markup need not include the meta charset
> declaration, because the parser would have to read UTF-16 in order
> to parse it by definition."
> 
> The i18n WG guidelines recommend that you always include a visible
> encoding declaration in your document, since it helps developers,
> testers, or translation production managers who want to visually
> check the encoding of a document. So it's true to say that you
> strictly don't need it, but we would prefer that you do.
> 
> It would be helpful to have a paragraph that says something along
> those lines.
> 
> 
> [3] " Use UTF-8 or UTF-16 with the appropriate BOM. "
> 
> This could be read "use UTF-8 with the appropriate BOM or UTF-16
> with the appropriate BOM", but a UTF-8 BOM (or signature) is not
> strictly necessary, and some would argue that it may cause problems.
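
To make the BOM discussion in [1] to [3] concrete, here is a minimal Python sketch of how a checker might tell which BOM, if any, opens a byte stream. The function name sniff_bom and the BOMS table are hypothetical and only for illustration; the sketch assumes UTF-32 input is out of scope (a UTF-32LE BOM begins with the same two bytes as the UTF-16LE one).

```python
import codecs

# Byte order marks keyed by the encoding label they imply.
# Assumption: UTF-32 is not considered here.
BOMS = {
    "UTF-8": codecs.BOM_UTF8,         # EF BB BF (optional signature)
    "UTF-16BE": codecs.BOM_UTF16_BE,  # FE FF
    "UTF-16LE": codecs.BOM_UTF16_LE,  # FF FE
}


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for label, bom in BOMS.items():
        if data.startswith(bom):
            return label
    return None
```

Under comment [1], a UTF-16 document for which sniff_bom returns None would be an error, whereas under comment [3] a UTF-8 document returning None is perfectly acceptable.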
> 
> 
> [4] " In short, for correct character encoding, polyglot markup
> must either: "
> 
> The MUST is too strong. There is no problem with using more than
> one declaration, and in an earlier comment we said that we
> recommend that you have a readable declaration in the source in
> addition to a UTF-8/UTF-16 encoding.
> 
> I think it is better just to omit the list and its lead-in
> paragraph "In short, for correct ...".
> 
> 
> 
> Section 7 Attributes
> 
> [5] No mention is made of the lang and xml:lang attributes.  The
> document should say that both should be used when language
> attributes are used.
> 
> It may also recommend the use of the language attributes in the
> html element to set the default language for the document, and
> mention that the meta Content-Language element has no usefulness at
> all in XML for setting the language of content.
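
As an illustration of [5], a small Python sketch of how one might check that lang and xml:lang are used together and agree. It assumes the document is well-formed XHTML that an XML parser can read; the helper name lang_mismatches is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def lang_mismatches(xhtml: str):
    """List (tag, lang, xml:lang) for elements where only one of the two
    attributes is present, or where their values differ (ignoring case)."""
    problems = []
    for el in ET.fromstring(xhtml).iter():
        lang, xml_lang = el.get("lang"), el.get(XML_LANG)
        if lang is None and xml_lang is None:
            continue  # no language information on this element
        if lang is None or xml_lang is None or lang.lower() != xml_lang.lower():
            problems.append((el.tag, lang, xml_lang))
    return problems
```

An html element carrying both lang="en" and xml:lang="en" passes; a p element with only lang="fr" is reported.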
> 
> 
> 
> Section 6.2.2 Attribute names & 6.2.3 Attribute values
> 
> [6] " however, case requirements do not apply to non-ASCII letters
> such as Greek, Cyrillic, or non-ASCII Latin letters. "
> 
> I'm not sure why this is here.  Scripts such as Greek, Cyrillic,
> and Armenian do have case distinctions, and those distinctions are
> significant in XML if you have attribute names or values in those
> scripts.  But I'm not aware of any characters from those scripts
> being used for attribute names or values in HTML. Are there some in
> MathML or SVG?
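
For [6], the distinction the draft seems to be drawing is between ASCII-only case folding and full Unicode case folding. A minimal Python sketch, assuming that ASCII-only folding is what is meant; the helper name ascii_lower is hypothetical.

```python
def ascii_lower(s: str) -> str:
    """Lowercase only the ASCII letters A-Z; leave every other character alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


# ASCII letters are folded.
assert ascii_lower("ViewBox") == "viewbox"
# GREEK CAPITAL LETTER OMEGA (U+03A9) is left untouched by ASCII-only folding...
assert ascii_lower("\u03a9") == "\u03a9"
# ...although Unicode-aware lowercasing does map it to U+03C9.
assert "\u03a9".lower() == "\u03c9"
```

Under this reading, an attribute name written in Greek capitals would stay distinct from its lowercase form, which matches the point that such case distinctions remain significant in XML.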
> 
> 
> Section 8 Named Entity References
> 
> [7] " For example, polyglot markup uses &#160;  instead of &nbsp;.
> "
> 
> We would prefer your example to use the hexadecimal NCR &#xA0;
> rather than the decimal.  See
> http://www.w3.org/TR/2005/REC-charmod-20050215/#C048
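
To illustrate [7], a short Python sketch showing that the decimal and hexadecimal numeric character references denote the same character, and how the hexadecimal form can be generated. The helper name hex_ncr is hypothetical; html.unescape is from the Python standard library.

```python
import html


def hex_ncr(ch: str) -> str:
    """Format a single character as a hexadecimal numeric character reference."""
    return f"&#x{ord(ch):X};"


# Both forms decode to U+00A0 NO-BREAK SPACE.
assert html.unescape("&#160;") == html.unescape("&#xA0;") == "\u00a0"
# The hexadecimal form lines up with the usual U+00A0 notation.
assert hex_ncr("\u00a0") == "&#xA0;"
```

The hexadecimal form has the advantage that the digits match the code point as it is normally written in the Unicode charts.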
> 
> 
> =====================================
> 
> 
> 
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
> 
> http://www.w3.org/International/
> http://rishida.net/
> 
> 
> 
> 
>

Received on Thursday, 1 July 2010 19:15:24 UTC