- From: Richard Ishida <ishida@w3.org>
- Date: Wed, 20 Jul 2011 14:05:28 +0100
- To: "public-i18n-core@w3.org" <public-i18n-core@w3.org>
[1] 3. Specifying a Document's Character Encoding
http://www.w3.org/TR/html-polyglot/#character-encoding

"By using the Byte Order Mark (BOM) character (preferred)."

We need to decide whether the UTF-8 signature is still a problem. I've recently been working on a new version of the article about the BOM, where some rehabilitation may be in order, except that it seems to me that there are still the following issues associated with using the UTF-8 BOM:

a. a BOM at the start of a PHP file can corrupt non-ASCII characters, and produce blank lines
b. it produces quirks mode in IE6
c. it overrides HTTP encoding declarations in some browsers - which can be problematic in the case of server-based transcoding
d. Dreamweaver doesn't seem to save with/without the BOM properly

I'm struggling to produce test files at the moment...

[2] 6.3.3 Attribute Values
http://www.w3.org/TR/html-polyglot/#attribute-values

"Polyglot markup maintains case consistency for values on the following attributes, which occur on MIME types, language tags, charsets, booleans, media queries, and keywords. Though not required, an easy way to maintain case-consistency is to use only lower case values for these attributes. Polyglot markup maintains case consistency for these values because, for the purpose of selector matching, attribute values in XML are all treated case sensitively; however, HTML treats the values of these attributes as case insensitive (See 4.14.1 Case-sensitivity, in the HTML5 specification). [HTML5]"

"... lang ..."

It seems to me that lang should not be in this list. XML processors don't recognise lang as containing language information - which is why you have to have xml:lang anyway (specified elsewhere in this spec). So any case sensitivity would be relevant to xml:lang.

Unless I'm mistaken, the CSS3 Selectors spec says that language attributes, including xml:lang, are matched in a case-insensitive way (http://www.w3.org/TR/css3-selectors/#lang-pseudo), so xml:lang shouldn't be in this list either (currently it's not).

[3] 7.2 Language Attributes
http://www.w3.org/TR/html-polyglot/#language-attributes

"For the mechanism to actually set a fallback language, however, it has to locate either an http-equiv="Content-Language" declaration on the meta element or an HTTP Content-Language: header, either of whose content value is no more and no less than exactly one language tag. Note that although the mechanism can locate either the meta element or the header, the meta element is considered first."

Content-Language meta is now non-conforming in HTML5. I think this has two implications for the polyglot spec:

1. the spec should clearly state that "Polyglot markup does not use the meta element with an http-equiv attribute in the Content Language state."

2. since the polyglot spec already requires the lang+xml:lang attributes if an HTTP header or meta uses Content-Language with a single language value (to override the value), the whole of the paragraph containing the text quoted above is (interesting but) irrelevant. I think the paragraph should be dropped.

[4] 11. Exceptions from the Foreign Content Parsing Rules
http://www.w3.org/TR/html-polyglot/#foreign-content

Is this section intentionally blank?

--
Richard Ishida
Internationalization Activity Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/

Register for the W3C MultilingualWeb Workshop!
Limerick, 21-22 September 2011
http://multilingualweb.eu/register
Received on Wednesday, 20 July 2011 13:06:05 UTC