- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Wed, 04 Feb 2004 05:04:07 +0100
- To: "Richard Ishida" <ishida@w3.org>
- Cc: "GEO" <public-i18n-geo@w3.org>
* Richard Ishida wrote:
>I added a first draft of a final section to the tutorial at
>http://www.w3.org/International/tutorials/tutorial-char-enc.html
>this afternoon.

[...]
  In the case of conflict between multiple encoding declarations,
  precedence rules apply to determine which declaration wins out.
  For XHTML and HTML, the precedence is as follows, with 1 being
  the highest:

  1. HTTP Content-Type
  2. XML declaration
  3. meta charset declaration
  4. link charset attribute
[...]

The XML declaration is just a processing instruction for HTML user
agents and is thus ignored (if you are lucky). Also, for XHTML documents
delivered as text/html, user agent behaviour varies (no surprise, since
the specifications do not deal with it); e.g., the W3C MarkUp Validator
reads the <meta> information while the W3C CSS Validator does not.

Where is the BOM? HTML 4.01 does not mention the BOM as a means of
determining the character encoding of the document, and neither does
CSS 2.0... If user agents are somehow expected to use the BOM to
determine the character encoding of the document, it should be listed
here.

I think this should be split into three parts, XML (XHTML, SVG, ...),
HTML/XHTML (text/html) and CSS, as they have different rules and user
agent behaviour varies.

[...]
  The escape mechanism for representing characters in CSS is a
  backslash followed by a hexadecimal number representing the scalar
  value. Note that these escapes are terminated by a space, rather
  than a semi-colon. The CSS escape for á is \E1.
[...]

Or they are not terminated at all (or only implicitly), e.g. Bj\F6rn or
M\0000F6bel, as opposed to M\F6bel (which is not "Möbel" but
M U+F6BE l).
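To make the termination cases above concrete, here is a minimal Python sketch of how such escapes could be decoded. It assumes the rule as I read it in CSS2: a backslash followed by one to six hexadecimal digits, with a single optional whitespace character after the digits swallowed as a terminator. The function name decode_css_escapes is made up for this illustration, and the sketch ignores other backslash escapes (such as \") and CRLF handling.

```python
import re

# Assumption: backslash, 1-6 hex digits, then at most one whitespace
# character that acts as a terminator and is consumed with the escape.
_HEX_ESCAPE = re.compile(r"\\([0-9A-Fa-f]{1,6})[ \t\r\n\f]?")


def decode_css_escapes(text: str) -> str:
    """Replace CSS hexadecimal escapes in text with the characters they name."""

    def _replace(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        # Six hex digits can exceed the Unicode range; fall back to U+FFFD.
        if code_point > 0x10FFFF:
            return "\ufffd"
        return chr(code_point)

    return _HEX_ESCAPE.sub(_replace, text)


if __name__ == "__main__":
    for sample in (r"Bj\F6rn", r"M\0000F6bel", r"M\F6 bel", r"M\F6bel"):
        decoded = decode_css_escapes(sample)
        # Show the code points so U+F6BE is visible even without a glyph.
        print(sample, "->", [f"U+{ord(ch):04X}" for ch in decoded])
```

In Bj\F6rn the escape ends implicitly because "r" is not a hexadecimal digit; in M\0000F6bel it ends after six digits; in M\F6bel the greedy match takes "F6be" and yields U+F6BE followed by "l", which is exactly the pitfall.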
Received on Tuesday, 3 February 2004 23:04:27 UTC