- From: Ian Jacobs <ij@w3.org>
- Date: Tue, 13 Jan 1998 17:13:36 +0100
- To: David Cary <d.cary@ieee.org>
- CC: www-html-editor@w3.org
David Cary wrote:
>
> character encoding in HTML 4.0
>
> Thanks for all the work you and the others have put into getting HTML 4.0
> put together, revised, and online.
>
> There's just one little thing that doesn't quite look right to me.
> Perhaps I am just misinterpreting something?
>
> In section "5.2.2 Specifying the character encoding",
> http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2
> it appears the priorities listed are reversed --
> it seems to me they should be
>
> 1. A META declaration with "http-equiv" set to "Content-Type" and a value
>    set for "charset".
> 2. An HTTP "charset" parameter in a "Content-Type" field.
> 3. The charset attribute set on an element that designates an external
>    resource.
>
> In other words, if a web server gives a header like
>     Content-Type: text/html; charset=ISO-8859-1
> but the text of the document itself says
>     <META http-equiv="Content-Type" content="text/html; charset=EUC-JP">
> it seems to me that the author of the document is more likely to know what
> the proper encoding is, and therefore the HTML user agent should render this
> document in Japanese.
>
> However, the current HTML 4.0 specification seems to indicate that
> the HTML user agent "must" render it in ISO-8859-1.
>
> Rationale: if a particular document changes languages
> (not that this happens very often),
> the author of the document is the first to know about it (and is unable to
> change anything but (1)),
> the system administrator is the next to know about it (and is the only one
> capable of changing (2)),
> and other authors (elsewhere on the web that refer to that document)
> are usually the last to know about it (and are the only ones capable of
> changing (3)).

David,

Thank you for your comments. I will look into them.

Ian
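
For readers following the thread, the two orderings under discussion can be compared with a small sketch. It assumes a user agent simply takes the first available source in priority order; the names `SPEC_ORDER`, `PROPOSED_ORDER`, `charset_param`, and `pick_charset` are hypothetical and do not come from the specification or from any real user agent.

```python
import re

# Hypothetical labels for the three mechanisms named in section 5.2.2.
SPEC_ORDER = ("http", "meta", "link")      # priority as listed in HTML 4.0
PROPOSED_ORDER = ("meta", "http", "link")  # priority suggested in the quoted message


def charset_param(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type, re.IGNORECASE)
    return match.group(1) if match else None


def pick_charset(order, http=None, meta=None, link=None):
    """Return the charset from the first source in `order` that supplies one."""
    candidates = {
        "http": charset_param(http),  # HTTP Content-Type header value
        "meta": charset_param(meta),  # content attribute of the META declaration
        "link": link,                 # charset attribute on the referring element
    }
    for source in order:
        if candidates[source]:
            return candidates[source]
    return None


# The example from the quoted message.
http_header = "text/html; charset=ISO-8859-1"
meta_content = "text/html; charset=EUC-JP"

print(pick_charset(SPEC_ORDER, http=http_header, meta=meta_content))      # ISO-8859-1
print(pick_charset(PROPOSED_ORDER, http=http_header, meta=meta_content))  # EUC-JP
```

Under the specification's order the HTTP header wins and the page is read as ISO-8859-1; under the proposed order the META declaration wins and the page is read as EUC-JP.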
Received on Tuesday, 13 January 1998 11:14:24 UTC