Re: character encoding in HTML 4.0

David Cary wrote:
> 
> character encoding in HTML 4.0
> 
> Thanks for all the work you and the others have put into getting HTML 4.0
> written, revised, and put online.
> 
> There's just one little thing that doesn't quite look right to me.
> Perhaps I am just misinterpreting something?
> 
> In section "5.2.2 Specifying the character encoding",
>   http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2
> it appears the priorities listed are reversed --
> it seems to me they should be
> 
> 1. A META declaration with "http-equiv" set to "Content-Type" and a value
> set for "charset".
> 2. An HTTP "charset" parameter in a "Content-Type" field.
> 3. The charset attribute set on an element that designates an external
> resource.
> 
> In other words, if a web server gives a header like
>   Content-Type: text/html; charset=ISO-8859-1
> but the text of the document itself says
>   <META http-equiv="Content-Type" content="text/html; charset=EUC-JP">
> it seems to me that the author of the document is more likely to know what
> the proper encoding is, and therefore the HTML user agent should render this
> document in Japanese.
> 
> However, the current HTML 4.0 specification seems to indicate that
> the HTML user agent "must" render it in ISO-8859-1.
> 
> Rationale: if a particular document changes languages
> (not that this happens very often),
> the author of the document is the first to know about it (and is unable to
> change anything but (1)),
> the system administrator is the next to know about it (and is the only one
> capable of changing (2)),
> and other authors (elsewhere on the web, who refer to that document)
> are usually the last to know about it (and are the only ones capable of
> changing (3)).
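
To make the two orderings concrete, here is a minimal sketch of how a user
agent might choose between the three sources. It is only an illustration under
stated assumptions: the function names, the source labels "http", "meta", and
"link", and the simplified charset pattern are all hypothetical, and nothing
here is taken from the specification text or from any real user agent.

```python
import re
from typing import Optional, Sequence

# Hypothetical, simplified pattern for the "charset" parameter of a
# Content-Type value; real header parsing has more cases than this.
_CHARSET_RE = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)


def charset_param(content_type: Optional[str]) -> Optional[str]:
    """Return the charset named in a Content-Type value, or None if absent."""
    if not content_type:
        return None
    match = _CHARSET_RE.search(content_type)
    return match.group(1) if match else None


def pick_encoding(
    http_header: Optional[str],
    meta_content: Optional[str],
    link_charset: Optional[str],
    order: Sequence[str],
) -> Optional[str]:
    """Return the first charset found, checking sources in the given order."""
    candidates = {
        "http": charset_param(http_header),   # HTTP Content-Type field
        "meta": charset_param(meta_content),  # META http-equiv content value
        "link": link_charset,                 # charset attribute on a referring element
    }
    for source in order:
        if candidates[source]:
            return candidates[source]
    return None


# Order the message above attributes to the current specification.
SPEC_ORDER = ("http", "meta", "link")
# Order proposed in the message above.
PROPOSED_ORDER = ("meta", "http", "link")
```

With the header and META values from the example in the message,
pick_encoding(..., SPEC_ORDER) selects ISO-8859-1, while
pick_encoding(..., PROPOSED_ORDER) selects EUC-JP.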


David,

Thank you for your comments. I will look into them.

Ian

Received on Tuesday, 13 January 1998 11:14:24 UTC