RE: encoding in XHTML

> From: Peter_Constable@sil.org [mailto:Peter_Constable@sil.org]

> I'm curious about this extract from an appendix to the XHTML spec:


> character encoding explicitly must include both the XML 
> declaration an 
> encoding declaration and a meta http-equiv statement (e.g., 
> <meta http-equiv="Content-type" content="text/html; 
> charset=EUC-JP" />).

I think there are two problems.

(1) The use of the word "must" is probably a mistake.  It should
    be "should".  It would be a mistake for a spec to require something
    just to work around a bug in one product.

(2) If they do intend it to be a requirement, I see a danger here.
    First, it is inefficient to carry identical information twice in a document.
    Second, it increases the chance of creating an incorrect document
    in which the first and second methods of encoding declaration
    declare different encodings.
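
To illustrate the danger in (2), here is a rough sketch of a check for the
two declarations disagreeing.  The function names and the regular
expressions are my own simplifications, not anything from the XHTML spec:
they assume http-equiv appears before content in the meta element and do
not handle comments or CDATA.

```python
import re

# Simplified patterns (assumptions, not a real XML/HTML parser).
XML_DECL_RE = re.compile(
    r'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)
META_RE = re.compile(
    r'<meta[^>]*?http-equiv\s*=\s*["\']content-type["\']'
    r'[^>]*?charset\s*=\s*["\']?([A-Za-z][A-Za-z0-9._-]*)',
    re.IGNORECASE | re.DOTALL,
)


def declared_encodings(text):
    """Return (xml_decl_encoding, meta_charset), lowercased, or None if absent."""
    xml_match = XML_DECL_RE.search(text)
    meta_match = META_RE.search(text)
    xml_enc = xml_match.group(1).lower() if xml_match else None
    meta_enc = meta_match.group(1).lower() if meta_match else None
    return xml_enc, meta_enc


def encodings_conflict(text):
    """True only when both declarations are present and name different encodings."""
    xml_enc, meta_enc = declared_encodings(text)
    return xml_enc is not None and meta_enc is not None and xml_enc != meta_enc


if __name__ == "__main__":
    doc = (
        '<?xml version="1.0" encoding="EUC-JP"?>\n'
        '<html><head><meta http-equiv="Content-type" '
        'content="text/html; charset=Shift_JIS" /></head></html>'
    )
    print(declared_encodings(doc), encodings_conflict(doc))
```

Note that charset names are compared case-insensitively only; aliases of
the same encoding would still be reported as a conflict.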

The quoted statement emphasizes the portability of the
document.  They might also want to add a statement that
requires changing the encoding named in these embedded declarations if
the received document is in a different encoding than the one they
declare.  That can happen because the HTTP-level
declaration of charset takes precedence over the embedded information,
per the HTML and XML specs.
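
The precedence can be sketched as follows.  The function name is made up
for illustration, and the ordering between the XML declaration and the
meta element is my assumption; the only point taken from the specs here
is that the HTTP-level charset comes first.

```python
def effective_encoding(http_charset, xml_decl_encoding, meta_charset):
    """Pick the encoding a receiver would use, highest precedence first.

    The HTTP charset parameter overrides anything embedded in the document.
    Ranking the XML declaration above the meta element is an assumption
    made for this sketch.
    """
    for candidate in (http_charset, xml_decl_encoding, meta_charset):
        if candidate:
            return candidate.strip().lower()
    return None  # nothing declared anywhere


if __name__ == "__main__":
    # A document transcoded in transit: the embedded declarations are stale.
    print(effective_encoding("Shift_JIS", "EUC-JP", "EUC-JP"))
```

In the transcoded case above, the embedded declarations are simply wrong
unless whoever transcoded the document also rewrote them.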

By the way, despite the mandate in the XML spec that reads:
	All XML processors must be able to read entities in both 
	the UTF-8 and UTF-16 encodings. 
I recently learned that there is an XML parser implementation on
a "limited configuration" device that recognizes only Shift_JIS.  Apparently,
because of resource limitations, the implementer decided to
ignore the mandate.  Just another example of reality not
always matching the ideal world...

-kuro

Received on Sunday, 3 November 2002 14:34:08 UTC