
RE: Localization of XML

From: Nir Dagan <nir@nirdagan.com>
Date: Tue, 08 Feb 2000 15:03:26 -0500
Message-Id: <200002082000.PAA04482@vega.brown.edu>
To: "Langer, Paul" <Paul.Langer@softwareag.com>, "'www-international@w3.org'" <www-international@w3.org>
The document served by http://www.w3.org/ is validly encoded in both 
iso-8859-1 and UTF-8. This is because it contains only US-ASCII 
characters, which have the same byte values in both encodings.
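
To illustrate the point, here is a minimal Python sketch. The sample bytes
are a hypothetical stand-in for the page (not its real content), and the
function name is my own. It only checks whether a byte string decodes to
the same text under all three labels.

```python
# Hypothetical pure-ASCII fragment standing in for the served page.
sample = b'<html xmlns="http://www.w3.org/1999/xhtml"><title>W3C</title></html>'

def decodes_identically(data, encodings=("us-ascii", "iso-8859-1", "utf-8")):
    """Return True if data decodes to the same text under every encoding."""
    try:
        texts = {data.decode(enc) for enc in encodings}
    except UnicodeDecodeError:
        # At least one encoding cannot represent these bytes at all.
        return False
    return len(texts) == 1

print(decodes_identically(sample))    # ASCII-only bytes
print(decodes_identically(b"\xe9"))   # a byte outside US-ASCII
```

Any byte above 0x7F breaks the equivalence, so the argument holds only
as long as the page stays within US-ASCII.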

XML requires an encoding declaration (in the XML declaration) 
whenever the encoding is neither UTF-8 nor UTF-16. But since the 
document at the URL in question is encoded in UTF-8, there is no problem.
This is true even if a charset parameter is provided 
by an HTTP header. Thus, the explanation below is wrong.

Also, the charset parameter in the Content-Type header 
is correct. Serving the document by HTTP with a statement of 
US-ASCII or UTF-8 may confuse some browsers. So serving it with 
iso-8859-1 is both correct and bugward compatible. 

From: erik@netscape.com [mailto:erik@netscape.com]
Sent: Tuesday, February 08, 2000 4:07 AM
Subject: Re: Localization of XML

> [snip]
> I noticed that the following w3.org page uses XHTML:
>
>   http://www.w3.org/
>
> However, it doesn't start with the characters "<?xm" even though the
> charset is iso-8859-1...

This page is (currently) sent with the following media type:
  
  Content-Type: text/html; charset=iso-8859-1

So there is no need for an XML declaration. (Even if the
media type were switched to text/xml, the charset parameter
of the Content-Type header would remain authoritative.)
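
For reference, a small Python sketch of reading the media type and the
charset parameter out of a Content-Type value like the one quoted above.
The helper name is hypothetical. It uses the standard library's
email.message.Message and only shows how the parameter is extracted; it
does not settle which source of encoding information should win.

```python
from email.message import Message

def charset_from_content_type(header_value, default=None):
    """Return the lowercased charset parameter of a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns `default` when no charset is present.
    return msg.get_content_charset(default)

def media_type_from_content_type(header_value):
    """Return the bare media type, e.g. 'text/html'."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type()

header = "text/html; charset=iso-8859-1"
print(media_type_from_content_type(header))
print(charset_from_content_type(header))
```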

===================================
Nir Dagan
Assistant Professor of Economics
Brown University 
Providence, RI
USA

http://www.nirdagan.com
mailto:nir@nirdagan.com
tel:+1-401-863-2145
Received on Tuesday, 8 February 2000 15:01:05 GMT
