
RE: Problem in publishing multilingual HTML document on web in UTF-8 encoding

From: Paul Nelson (ATC) <paulnel@winse.microsoft.com>
Date: Thu, 1 Jun 2006 20:04:09 -0700
Message-ID: <49C257E2C13F584790B2E302E021B6F9100C79AF@winse-msg-01.segroup.winse.corp.microsoft.com>
To: "Kelly Miller" <lightsolphoenix@gmail.com>, "David Woolley" <david@djwhome.demon.co.uk>
Cc: <www-html@w3.org>

I find this discussion very interesting.

Firstly, I know that IE has very good international text support...and
has been ahead of most browsers in this area for a number of years.

Second, I know that we have autodetection for the codepage of a
document...just in case the author never declared it in the page. The
autodetection has worked well for a number of years.

I suspect that the issue has less to do with publishing multilingual HTML
documents on the web in UTF-8 than the infrastructure that is being used
to achieve the task. I am aware of many companies that publish
multilingual sites in UTF-8 that work fine with IE.
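
One way the serving infrastructure can get in the way (an assumption on my
part, not something confirmed for the page in this thread) is a server that
sends a charset in the HTTP Content-Type header that contradicts the meta
element in the page. HTML 4.01 gives the HTTP header priority over the meta
element, so a mismatch overrides the author's declaration. A minimal Python
sketch of such a check follows; the function names are hypothetical and the
regular expressions are deliberately simple.

```python
import re

def header_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    m = re.search(r'charset\s*=\s*"?([\w.:-]+)"?', content_type, re.IGNORECASE)
    return m.group(1).lower() if m else None

def meta_charset(html_bytes):
    """Return the charset named in a <meta> element in the first 1024 bytes, or None."""
    # Only ASCII matters for finding the declaration; other bytes are replaced.
    head = html_bytes[:1024].decode("ascii", errors="replace")
    m = re.search(r'<meta[^>]*charset\s*=\s*["\']?([\w.:-]+)', head, re.IGNORECASE)
    return m.group(1).lower() if m else None

def charset_report(content_type, html_bytes):
    """Classify the declarations as 'ok', 'conflict', or 'undeclared'."""
    h = header_charset(content_type)
    m = meta_charset(html_bytes)
    if h is None and m is None:
        return "undeclared"  # the browser would have to fall back to autodetection
    if h is not None and m is not None and h != m:
        return "conflict"    # the HTTP header takes precedence over the meta element
    return "ok"
```

Calling charset_report() with the Content-Type header a server actually
sends and the first bytes of the page would show whether the two
declarations agree.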

Regards,

Paul Nelson
IE Text (Beijing)



-----Original Message-----
From: www-html-request@w3.org [mailto:www-html-request@w3.org] On Behalf
Of Kelly Miller
Sent: Friday, June 02, 2006 10:41 AM
To: David Woolley
Cc: www-html@w3.org
Subject: Re: Problem in publishing multilingual HTML document on web in
UTF-8 encoding


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Woolley wrote:
|
| However, in this case, you aren't using HTML, but XHTML.  In my view, 
| it is almost certain that you are doing so for unsound reasons, but 
| there are rules for the character set in XML and in fact the default 
| is already UTF-8!  However, it is likely that you are actually serving
| to Internet Explorer, which doesn't support XHTML, so you've had to 
| serve it with headers that say that it is HTML.  In fact, your meta 
| element also says that it is HTML.  You therefore have a confused 
| situation where you are relying on browser error recovery to treat a 
| document written in XHTML as though it were broken HTML.  I'd suggest 
| the first thing to do is to convert to HTML 4.01 to eliminate the 
| error recovery aspects.

Once again, lack of support in the majority browser hampers adoption of
XHTML over HTML; why would someone want to use XHTML if they have to
treat it as HTML anyway?

As long as IE doesn't support application/xhtml+xml, XHTML will run into
this brick wall, with no way around it (short of IE being fixed or
people changing their browsers)...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFEf6U/vCLXx0V8XHQRAnw3AJ9M2OeRABmkJ4IOwR7yfmoDbdsNzACcDR0f
dso3e0zN0lre2hwC7FCCXVg=
=tHfG
-----END PGP SIGNATURE-----
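
On the application/xhtml+xml point in the quoted message: a common
workaround is to look at the request's Accept header and send the XHTML
media type only to browsers that list it, falling back to text/html for
the rest. The Python sketch below is a simplification under stated
assumptions: the function name is hypothetical, and it ignores wildcards
and the relative ranking of q-values.

```python
def pick_content_type(accept_header):
    """Choose a Content-Type for an XHTML document from an Accept header value."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # a media range without a q parameter defaults to q=1
        for p in parts[1:]:
            if p.lower().startswith("q="):
                try:
                    q = float(p[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        # Only an explicit, non-zero listing counts; "*/*" is deliberately ignored.
        if media_type == "application/xhtml+xml" and q > 0:
            return "application/xhtml+xml; charset=utf-8"
    return "text/html; charset=utf-8"
```

A browser that sends only */* gets text/html, which is the situation
described above: the same document is then processed as HTML rather than
as XML.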
Received on Friday, 2 June 2006 03:03:53 GMT
