RE: Problem in publishing multilingual HTML document on web in UTF-8 encoding

From: Paul Nelson (ATC) <paulnel@winse.microsoft.com>
Date: Thu, 1 Jun 2006 20:04:09 -0700
Message-ID: <49C257E2C13F584790B2E302E021B6F9100C79AF@winse-msg-01.segroup.winse.corp.microsoft.com>
To: "Kelly Miller" <lightsolphoenix@gmail.com>, "David Woolley" <david@djwhome.demon.co.uk>
Cc: <www-html@w3.org>

I find this discussion very interesting.

First, I know that IE has very good international text support...and
has been ahead of most browsers in this area for a number of years.

Second, I know that we have autodetection for the codepage of a
document...just in case the user never set that in the page. The
autodetection has worked well for a number of years.

I suspect that the issue has less to do with publishing multilingual HTML
documents on the web in UTF-8 than with the infrastructure that is being used
to achieve the task. I am aware of many companies that publish
multilingual sites in UTF-8 that work fine with IE.
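
One way to look at the infrastructure rather than the browser is to
compare what the server declares with what the bytes actually are. The
following is only a rough sketch under assumptions: the names
header_charset and check_utf8_declarations are hypothetical, the inputs
are a Content-Type header value and the raw page bytes, and the meta
scan is a simple regular expression over the first 1024 bytes. It is not
a real HTML parser and does not reproduce any browser's autodetection.

```python
import re

# Hypothetical helper for illustration only: finds a charset declared in a
# meta element. A regular expression is a simplification, not an HTML parser.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def header_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not content_type:
        return None
    for part in content_type.split(";")[1:]:
        name, _, value = part.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"').lower()
    return None


def check_utf8_declarations(content_type, body):
    """Report what the header and meta element declare, and whether the
    bytes themselves decode cleanly as UTF-8."""
    match = META_CHARSET.search(body[:1024])
    meta = match.group(1).decode("ascii").lower() if match else None
    try:
        body.decode("utf-8")
        valid = True
    except UnicodeDecodeError:
        valid = False
    return {
        "header": header_charset(content_type),
        "meta": meta,
        "valid_utf8": valid,
    }
```

If "header" and "meta" disagree, or "valid_utf8" is false for a page that
claims UTF-8, the problem is probably in the editor, the server
configuration, or something in between, and not in the browser.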


Paul Nelson
IE Text (Beijing)

-----Original Message-----
From: www-html-request@w3.org [mailto:www-html-request@w3.org] On Behalf
Of Kelly Miller
Sent: Friday, June 02, 2006 10:41 AM
To: David Woolley
Cc: www-html@w3.org
Subject: Re: Problem in publishing multilingual HTML document on web in
UTF-8 encoding

David Woolley wrote:
| However, in this case, you aren't using HTML, but XHTML.  In my view, 
| it is almost certain that you are doing so for unsound reasons, but 
| there are rules for the character set in XML and in fact the default 
| is already UTF-8!  However, it is likely that you are actually serving
| to Internet Explorer, which doesn't support XHTML, so you've had to 
| serve it with headers that say that it is HTML.  In fact, your meta 
| element also says that it is HTML.  You therefore have a confused 
| situation where you are relying on browser error recovery to treat a 
| document written in XHTML as though it were broken HTML.  I'd suggest 
| the first thing to do is to convert to HTML 4.01 to eliminate the 
| error recovery aspects.

Once again, lack of support in the majority browser hampers adoption of
XHTML over HTML; why would someone want to use XHTML if they have to
treat it as HTML anyway?

As long as IE doesn't support application/xhtml+xml, XHTML will run into
this brick wall, with no way around it (short of IE being fixed or
people changing their browsers)...
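
The usual workaround is to pick the media type per request from the
Accept header. What follows is a minimal sketch, assuming a server-side
hook where the Accept header value is available as a string; the
function name choose_media_type is hypothetical and the parsing is
deliberately simplified (it does not rank types by q value, it only
checks whether application/xhtml+xml is listed with a non-zero q).

```python
def choose_media_type(accept_header):
    """Return the Content-Type to send for an XHTML document.

    Sends application/xhtml+xml only when the Accept header lists it
    explicitly with a non-zero q value; otherwise falls back to text/html.
    """
    for item in (accept_header or "").split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # a media range without a q parameter defaults to q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        if q > 0:
            return "application/xhtml+xml; charset=utf-8"
    return "text/html; charset=utf-8"
```

A wildcard such as */* is intentionally not treated as support here,
since a browser that sends only wildcards gives no sign that it can
handle application/xhtml+xml. The fallback branch is still the "XHTML
treated as HTML" situation described above; the sketch only shows where
the decision would be made.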

Received on Friday, 2 June 2006 03:03:53 UTC