Re: allow UTF-16 not just UTF-8 (PR#6774)

From: <don@lexmark.com>
Date: Wed, 8 Oct 2003 12:41:51 -0400
To: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
Cc: elliott.bradshaw@zoran.com, www-html@w3.org
Message-ID: <OF636CC156.A2AE6C16-ON85256DB9.005B67D7@lexmark.com>


So let me understand this....

Because people have poorly designed and written XML applications running on
3 GHz Pentium 4s with 512 megabytes of real memory that do not allow the
control over whether UTF-8 or UTF-16 are emitted, we are expecting to
burden $49 printers with code to be able to detect and interpret both.

I maintain my objection and my no vote.

 Don Wright                 don@lexmark.com

 Chair,  IEEE SA Standards Board
 Member, IEEE-ISTO Board of Directors
 f.wright@ieee.org / f.wright@computer.org

 Director, Alliances & Standards
 Lexmark International
 740 New Circle Rd
 Lexington, Ky 40550
 859-825-4808 (phone) 603-963-8352 (fax)

"BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com> on 10/08/2003 10:24:45 AM

To:    don@lexmark.com
cc:    elliott.bradshaw@zoran.com, www-html@w3.org
Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)


uest  - reply #3

Date: Wed Oct  1 12:43:54 2003

Don and Elliott,

The HTML working group discussed my question of why an XHTML-Print
processor must be a conforming XML processor (in particular, why it must
support both UTF-8 and UTF-16 encodings) on October 1, 2003.

The answer is that an XHTML-Print processor must be a conforming XML
processor and support both UTF-8 and UTF-16 encodings to preserve
compatibility between xml-based applications.

If XHTML-Print processors only supported UTF-8, then an xml-based
application could not be reliably depended upon to emit an XHTML-Print
document that an XHTML-Print application could process.  For example, an
xml-based XForms
application's output of an XHTML-Print document cannot be restricted by the
XHTML-Print specification to UTF-8 since the application may not be able to
control the encoding.

Section 4.3.3 [1] and Appendix F [2] of the XML specification [3] give
heuristics for determining a document's encoding when the charset parameter
of the MIME type [4] is absent.
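
As a rough illustration of those heuristics, the sketch below guesses the
encoding from the first bytes of a document. It is a minimal sketch, not the
full Appendix F procedure: it handles only UTF-8 and UTF-16, it does not read
the encoding declaration, and the function name sniff_xml_encoding is an
assumption made for this example.

```python
import codecs


def sniff_xml_encoding(data: bytes) -> str:
    """Guess UTF-8 vs UTF-16 from the leading bytes of an XML entity.

    Hypothetical helper: covers only the two encodings discussed here
    and ignores UTF-32, EBCDIC, and the encoding declaration.
    """
    # Case 1: a byte order mark is present.
    if data.startswith(codecs.BOM_UTF8):        # EF BB BF
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_BE):    # FE FF
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):    # FF FE
        return "utf-16-le"

    # Case 2: no byte order mark; look at how "<?" is laid out in memory.
    if data.startswith(b"\x00<\x00?"):          # 00 3C 00 3F
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):          # 3C 00 3F 00
        return "utf-16-le"

    # Anything else is treated as UTF-8, the XML default.
    return "utf-8"
```

The check is a handful of byte comparisons on at most four bytes, so the
detection step itself needs no buffering beyond the start of the document.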

An example UTF-16 decoder is available at [5]; other encodings are at [6].
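
To give a sense of the work involved, here is a minimal sketch of a UTF-16
decoder. It is not the Interscript code referenced at [5]; the function name
decode_utf16 and its big_endian parameter are assumptions made for this
example, and the caller is assumed to have already stripped any byte order
mark.

```python
def decode_utf16(data: bytes, big_endian: bool) -> str:
    """Decode UTF-16 bytes (without a byte order mark) into a string.

    Hypothetical helper for illustration; raises ValueError on
    truncated input or unpaired surrogates.
    """
    if len(data) % 2:
        raise ValueError("UTF-16 input must have an even number of bytes")

    order = "big" if big_endian else "little"
    # Split the byte stream into 16-bit code units.
    units = [int.from_bytes(data[i:i + 2], order)
             for i in range(0, len(data), 2)]

    chars = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: must be followed by a low surrogate.
            if i + 1 >= len(units) or not 0xDC00 <= units[i + 1] <= 0xDFFF:
                raise ValueError("unpaired high surrogate")
            low = units[i + 1]
            code_point = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
            chars.append(chr(code_point))
            i += 2
        elif 0xDC00 <= unit <= 0xDFFF:
            raise ValueError("unpaired low surrogate")
        else:
            # Basic Multilingual Plane: the code unit is the code point.
            chars.append(chr(unit))
            i += 1
    return "".join(chars)
```

For example, decode_utf16("<p>".encode("utf-16-be"), big_endian=True)
returns "<p>".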

Jim Bigelow

[1] http://www.w3.org/TR/REC-xml#charencoding
[2] http://www.w3.org/TR/REC-xml#sec-guessing
[3] http://www.w3.org/TR/REC-xml
[4] http://www.ietf.org/rfc/rfc3023.txt
[5] http://interscript.sourceforge.net/interscript/doc/en_iscr_0282.html
[6] http://interscript.sourceforge.net/interscript/doc/en_iscr_0275.html

Received on Wednesday, 8 October 2003 13:01:27 UTC
