Re: allow UTF-16 not just UTF-8 (PR#6774)

Jim:

So let me understand this....

Because people have poorly designed and written XML applications running on
3 GHz Pentium 4s with 512 megabytes of real memory that do not allow
control over whether UTF-8 or UTF-16 is emitted, we are expecting to
burden $49 printers with code to detect and interpret both.

I maintain my objection and my no vote.

**********************************************
 Don Wright                 don@lexmark.com

 Chair,  IEEE SA Standards Board
 Member, IEEE-ISTO Board of Directors
 f.wright@ieee.org / f.wright@computer.org

 Director, Alliances & Standards
 Lexmark International
 740 New Circle Rd
 Lexington, Ky 40550
 859-825-4808 (phone) 603-963-8352 (fax)
**********************************************






"BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com> on 10/08/2003 10:24:45 AM

To:    don@lexmark.com
cc:    elliott.bradshaw@zoran.com, www-html@w3.org
Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)


From
http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6774;user=guest
- reply #3

Date: Wed Oct  1 12:43:54 2003

Don and Elliott,

On October 1, 2003, the HTML working group discussed my question of why an
XHTML-Print processor must be a conforming XML processor (in particular,
why it must support both UTF-8 and UTF-16 encodings).

The answer is that an XHTML-Print processor must be a conforming XML
processor and support both UTF-8 and UTF-16 encodings to preserve
compatibility between XML-based applications.

If XHTML-Print processors supported only UTF-8, then an XML-based
application could not be relied upon to emit an XHTML-Print document that
the XHTML-Print application could process.  For example, the XHTML-Print
specification cannot restrict an XML-based XForms application's XHTML-Print
output to UTF-8, since the application may not be able to control the
encoding.

Section 4.3.3 [1] and Appendix F [2] of the XML specification [3] give
heuristics for determining a document's encoding when the charset parameter
of the MIME type [4] is absent.
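
As a rough illustration of how small that detection step can be, here is a
minimal sketch in Python of the byte-order-mark and "<?" checks described in
Appendix F, restricted to UTF-8 and UTF-16.  The function name
sniff_xml_encoding is hypothetical, the UCS-4 and EBCDIC cases from
Appendix F are deliberately left out, and a real processor would still go on
to read the encoding declaration.

```python
def sniff_xml_encoding(data: bytes) -> str:
    """Guess UTF-8 vs UTF-16 from the first bytes of an XML entity.

    Hypothetical helper: covers only the UTF-8 / UTF-16 rows of the
    Appendix F table and ignores UCS-4, EBCDIC and other encodings.
    """
    # Case 1: a byte order mark is present.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"

    # Case 2: no byte order mark; look at how "<?" is laid out in memory.
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"

    # Anything else is treated as UTF-8, the XML default when no other
    # encoding information is available.
    return "utf-8"
```

For example, sniff_xml_encoding(b"\xff\xfe<\x00") reports "utf-16-le", and a
document that begins with plain ASCII "<?xml" falls through to "utf-8".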

An example UTF-16 decoder is available at [5]; other encodings are covered
at [6].
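
To give a sense of the work involved, the following is a minimal sketch of a
UTF-16 decoder in Python.  It is not the decoder linked at [5]; the name
decode_utf16 and its big_endian parameter are hypothetical, it assumes any
byte order mark has already been stripped, and it substitutes U+FFFD for
unpaired surrogates as one possible error policy.

```python
from typing import List


def decode_utf16(data: bytes, big_endian: bool = True) -> List[int]:
    """Decode UTF-16 bytes (BOM already removed) into Unicode code points.

    Hypothetical helper for illustration only. A trailing odd byte is
    ignored; unpaired surrogates become U+FFFD.
    """
    order = "big" if big_endian else "little"
    # Split the byte stream into 16-bit code units.
    units = [int.from_bytes(data[i:i + 2], order)
             for i in range(0, len(data) - 1, 2)]

    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Combine a high/low surrogate pair into one code point.
            high = unit - 0xD800
            low = units[i + 1] - 0xDC00
            code_points.append(0x10000 + (high << 10) + low)
            i += 2
        elif 0xD800 <= unit <= 0xDFFF:
            # Unpaired surrogate: substitute the replacement character.
            code_points.append(0xFFFD)
            i += 1
        else:
            # Ordinary Basic Multilingual Plane character.
            code_points.append(unit)
            i += 1
    return code_points
```

The only part beyond reading two bytes at a time is the surrogate-pair
arithmetic, which is needed for characters above U+FFFF.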

Jim Bigelow

[1] http://www.w3.org/TR/REC-xml#charencoding
[2] http://www.w3.org/TR/REC-xml#sec-guessing
[3] http://www.w3.org/TR/REC-xml
[4] http://www.ietf.org/rfc/rfc3023.txt
[5] http://interscript.sourceforge.net/interscript/doc/en_iscr_0282.html
[6] http://interscript.sourceforge.net/interscript/doc/en_iscr_0275.html

Jim
 http://oz.boi.hp.com/~jhb/

Received on Wednesday, 8 October 2003 13:01:27 UTC