- From: <don@lexmark.com>
- Date: Thu, 16 Oct 2003 08:51:43 -0400
- To: "Steven Pemberton" <steven.pemberton@cwi.nl>
- Cc: <don@lexmark.com>, "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>, <w3c-html-wg@w3.org>, <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>, <www-html@w3.org>
Steven:

I think your answer proves my point that the XML community did not and does not consider the limitations of low-cost, constrained embedded environments when developing XML. You assert that no extra memory is required, yet the reality is quite the opposite. Please tell me if I'm wrong, but my understanding of UTF-8 and UTF-16 is that:

1) Every XHTML tag will require twice as many bytes when represented in UTF-16 versus UTF-8.
2) Every English XHTML-Print print job will be twice as big encoded with UTF-16 versus UTF-8.
3) Every "Latin 1" print job will be larger, approaching 2X in size.

When you double the data's size, buffers have to double to be able to hold and manipulate an equivalent amount of print stream content. There are real cost and performance penalties to be paid for dealing with UTF-16 encoding, especially when dealing with Western character sets. When a device is designed to deal with the Far East "characters" there are other penalties to be paid, in things like the size of the font load, that mitigate the UTF-16 versus UTF-8 encoding issue.

*******************************************
Don Wright                 don@lexmark.com

Chair,  IEEE SA Standards Board
Member, IEEE-ISTO Board of Directors
       f.wright@ieee.org / f.wright@computer.org

Director, Alliances and Standards
Lexmark International
740 New Circle Rd   C14/082-3
Lexington, Ky 40550
859-825-4808 (phone) 603-963-8352 (fax)
*******************************************

"Steven Pemberton" <steven.pemberton@cwi.nl> on 10/15/2003 07:26:24 PM

To:   <don@lexmark.com>
cc:   "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>,
      <w3c-html-wg@w3.org>, <don@lexmark.com>,
      <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
      <www-html@w3.org>
Subject:  Re: allow UTF-16 not just UTF-8 (PR#6774)

But support for UTF 16 adds a few dozen bytes of code, and no extra memory requirements. It is simpler than UTF 8! What's the problem?
Steven

----- Original Message -----
From: <don@lexmark.com>
To: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
Cc: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>; <w3c-html-wg@w3.org>; <don@lexmark.com>; <voyager-issues@mn.aptest.com>; <elliott.bradshaw@zoran.com>; <www-html@w3.org>
Sent: Thursday, October 16, 2003 12:20 AM
Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)

>
> Steven, et al:
>
> The real problem is that the entire XML architecture was designed assuming
> high end boxes like the 3 GHz Pentium with 512 megabytes of memory. We
> have already seen push back in other standards groups that consumer
> electronic devices and other smaller, lighter devices cannot afford all the
> luxuries demanded by an obese XML architecture. Unless the XML community
> accepts subsetting, we can't expect the broadest support for XML to happen
> at the low end until the price/performance ratios experience another order
> or two of magnitude improvement. As recently reported in several of the trade
> magazines focused on IT professionals, the deployment of XML and Web
> Services is having significant negative impacts on the IT infrastructure,
> especially in the area of bandwidth utilization. This is just another
> symptom of the same problem.
>
> I know I will lose this argument in the W3C but the realities of the
> XHTML-Print implementations will blow off UTF-16 as more fat with no
> benefit and simply not support it, "interoperable" or not.
>
> Sorry I'm not pure but practical.
>
> *******************************************
> Don Wright                 don@lexmark.com
>
> Chair,  IEEE SA Standards Board
> Member, IEEE-ISTO Board of Directors
>        f.wright@ieee.org / f.wright@computer.org
>
> Director, Alliances and Standards
> Lexmark International
> 740 New Circle Rd   C14/082-3
> Lexington, Ky 40550
> 859-825-4808 (phone) 603-963-8352 (fax)
> *******************************************
>
> "Steven Pemberton" <Steven.Pemberton@cwi.nl> on 10/15/2003 09:18:15 AM
>
> To:   "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>,
>       <w3c-html-wg@w3.org>, <don@lexmark.com>
> cc:   <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
>       <www-html@w3.org>
> Subject:  Re: allow UTF-16 not just UTF-8 (PR#6774)
>
> > From: don@lexmark.com [mailto:don@lexmark.com]
> >
> > So let me understand this....
> >
> > Because people have poorly designed and written XML applications running on
> > 3 GHz Pentium 4s with 512 megabytes of real memory that do not allow the
> > control over whether UTF-8 or UTF-16 are emitted, we are expecting to burden
> > $49 printers with code to be able to detect and interpret both.
>
> No Don. It is about interoperability and conforming to standards. XML allows
> documents to be encoded in either UTF8 or UTF 16: consumers must accept
> both, producers may produce either. An XHTML-Print printer will be just a
> consumer of an XML byte-stream at some IP address; we don't want to burden
> every program in the world that can produce XML with a switch that says
> "this output is going to a poor lowly XHTML Print processor that can't deal
> with UTF-16, so please produce UTF-8", especially since UTF 16 is the easy
> one to implement, and can only cost a few dozen bytes at best.
>
> If we changed this, XHTML Print would have to go back to last call, and you
> can bet your boots that the XML community would rise up against us, as it
> has in the past, and I can tell you we don't want to go there, and we would
> have a hundred people registering objections.
>
> Conforming to XML requirements comes with the territory of being XHTML. The
> XML community will not take lightly to us messing with their standards.
>
> Best wishes,
>
> Steven Pemberton
>
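
The size claims at the top of this thread (markup and English text doubling under UTF-16, Latin 1 text approaching 2X) and the point about detecting both encodings can be checked with a small sketch. This is only an illustration, not code from any XHTML-Print implementation: the function names `encoded_sizes` and `sniff_encoding` and the sample strings are made up here, and the detection step assumes the UTF-16 stream starts with a byte order mark, as XML requires for UTF-16 entities.

```python
from typing import Tuple

# Illustrative only: helper names and sample strings are hypothetical,
# not taken from any real print job or printer firmware.


def encoded_sizes(text: str) -> Tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count) for text, without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


def sniff_encoding(data: bytes) -> str:
    """Guess the encoding of a byte stream from a leading UTF-16 BOM only."""
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    return "utf-8"  # no UTF-16 BOM seen: assume UTF-8


samples = {
    "markup only": "<html><body><p></p></body></html>",
    "English": "<p>The quick brown fox jumps over the lazy dog.</p>",
    "Latin 1": "<p>Élève, déjà vu, señor, Größe, façade</p>",
    "Japanese": "<p>日本語の印刷ジョブ</p>",
}

for label, text in samples.items():
    u8, u16 = encoded_sizes(text)
    print(f"{label:12} utf-8={u8:3d}  utf-16={u16:3d}  ratio={u16 / u8:.2f}")
```

Pure ASCII content (tags and English text) comes out at exactly twice the size in UTF-16. Accented Latin 1 characters take two bytes in both encodings, so the ratio for such text falls somewhat below 2 depending on how much of it is ASCII. Characters such as kanji and kana take three bytes in UTF-8 and two in UTF-16, which is why the comparison looks different for Far East content.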
Received on Thursday, 16 October 2003 09:00:08 UTC