
Re: allow UTF-16 not just UTF-8 (PR#6774)

From: Michael Sweet <mike@easysw.com>
Date: Fri, 17 Oct 2003 08:13:16 -0400
Message-ID: <3F8FDCDC.30406@easysw.com>
To: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
Cc: Steven Pemberton <steven.pemberton@cwi.nl>, don@lexmark.com, w3c-html-wg@w3.org, voyager-issues@mn.aptest.com, elliott.bradshaw@zoran.com, www-html@w3.org

BIGELOW,JIM (HP-Boise,ex1) wrote:
> ...
> If a printer uses 16 bits internally to represent a character, then there
> shouldn't be a difference in buffering requirements between utf-8 and utf-16
> encoded files (see below for a more complete discussion).  However, if a
> printer uses 8 bits per character, then it has restricted itself to only
> handle a subset of possible documents, those with ASCII characters.  This is
> ...

I suggest there is an alternative: the implementation can simply
convert UTF-16 to UTF-8 as the document is being read. Contrary
to the previous comments, this needs no additional buffer memory,
merely a small amount of code to do the UTF-16 to UTF-8
conversion.
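
To illustrate the point about buffering, here is a rough sketch of such a
streaming conversion. It is written in Python purely for readability; the
function names are hypothetical, and it assumes the byte order is already
known (no BOM sniffing) and that unpaired surrogates should become U+FFFD.
The only state carried between reads is one pending high surrogate, so
memory use should not grow with document size.

```python
import io


def encode_utf8(cp):
    """Encode a single Unicode code point as UTF-8 bytes."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def utf16_to_utf8_stream(src, dst, big_endian=True):
    """Copy UTF-16 from src to dst as UTF-8, one 16-bit unit at a time.

    Assumption: byte order is supplied by the caller. A leading BOM,
    if present, is passed through as U+FEFF rather than interpreted.
    """
    order = "big" if big_endian else "little"
    pending_high = None  # the only state kept between reads
    while True:
        raw = src.read(2)
        if len(raw) < 2:
            break  # end of input; a trailing odd byte is dropped here
        unit = int.from_bytes(raw, order)
        if pending_high is not None:
            if 0xDC00 <= unit <= 0xDFFF:
                # Complete surrogate pair: combine into one code point.
                cp = (0x10000 + ((pending_high - 0xD800) << 10)
                      + (unit - 0xDC00))
                pending_high = None
                dst.write(encode_utf8(cp))
                continue
            # Unpaired high surrogate: substitute U+FFFD, then
            # process the current unit normally below.
            dst.write(encode_utf8(0xFFFD))
            pending_high = None
        if 0xD800 <= unit <= 0xDBFF:
            pending_high = unit
        elif 0xDC00 <= unit <= 0xDFFF:
            dst.write(encode_utf8(0xFFFD))  # unpaired low surrogate
        else:
            dst.write(encode_utf8(unit))
    if pending_high is not None:
        dst.write(encode_utf8(0xFFFD))  # input ended mid-pair


# Usage: convert an in-memory UTF-16BE document.
source = io.BytesIO("caf\u00e9 \u20ac".encode("utf-16-be"))
sink = io.BytesIO()
utf16_to_utf8_stream(source, sink)
```

A firmware version would presumably do the same thing in C against the
input channel, but the shape of the loop should be much the same.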

Whether the implementation chooses to limit support to "latin"
text or not is another issue, but either way the *internal*
representation can be chosen by the vendor independently of the
external UTF-8/UTF-16/whatever representation.

-- 
______________________________________________________________________
Michael Sweet, Easy Software Products           mike at easysw dot com
Printing Software for UNIX                       http://www.easysw.com
Received on Friday, 17 October 2003 08:21:49 GMT
