Re: i18n reviews of DOM 3 Core and Load&Save from Johnny Stenback on 2003-09-19 (www-dom@w3.org from July to September 2003)

From: Johnny Stenback <jst@w3c.jstenback.com>
Date: Fri, 19 Sep 2003 12:27:09 -0700
To: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Cc: Joseph Kesselman <keshlam@us.ibm.com>, Francois Yergeau <FYergeau@alis.com>, "'www-dom@w3.org'" <www-dom@w3.org>, www-dom-request@w3.org
Message-ID: <3F6B588D.9040708@w3c.jstenback.com>

Elliotte Rusty Harold wrote:

> At 9:06 AM -0400 9/18/03, Joseph Kesselman wrote:
> 
> 
>> Since parsers are required to accept all three (UTF8 and both byte-orders
>> of UTF16, with appropriate byte-order mark), generating any of the three as
>> the default output encoding should result in a document that all parsers
>> will accept.
> 
> 
> I'm thinking of Java code (or C++, or Perl), but not XML. If I tell the 
> serializer to generate UTF-8 I don't want it to work in one 
> implementation but fail in another that only supports UTF-16.
> 

I'm thinking of a closed system where you know that you'll never get 
anything other than UTF-8 (or whichever one you pick). In such cases I 
don't want to *force* code bloat on the implementation just to be able 
to claim compliance. Whether an implementation supports all of the above 
is, IMO, simply a matter of quality of implementation. I would imagine 
that most implementations where this matters would simply use a Unicode 
library of some sort that can do all these conversions, plus who knows 
what others, and in such cases they would all work. That's what we all 
want in most cases, but I don't want to make the spec *require* that in 
all cases.
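
As a rough illustration of the distinction above, here is a minimal 
Python sketch. It is not taken from the DOM specification or from any 
real implementation: the names MinimalSerializer, LibraryBackedSerializer 
and UnsupportedEncodingError are hypothetical, and Python's standard 
codecs module stands in for "a Unicode library of some sort".

```python
import codecs


class UnsupportedEncodingError(ValueError):
    """Hypothetical error: the serializer does not implement this encoding."""


class MinimalSerializer:
    """Hypothetical closed-system serializer that only ever emits UTF-8."""

    SUPPORTED = {"utf-8"}

    def write(self, text, encoding="UTF-8"):
        # Normalize spelling variants such as "UTF8" or "utf_8".
        name = codecs.lookup(encoding).name
        if name not in self.SUPPORTED:
            # Report the limitation instead of carrying extra encoders.
            raise UnsupportedEncodingError(encoding)
        return text.encode(name)


class LibraryBackedSerializer(MinimalSerializer):
    """Hypothetical serializer that delegates to the codec library,
    so it handles every encoding the library knows about."""

    def write(self, text, encoding="UTF-8"):
        return text.encode(codecs.lookup(encoding).name)
```

Under these assumptions, MinimalSerializer().write("<a/>", "UTF-16") 
raises UnsupportedEncodingError, while the library-backed variant 
returns UTF-16 bytes with a byte-order mark. Which of the two a given 
implementation resembles is the quality-of-implementation question.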

-- 
jst

Received on Friday, 19 September 2003 15:27:44 UTC