- From: Allen, Michael B (RSCH) <Michael_B_Allen@ml.com>
- Date: Tue, 19 Feb 2002 18:40:54 -0500
- To: "'Philippe Le Hegaret'" <plh@w3.org>, "WWW DOM" <www-dom@w3.org>
> -----Original Message-----
> From: Philippe Le Hegaret [SMTP:plh@w3.org]
> Sent: Tuesday, February 19, 2002 6:23 PM
> To: WWW DOM
> Subject: Re: DOMString Character Encoding
>
> On Sun, 2002-02-17 at 19:15, Allen, Michael B (RSCH) wrote:
> > Specifying the type is one thing, but specifying the encoding is another.
> > Making it UTF-16 (big endian, little endian, w/wo BOM?) unnecessarily
> > constrains the implementation. I know first hand it creates a significant barrier
> > for C. It requires that the implementation provide all the usual string
> > manipulation functions. Consider what would happen if the DOMString type
> > were defined as a long and specified the encoding as UCS-4BE? What would
> > the Java language binding look like?
>
> see
> [[
> Applications must encode DOMString using UTF-16
> ]]
> http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/core.html#ID-C74D1578
>
> big endian or little endian is platform dependent. I don't think that
> the BOM doesn't have anything to do in a DOMString.
>

Internally I doubt Java Strings have BOMs but if you serialize one they sure do.
But that doesn't matter because Java users should never be concerned with the
actual character encoding of Java Strings. I'm trying to make the same point
about DOMString but I'm not sure anyone has acknowledged they even know what
I'm talking about. Or I'm missing something fundamental here.

Mike
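
A minimal Java sketch of the serialization point above, assuming "serialize" means encoding the String to bytes with the generic UTF-16 charset. The class name BomDemo and the hex helper are hypothetical, chosen only for illustration. It shows that the byte order mark appears only in the encoded bytes, and only when the byte order is left unspecified, while the String itself exposes nothing but its code units.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any DOM binding.
public class BomDemo {
    // Render bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String s = "A";

        // The String API deals in code units; no encoding detail is visible here.
        System.out.println(s.length());

        // Generic UTF-16: Java's encoder writes a big-endian BOM (FE FF) first.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16)));   // FE FF 00 41

        // Explicit byte order: no BOM is written.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16BE))); // 00 41
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16LE))); // 41 00
    }
}
```

Byte order and the BOM come up only at the moment a byte encoding is chosen, which is the distinction between the string type and its encoding that the message draws.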
Received on Tuesday, 19 February 2002 18:40:58 UTC