Re: DOMString Character Encoding from Philippe Le Hegaret on 2002-02-19 (www-dom@w3.org from January to March 2002)

From: Philippe Le Hegaret <plh@w3.org>
Date: 19 Feb 2002 18:23:15 -0500
To: WWW DOM <www-dom@w3.org>
Message-Id: <1014160995.11966.124.camel@jfouffa>

On Sun, 2002-02-17 at 19:15, Allen, Michael B (RSCH) wrote:
> 	Specifying the type is one thing, but specifying the encoding is another.
> 	Making it UTF-16 (big endian, little endian, w/wo BOM?) unnecessarily
> 	constrains the implementation. I know first hand it creates a significant barrier
> 	for C. It requires that the implementation provide all the usual string
> 	manipulation functions. Consider what would happen if the DOMString type
> 	were defined as a long and specified the encoding as UCS-4BE? What would
> 	the Java language binding look like?

See the definition of DOMString in DOM Level 2 Core:
[[
Applications must encode DOMString using UTF-16
]]
http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/core.html#ID-C74D1578

Whether it is big endian or little endian is platform dependent. I don't
think the BOM has any place in a DOMString.
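
To make the distinction concrete, here is a minimal sketch in Python (not taken from the DOM specification; the helper name utf16_code_units and the sample string are my own illustrative assumptions). It treats a DOMString-like value as a sequence of 16-bit code units, and shows that byte order and the BOM only come into play once those units are serialized to bytes.

```python
import struct
import sys


def utf16_code_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as plain 16-bit integers.

    Hypothetical helper for illustration; the DOM itself defines no such call.
    """
    raw = text.encode("utf-16-le")  # explicit byte order, so no BOM is emitted
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


# 'a', U+00E9, and U+1D11E (outside the BMP, so it needs a surrogate pair)
sample = "a\u00e9\U0001d11e"
units = utf16_code_units(sample)

# The code-unit values are the same on every platform.
print([hex(u) for u in units])

# Byte order shows up only in the serialized forms.
big = sample.encode("utf-16-be")
little = sample.encode("utf-16-le")
native = little if sys.byteorder == "little" else big

# The generic "utf-16" codec prepends a BOM and uses native byte order.
# The BOM belongs to this byte stream, not to the string's code units.
with_bom = sample.encode("utf-16")
print(with_bom[:2], with_bom[2:] == native)
```

In memory, an implementation stores the 16-bit units in whatever order the platform uses for 16-bit integers; a BOM is only meaningful when the bytes leave the process.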

Philippe

Received on Tuesday, 19 February 2002 18:23:15 UTC