- From: David Brownell <david-b@pacbell.net>
- Date: Mon, 18 Feb 2002 13:52:43 -0800
- To: "Allen, Michael B (RSCH)" <Michael_B_Allen@ml.com>
- Cc: www-dom@w3.org
> Sure, but I don't think you have to define what the DOMString *character
> encoding* is. DOMString could just be the standard string type for that
> language. In C this would be a pointer to 'char'. (The encoding of the
> string object this pointer points to is the locale-dependent character
> encoding, such as ISO-8859-5 or UTF-8, but my point is this shouldn't
> matter.)

But it _does_ matter whether the representation supports all XML characters. ァ is not representable in 8859-5, but it is representable in XML -- and in UTF-8 or UTF-16.

If you're saying that different environments have different ways to handle such variability (like <wchar.h> etc. in C), sure; but if you're saying that it's OK to assume a single restrictive locale and encoding, you've got a problem on your hands. In XML processing you can't make the simplifying assumption that "only strings in this system's locale will ever appear".

> Making it UTF-16 (big endian, little endian, w/wo BOM?) unnecessarily
> constrains the implementation. I know first hand it creates a significant
> barrier for C. It requires that the implementation provide all the usual
> string manipulation functions.

Well, yes. Not that I do much C++ hacking anymore, but in what sense could an API be portable if no code using it could be portable? And if it couldn't actually represent ALL the data that has to go through the API?

Seems to me the barrier you're talking about is a widely recognized gap in older C/C++ environments: poor I18N support. That gap is one reason that Java caught on so well for XML.

- Dave
Received on Monday, 18 February 2002 16:54:27 UTC