Re: I18N issue needs consideration from Steve Byrne on 1997-06-13 (w3c-sgml-wg@w3.org from June 1997)

From: Steve Byrne <sbb@Eng.Sun.COM>
Date: Fri, 13 Jun 1997 15:16:48 -0700
To: w3c-sgml-wg@w3.org
Message-Id: <199706132216.PAA18556@javinator.eng.sun.com>

Gavin Nicol writes:
 > 
 > And as you know, I suggested we just have the DOM to say "string",
 > and nothing more. This is perfectly reasonable, I believe.
 > 

Gavin,

This is the second time I've seen a message in which you've made a statement
like this, and I'm curious about your thinking.  It seems to me that to meet
the stated goals of the DOM, i.e. to have consistent, portable scripts
manipulating documents (in particular text, and maybe even attribute values),
you would have to be a little more concrete than that.

For example, if a script is iterating over or counting the characters in a
text object that was retrieved from the DOM, doesn't the result depend on the
encoding of the characters in the text object as presented by the DOM (which
may be different from their internal representation)?  If the DOM doesn't
specify a particular encoding, doesn't it open the way for one implementation
to say that it uses UTF-8 encoding for text content returned from the DOM,
another to say that it uses Unicode code points, and a third to have its
strings composed of 31-bit characters?  Won't scripts executing on the
different implementations have radically different behavior?
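
To make the concern concrete, here is a minimal sketch in Python (chosen only
for illustration; the function name and the sample text are hypothetical, not
anything from a DOM draft).  It assumes the three implementations expose
strings as UTF-8 bytes, as 16-bit units, and as one unit per character
respectively, and reports the "length" a script would see in each case.

```python
def length_in_units(text: str) -> dict:
    """Return the length of `text` as three hypothetical DOM
    implementations might report it (illustrative assumption only)."""
    return {
        # Implementation A: strings are sequences of UTF-8 bytes.
        "utf8_bytes": len(text.encode("utf-8")),
        # Implementation B: strings are sequences of 16-bit units.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # Implementation C: one unit per character (UCS-4 style).
        "code_points": len(text),
    }


# "naive" with a diaeresis, a space, and one character outside the
# 16-bit range (U+10400).
sample = "na\u00efve \U00010400"

for unit, count in length_in_units(sample).items():
    print(f"{unit}: {count}")
```

The same text yields three different counts, so a script that indexes or
truncates by position would behave differently on each implementation.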

Can you help me understand why you don't think this is a problem, i.e. how
this concern can be finessed away?

Steve

Received on Friday, 13 June 1997 21:45:13 UTC