
Re: I18N issue needs consideration

From: Gavin Nicol <gtn@eps.inso.com>
Date: Thu, 12 Jun 1997 08:24:54 -0400
Message-Id: <199706121224.IAA23775@nathaniel.eps.inso.com>
To: w3c-sgml-wg@w3.org
>>I would favor using ISO 10646 as the coded character set for the
>>SGML declaration for XML, and specifying that the character
>>*repertoire* available within XML is that of ISO 10646. I could be
>>convinced to line up with Unicode in this regard.
>
>A character is represented indirectly via a numeric character reference
>using a single numeral per character.  It only makes sense to represent
>high-order 10646 characters via a single long numeral, such as up to
>eight digits hex.

Right. The operative word here is "indirectly". 
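
To make the indirection concrete, here is a minimal Python sketch that resolves
numeric character references of the form `&#DDDD;` or `&#xHHHH;`, one numeral
per character. It is not taken from any XML draft: the function name
`resolve_char_refs` and the regular expression are illustrative assumptions,
and the eight-hex-digit limit simply follows the suggestion quoted above.

```python
import re

# Matches decimal (&#65;) or hexadecimal (&#x41;) numeric character references.
# Up to eight hex digits are accepted, per the suggestion quoted above.
_CHAR_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]{1,8})|([0-9]{1,10}));")


def resolve_char_refs(text: str) -> str:
    """Replace each numeric character reference with the character it names."""

    def _sub(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Python's chr() only covers 0..0x10FFFF; leave anything larger as-is.
        if code_point > 0x10FFFF:
            return match.group(0)
        return chr(code_point)

    return _CHAR_REF.sub(_sub, text)
```

A high-order character such as `&#x00010000;` resolves to a single character
here, whatever bytes the surrounding document happens to be stored in.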

>>There is one more issue, and that is the question of how the
>>application represents/interprets characters. I personally like to
>>view characters as a purely abstract object, thereby leaving the
>>widest possible choice of implementation strategies, though this does
>>not seem to be the model favoured by SGML (this *is* the model for
>>HTML).
>
>In fact, this *is* the "new" SGML model.  Personally, I'd like to see
>it made official with the TC, not even waiting for the revision.  As
>you say, it's the model for HTML--which is one reason that the "new"
>SGML model came up for discussion in the first place.  It's highly
>appropriate.

This is good news. I actually proposed/defined the HTML character
model, and one reason for the stance was to allow older browsers to
still be valid within the new model. Also, intuitively, this makes
sense, because a character *is* an abstract object.
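
As an illustration of that abstraction, the following Python sketch shows one
abstract character carried by three different byte encodings and recovered as
the same code point each time. It assumes Python's built-in codecs and is not
drawn from the HTML or SGML specifications; the choice of character and
encodings is arbitrary.

```python
# The abstract character LATIN SMALL LETTER E WITH ACUTE, code point U+00E9.
CHARACTER = "\u00e9"

# Three different concrete byte representations of that one character.
encoded = {
    "latin-1": CHARACTER.encode("latin-1"),      # b'\xe9'
    "utf-8": CHARACTER.encode("utf-8"),          # b'\xc3\xa9'
    "utf-16-be": CHARACTER.encode("utf-16-be"),  # b'\x00\xe9'
}

# Decoding each byte sequence recovers the same abstract character.
decoded = {name: data.decode(name) for name, data in encoded.items()}
code_points = {ord(ch) for ch in decoded.values()}
```

The byte sequences all differ, yet `code_points` holds a single value, which is
the sense in which an implementation stays free to choose its own internal
representation.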
Received on Thursday, 12 June 1997 08:25:37 EDT
