Re: questions on XML sgml decl's charsets
>If all XML uses the same SGML declaration, and that declaration specifies
>ISO 10646, then all XML machines must be ISO 10646 machines. Or are
>we planning to allow different XML documents to have different SGML
The character repertoire is fixed, though we are allowing different
encodings as input, which are to be translated by the
appropriate BCTF/decoder into a set of bit combinations.
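
To make the translation step concrete, here is a minimal sketch in Python. It assumes Python's built-in codecs can stand in for the BCTF/decoder; the function name decode_entity is hypothetical and not part of any proposal. The point is only that two different stored encodings of the same text come out as the same sequence of ISO 10646 character numbers.

```python
def decode_entity(raw: bytes, encoding: str) -> list[int]:
    """Translate stored bytes into ISO 10646 character numbers.

    `encoding` names the storage encoding; Python's codec for it plays
    the role of the BCTF/decoder here (an assumption for illustration).
    Malformed input raises UnicodeDecodeError.
    """
    text = raw.decode(encoding)
    return [ord(ch) for ch in text]


# The same four characters, stored in two different encodings.
latin1_bytes = b"caf\xe9"      # ISO 8859-1
utf8_bytes = b"caf\xc3\xa9"    # UTF-8

# Both decode to the same character numbers in the fixed repertoire.
assert decode_entity(latin1_bytes, "iso-8859-1") == decode_entity(utf8_bytes, "utf-8")
```

After this step, everything downstream sees only character numbers and never needs to know which encoding the entity was stored in.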
>Note that specifying Unicode as the "document character set" (1) specifies
>what character numbers in numerical character references are to be
>interpreted as what characters, and (2) specifies what characters (however
>represented in the system character representation) are legal in the
>The interesting question is: How do you deal with a legal SGML character
>that your system has no internal representation for? I haven't thought
>that one through. In some cases, I suspect your entity manager could
>convert the character as stored in the storage representation into a
>numeric character reference. But that can break down if the character
>is used in markup. Hmmm.... ;-)
SGML is silent on this issue, I believe (we went through this for
HTML, and I invoked this silence as a way of saying older browsers
would be conformant with the HTML I18N draft).
I think we could make this a reportable error (not a fatal error).
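
A rough sketch of how the reportable error might look in practice, combined with the numeric character reference fallback suggested in the quoted text. The assumptions are mine: the system's internal representation is taken to be ISO 8859-1 purely for illustration, the function name to_system_repr is hypothetical, and the input is assumed to be data characters only, since (as noted above) the substitution breaks down for characters used in markup.

```python
def to_system_repr(text: str, system_encoding: str = "iso-8859-1"):
    """Rewrite characters the system cannot represent as numeric
    character references, and collect a report for each one.

    Returns (converted_text, reports), where each report is a
    (position, character_number) pair. Nothing is raised: the problem
    is reportable, not fatal.
    """
    out = []
    reports = []
    for pos, ch in enumerate(text):
        try:
            ch.encode(system_encoding)       # representable as-is?
            out.append(ch)
        except UnicodeEncodeError:
            out.append(f"&#{ord(ch)};")      # decimal character reference
            reports.append((pos, ord(ch)))   # remember what to report
    return "".join(out), reports


converted, reports = to_system_repr("a\u03a9b")  # GREEK CAPITAL LETTER OMEGA
assert converted == "a&#937;b"
assert reports == [(1, 937)]
```

Python's standard "xmlcharrefreplace" error handler for str.encode performs the same substitution but does not collect the reports, which is why the loop is written out here.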
A more interesting question is what to do with characters that do not
appear within the character repertoire we have defined...