Re: XML character sets: a proposal



[Tim writes]
>Whichever way we go on this, here's a question.  Suppose we have a mechanism
>for flagging the encoding of some text, and the top-level entity of some
>XML doc has been thus flagged as being in encoding X.  Is it then a 
>requirement that all external entities referenced from the top level also
>be in encoding X, or do we need to re-evaluate, for example by
>checking for James' proposed 0xfffe? 

I have been through this so many times in various forums...

If you have a document entity in, say, Shift-JIS, I can see no good reason
at all for requiring that all external entities also be in Shift-JIS.
For example, say I have a document entity on an NT machine in Japan.
It uses Shift-JIS. It includes an entity from China in Big5, another
entity from Thailand in TIS, and another from Europe in ISO-8859-1.

Provided we have a single document character set, and some way of
indicating the encoding of each entity (preferably via the MIME type,
or via FSI attributes), then there is no problem at all, and in fact
there is a great deal of benefit in such a system.
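
To make the point concrete, here is a rough sketch in Python of what I
mean. It assumes Unicode as the single document character set and
assumes each entity arrives with an encoding label from somewhere (a
MIME charset parameter, say). The names decode_entity and assemble are
made up for illustration; this is not any parser's real interface.

```python
# Sketch only: each entity carries its own encoding label, and every
# entity is decoded into the one document character set (Unicode here).

def decode_entity(raw: bytes, declared_encoding: str) -> str:
    """Decode one entity's bytes using that entity's own declared encoding."""
    return raw.decode(declared_encoding)

def assemble(entities):
    """Join the decoded entities; after decoding, the storage encodings no longer matter."""
    return "".join(decode_entity(raw, enc) for raw, enc in entities)

# Hypothetical entities, stored in four different encodings.
entities = [
    ("日本語".encode("shift_jis"), "shift_jis"),    # document entity, Japan
    ("中文".encode("big5"), "big5"),                # external entity in Big5
    ("ไทย".encode("tis_620"), "tis_620"),           # external entity, Thailand
    ("café".encode("iso-8859-1"), "iso-8859-1"),    # external entity, Europe
]

document = assemble(entities)
```

The encoding of the document entity plays no part in how the other
entities are decoded; only each entity's own label does.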