- From: Rick Jelliffe <ricko@allette.com.au>
- Date: Mon, 16 Jun 1997 01:48:06 +1000
- To: <w3c-sgml-wg@w3.org>
I think this I18N issue has four questions, with my suggested answers. Q1) What encodings can an XML 1.0 document have? A1) Anything you (and MIME) like, provided it has the correct encoding PIs or whatever. Q2) What character set must an XML 1.0 document use? A2) Unicode 2.0, with all (future) surrogate characters represented by numeric character references. Surrogate characters cannot be used in markup. Q3) What character format should an XML 1.0 parser use? A3) An XML 1.0 parser should be able to read all XML 1.0 documents. This means that it must have 16-bit or more characters (at least in its reading and parsing routines). If we don't do this then there will be some XML documents that cannot be read by some XML parsers, which I think is a bit slack. (An 8-bit processor might still work, but it should be considered a partial XML processor, not a real or full one.) Q4) What character format should the parser use for character passing to the application? A4) None of our (XML WG's) business. It can be 8-bit, 22-bit, structs, anything. Rick Jelliffe
Received on Sunday, 15 June 1997 11:47:34 UTC