- From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
- Date: Mon, 13 Jan 97 18:35:11 CST
- To: W3C SGML Working Group <w3c-sgml-wg@www10.w3.org>
On Mon, 13 Jan 1997 13:12:24 -0500 Paul Grosso said: >Looking at the SGML decl in the appendix of the XML draft, I have >a few questions. ... >The decl therein seems to define a document baseset from 0 to 255 >and a syntax baseset from 646 (isn't that from 0 to 255?). > >Anyway, even though I cannot figure out document from syntax basesets >from charsets from descsets, it doesn't look like that SGML decl >allows characters above 255, so where does unicode come it? Am >I getting confused between encodings and character sets again or what? No, you're reading it right. There was an editorial error in the final preparation of the copy, and the wrong declaration got in. Actually, two declarations were prepared, one which referred to 10646 and defined a wide character set using ERCS syntax (or so I understand -- haven't seen it, myself), and another one which referred to ISO 646. The text originally contained the 10646 declaration, but during final preparation it became clear that the *other* declaration had to be used in processing. Between the declaration that worked, and another that did not work, there seemed to be no seriously debatable choice, and so the 646 declaration was inserted. What was overlooked in the heat of the moment (this was getting on toward the absolute deadline) was that the machine in question was an ASCII machine, not a Unicode machine; of course the 646 declaration was required for processing. But the canonical SGML declaration should use Unicode. I thought this was reasonably widely mentioned at SGML '96 and elsewhere, but perhaps it wasn't noted widely enough. Tim and I really need to get a revision up on the Web sites, with all the corrections people have been sending recently (keep'em coming!) -C. M. Sperberg-McQueen
Received on Monday, 13 January 1997 19:45:26 UTC