- From: Michael Sperberg-McQueen <U35395@UICVM.CC.UIC.EDU>
- Date: Tue, 24 Sep 96 11:41:47 CDT
- To: "W. Eliot Kimber" <kimber@passage.com>, W3C SGML Working Group <w3c-sgml-wg@w3.org>
On Tue, 24 Sep 1996 00:28:04 -0400 Eliot Kimber said:
>At 07:36 PM 9/23/96 CDT, Michael Sperberg-McQueen wrote:
>>* should XML prescribe the use of an ENTITY-END character as the
>>canonical method of handling entity boundaries, as a way of simplifying
>>exposition and implementation (6.2.2)?
>
>I'm not sure what this question means.  There is no ENTITY END
>*character* in SGML--it is a signal from the entity manager to the
>parser.

Clause 6.2.2 says in part:

  "A system can represent an Ee in any manner that will allow it to be
  distinguished from SGML characters.

  NOTE -- For example, an Ee could be represented by the bit
  combination of a non-SGML character, if any have been assigned."

Sorry for introducing confusion by using the term "an ENTITY-END
character" -- I meant, of course, "the bit pattern of a non-SGML
character" in the sense of the note.

If, for example, control-Z is declared a NONSGML character, it is easy
to describe the behavior of EE by saying the entity manager inserts a
control-Z at the end of an entity; this helps ensure that the parser
doesn't falsely recognize < as STAGO; the parser or someone downstream
eventually strips all control-Zs; implementations can do whatever they
like, as long as they behave *as if* this were what they did.

I think describing EE behavior in terms of an EE character is likely to
be significantly simpler than describing it in terms of a non-character
signal, particularly to programmers weaned on C's treatment of newline
and EOF.  It may also simplify implementation.

>>* should XML retain or relax SGML's prohibition on ENTITY attributes
>>referring to SGML text entities (7.9.4.3)?
>
>Retain.  SGML text entities have no meaningful existence except as
>fragments of SGML document strings, therefore it cannot make sense to
>refer to one from an entity attribute.

This logic eludes me completely.
The premise is false, since meaningful existence can be defined by an
application in its own terms; an application doesn't need our
permission to assign meaning to a text entity.  And even if the premise
were true, the conclusion doesn't follow.  I might wish to point to an
external entity which contains an alternative rendition text for the
element, that is, a fragment of an SGML document which can meaningfully
be substituted for the content of the element.  Where is the problem?

>>* if XML uses ISO 10646, should there be a special form of character
>>reference using hexadecimal, not decimal, numbers, since most references
>>to ISO 10646 and Unicode use hex, not decimal (9.5)?
>
>Such references would make processing of XML documents with SGML tools
>impossible without preprocessing.  It would probably be useful for SGML to
>allow hexadecimal numeric character references, though.
>
>>(So references to schwa could take a form like &u0259; or &x0259;, not
>>&#601;, which is rather error-prone, given that nothing in the Unicode
>>documentation gives decimal numbers for the character positions.)
>
>Don't you have a hex-to-decimal calculator? :-)

Yes, I do; that's how I know its use is error-prone.

Use of hex references requires no more preprocessing than we've already
decided on, namely preparing an appropriate prolog, which would in this
case contain at least

  <!ENTITY u0259 '&#601;'>
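
To illustrate how little preprocessing such a prolog would take, here is
a minimal sketch in Python that generates declarations of this kind.  It
assumes that each entity is named "u" plus the four-digit hex code point
and that its replacement text is the decimal numeric character reference
for the same position; the function names are hypothetical and exist only
for this illustration.

```python
def hex_entity_declaration(code_point: int) -> str:
    """Build one entity declaration for a code point.

    Assumed convention (not fixed by the discussion): the entity name is
    'u' followed by the code point as four or more uppercase hex digits,
    and the value is the decimal numeric character reference.
    """
    # The conversion from hex to decimal happens here, once, by machine,
    # rather than by hand with a calculator.
    return f"<!ENTITY u{code_point:04X} '&#{code_point};'>"


def build_prolog(code_points) -> str:
    """Return one declaration per distinct code point, in ascending order."""
    return "\n".join(hex_entity_declaration(cp) for cp in sorted(set(code_points)))


if __name__ == "__main__":
    # Schwa, U+0259.
    print(build_prolog([0x0259]))
```

A document author could then write &u0259; and leave the arithmetic to
the generated prolog.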
Received on Tuesday, 24 September 1996 12:58:26 UTC