Re: questions about entities and entity declarations

At 07:36 PM 9/23/96 CDT, Michael Sperberg-McQueen wrote:
>Questions about entity structure and entity declarations.
>
>* should XML require all entities to be synchronous with the document's
>logical structure?

While I firmly believe in SGML's complete separation of storage from
semantics (and thus, believe that there should be no required restriction
on the alignment of elements with entities), requiring alignment of
entities with elements in XML is probably a useful simplification that does
not introduce any incompatibilities with SGML (in the XML-to-SGML direction).
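
To make the alignment constraint concrete, here is a rough sketch of the check it implies: an entity is synchronous if every element opened in its replacement text is also closed there, and nothing is closed that was opened elsewhere. This is only an illustration under simplifying assumptions (explicit XML-style tags, no tag omission, no comments or marked sections); the function name is_synchronous is made up for the example.

```python
import re

# Matches start-tags, end-tags, and XML-style empty-element tags.
# Assumption: all tags are explicit (no SGML tag omission or short references).
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^<>]*?(/?)>")

def is_synchronous(replacement_text):
    """Return True if the entity text opens and closes elements in balance."""
    stack = []
    for match in TAG.finditer(replacement_text):
        closing, name, empty = match.groups()
        if empty:
            # Empty-element tag: opens and closes within the entity.
            continue
        if closing:
            # An end-tag must match the most recent start-tag in this entity.
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(name)
    # Any element still open would have to end in some other entity.
    return not stack

print(is_synchronous("<p>one</p><p>two</p>"))         # True
print(is_synchronous("</sec><sec><title>x</title>"))  # False
```

The second call is the case a synchrony rule would forbid: the entity ends one section and begins another, so its boundaries cut across the element structure.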

[...]

>  * whether XML should retain 'bracketed text' entities (STARTTAG,
>    ENDTAG, MS, MD) (10.5.4)

These are mostly useful for doing short reference maps (although they could
be useful for doing markup examples). I would have no objection to dropping
them.

>* should XML prescribe the use of an ENTITY-END character as the
>canonical method of handling entity boundaries, as a way of simplifying
>exposition and implementation (6.2.2)?

I'm not sure what this question means. There is no ENTITY END *character*
in SGML--it is a signal from the entity manager to the parser.

>* should XML retain or relax SGML's prohibition on ENTITY attributes
>referring to SGML text entities (7.9.4.3)?

Retain. SGML text entities have no meaningful existence except as
fragments of SGML document strings, so it cannot make sense to
refer to one from an entity attribute.

I would suggest that XML should further restrict NDATA, (external) CDATA,
and SUBDOC entity references to ENTITY attributes.

>* if XML makes DTDs optional and allows partial DTDs, what must or may a
>parser do when it encounters references to undeclared entities (9.4)?
>Should XML declare any set of entities automatically?

Hmmm. Good question.

>* if XML uses ISO 10646, should there be a special form of character
>reference using hexadecimal, not decimal, numbers, since most references
>to ISO 10646 and Unicode use hex, not decimal (9.5)?

Such references would make processing of XML documents with SGML tools
impossible without preprocessing.  It would probably be useful for SGML to
allow hexadecimal numeric character references, though.

>(So references to schwa could take a form like &u0259; or &x0259;, not
>&#601;, which is rather error-prone, given that nothing in the Unicode
>documentation gives decimal numbers for the character positions.)

Don't you have a hex-to-decimal calculator? :-)
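
More seriously, the preprocessing mentioned above would not need to be much. A minimal sketch, assuming the hypothetical &x0259; form from the question (no such syntax is defined anywhere) and rewriting it to the decimal numeric character reference that SGML tools already accept; hex_refs_to_decimal is an invented name:

```python
import re

# Assumed (hypothetical) hex reference syntax from the question: &xHHHH;
HEX_REF = re.compile(r"&x([0-9A-Fa-f]+);")

def hex_refs_to_decimal(text):
    """Rewrite &xHHHH; references as decimal &#NNN; character references."""
    return HEX_REF.sub(lambda m: "&#%d;" % int(m.group(1), 16), text)

print(hex_refs_to_decimal("schwa: &x0259;"))  # schwa: &#601;
```

Note that a filter like this cannot tell a hex reference from a reference to a general entity whose name happens to look like one (say, an entity named "xfeed"), which is one reason the &u.../&x... form would be awkward in practice.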

>* Should XML remove SGML's prohibition on ENTITY attributes for
>notations (11.4.1)?

Do you mean that XML should *allow* ENTITY attributes for notations (also,
IDREF(S) and NOTATION)?  I would say yes, because it only affects the DTD
and it's something that will almost certainly be in the revised SGML.

Cheers,

E.
--
W. Eliot Kimber (kimber@passage.com) 
Senior SGML Consultant and HyTime Specialist
Passage Systems, Inc., (512)339-1400
10596 N. Tantau Ave., Cupertino, CA 95014-3535
(408) 366-0300, (408) 366-0320 (fax)
2608 Pinewood Terrace, Austin, TX 78757 (512) 339-1400 (fone/fax)
http://www.passage.com (work) http://www.drmacro.com (home)
"If I never had existed, would you still remember me?..."
                                   --Austin Lounge Lizards, "1984 Blues"

Received on Tuesday, 24 September 1996 00:27:28 UTC