ERB decisions on A.17, B.9, and other questions

The SGML ERB met today, Saturday, Oct. 19, and voted on several items
already discussed by the SGML WG.  Participating:  Bosak, Clark, Kimber,
Maler, Paoli, Sharpe, Sperberg-McQueen.  Absent:  Bray (represented in
part by written votes on open issues), DeRose, Hollander, Magliery.  All
decisions were reached by consensus of those participating in the call
(seven of the eleven members), and thus carry a majority of the ERB.

I should note that the wording of the rationales given below reflects
the understanding, and is the responsibility, of the author.  The
rationales have not been reviewed or approved by the ERB; they are thus
subject to correction where I have misunderstood or misstated the ERB's
intention.

The ERB agreed on the following position statements:

  * We will maintain a list of topics to be addressed in
    future revisions of XML.
  * We will include some version of this list in the specification
    itself.

The rationale for the list and for its inclusion in the specification
is to allow some topics to be postponed until there is more time for
their resolution, and to inform users of XML of the expected lines
of development.

In the light of these agreements, the ERB reconfirmed its earlier
decision that XML 1.0 will not have SDATA entities.  It is thought that
most uses of SDATA entities are adequately served by character
references to Unicode characters (see example below).  Techniques for
dealing with non-Unicode characters, specification of glyphs rather than
characters, and related topics (such as possible mechanisms for document
private agreements governing the ISO 10646 Private Use Areas) will be
addressed in future revisions.

Instead of a declaration like

  <!ENTITY auml SDATA "[auml    ]">

any XML processor can work properly with a declaration of the form

  <!ENTITY auml "&#228;"> <!-- auml = a umlaut, U+00E4 -->
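As a rough check of the declaration above, the following sketch feeds it
to a present-day parser and inspects the expanded text.  The choice of
Python's xml.etree.ElementTree, the sample document, and the helper name
expand_root_text are assumptions made for illustration only; nothing
here was specified by the ERB.

```python
import xml.etree.ElementTree as ET

# Sample document (an assumption for illustration): the internal subset
# declares auml via a character reference instead of an SDATA entity.
DOC = """<!DOCTYPE doc [
  <!ENTITY auml "&#228;"> <!-- auml = a umlaut, U+00E4 -->
]>
<doc>M&auml;dchen</doc>"""


def expand_root_text(document: str) -> str:
    """Parse the document and return the text of its root element."""
    return ET.fromstring(document).text


text = expand_root_text(DOC)
```

After parsing, the reference to auml should have been replaced by the
single character U+00E4, with no SDATA handling involved.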

On question A.17 (Should XML have entities or not?), the ERB had
already decided that XML would have internal entities (either text
or CDATA, not both).  Today we decided further:

  * XML will have external NDATA entities.
  * XML will have internal text entities.

The rationale for allowing internal text entities was this:  CDATA
entities are very easy to implement (because they need not be
expanded at parse time, but can be expanded later without changing
the structure of the parse tree); text entities are more complex
(if they are synchronous, they may require the replacement of a leaf
node with an arbitrarily complex subtree; if they are asynchronous,
they *must* be expanded at parse time and complicate the parser).
Nevertheless, internal text entities are so useful to the user that
they justify the cost of implementation.
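The cost described above for synchronous text entities can be seen in a
small sketch.  It assumes a present-day parser (Python's
xml.etree.ElementTree) and a made-up entity named warn whose replacement
text contains balanced markup; both are illustrative assumptions, not
part of the ERB's decision.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the replacement text of "warn" is itself
# well-formed markup, so a single entity reference in <p> becomes a
# subtree (an <em> element containing a <b> element).
DOC = """<!DOCTYPE p [
  <!ENTITY warn "<em>use with <b>care</b></em>">
]>
<p>Note: &warn;</p>"""

root = ET.fromstring(DOC)

# Tags of every element below the root, in document order.
descendants = [el.tag for el in root.iter() if el is not root]
```

Here the one reference in the content of p yields two nested elements in
the parse tree, which is the replacement of a leaf by a subtree that the
rationale mentions.  A CDATA entity would instead have stayed a single
text leaf.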

Whether XML will have external text entities remains an open question.

On question B.9, the ERB decided:

  * In version 1.0, XML will not have public identifiers, only
    system identifiers.
  * In version 1.0, system identifiers will be URLs.
  * In version 1.0, URLs need not carry the FSI-style <url> label.

Whether system identifiers in XML 1.0 will be *allowed* to carry the
<url> label remains an open question.

Addition of public identifiers and extension of system identifiers
to other formats will be taken up in preparation of future versions
of XML.
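By way of illustration, an external NDATA entity that follows these
decisions would name its target by a system identifier alone, given as a
URL, with no public identifier.  The sketch below assumes SGML-style
declaration syntax as accepted by a present-day parser (Python's
xml.parsers.expat); the example.org URLs, the entity name logo, and the
helper collect_entity_decls are hypothetical.

```python
import xml.parsers.expat

# Hypothetical declarations: a notation and an external NDATA entity,
# each identified only by a system identifier in URL form.
DOC = b"""<!DOCTYPE doc [
  <!NOTATION gif SYSTEM "http://www.example.org/notations/gif">
  <!ENTITY logo SYSTEM "http://www.example.org/logo.gif" NDATA gif>
]>
<doc/>"""


def collect_entity_decls(data: bytes) -> dict:
    """Map each declared entity name to (system id, public id, notation)."""
    decls = {}
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation_name):
        decls[name] = (system_id, public_id, notation_name)

    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(data, True)
    return decls


decls = collect_entity_decls(DOC)
```

The parser reports the URL as the system identifier and reports no
public identifier for logo, which matches the first two points of the
decision.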

The rationale for these decisions was that URLs are well understood
and well established, and can handle both remote and local addresses.
Restricting external identifiers to URLs helps keep the specification
simple.  In the long run, however, public identifiers are desired by
many users and may provide solutions to the well-known fragility
problems associated with URLs.  Better infrastructure, in the form
of catalog management tools and HTTP-based catalog resolution
services, would help make the introduction of public identifiers into
XML smoother.

-C. M. Sperberg-McQueen

Received on Saturday, 19 October 1996 14:14:09 UTC