Re: ERB meeting, 30 October 1996

On Wed, 30 Oct 1996 23:41:40 -0500 <DAVEP@acm.org> said:
>I just read Lee's analysis of the fragility of the NET trickeries
>proposed for XML (message resent as <199610310415.XAA24860@www19.w3.org>).

More on that in another posting.

>All this trickery does indeed get very fragile.  There is a proposal
>that will probably be considered at the combined ISO/ANSI meeting
>the week before SGML '96 that proposes a neat solution ...
>
>Rather than slavishly trying to jam XML into the existing SGML syntax,
>which wasn't intended to cope with the things we are here trying to
>deal with, couldn't we design things like this in consultation with
>the SGML RG and have a sensible syntax for both XML and SGML/Revised?
>Then get busy and push/help the revision to get finalized.  We'll
>get a much better XML and probably a better SGML as well.

To the extent possible, that is what the ERB is trying to do (i.e.
that is why so many of the revision group are members of the SGML
WG); there are unavoidable constraints, however, imposed by the fact
that the ERB has a draft to deliver at SGML '96, and so we need to
make decisions for XML 1.0 now, not later.

The friendly informal exchange of views and of information about
where WG8 members believe the revision is headed has been, and will
be, very useful in the preparation of the XML spec; we hope that the
XML spec will be similarly useful in the SGML revision.

Concretely, with respect to the question of recognizing elements
declared EMPTY without the DTD, there have been two main types of
delimiter-tweaking proposals.  First, the proposal discussed a few
weeks back on comp.text.sgml, involving four delimiter roles:  STAGO,
ETAGO, STAGC, and ETAGC, with empty elements being marked by using
STAGO, ETAGC.  This would allow Dave's

  <x/>..<e>..</x>

as well as Arjun Ray's

  <x{...<e>...}x>

Second, there have been various proposals which involve marking the
tag-open or tag-close delimiter of EMPTY elements specially, which
would involve a different set of delimiter roles:  the existing
STAGO, ETAGO, and TAGC, supplemented by a tag-open delimiter for
empty elements, a tag-close delimiter for empty elements, or both.

Both the <e/> form adopted by the ERB yesterday, and the <@e> form
suggested by Lee Quin, are compatible with the second, but not with
the first, solution.

The easiest way for the SGML revision to ensure compatibility with
all the various proposals is for the revision to define six tag
delimiter roles, for the open and close delimiters of the three
different types of tags.  If we call these start-tags, end-tags,
and point-tags, we can name the roles

     STAGO  ETAGO  PTAGO
     STAGC  ETAGC  PTAGC

Then ISO 8879-86 has
     STAGO = PTAGO = '<'
     ETAGO = '</'
     STAGC = ETAGC = PTAGC = '>'

Dave Peterson's illustration has:
     STAGO = PTAGO = '<'
     ETAGO = '</'
     STAGC = '/>'
     ETAGC = PTAGC = '>'

Arjun Ray and Charles Goldfarb's illustration is the same, except
that it uses '{' and '}' in place of '/>' and '</'.

Lee Quin's proposal has:
     STAGO = '<'
     PTAGO = '<@'
     ETAGO = '</'
     STAGC = ETAGC = PTAGC = '>'

And XML, as currently defined, has
     STAGO = PTAGO = '<'
     ETAGO = '</'
     STAGC = ETAGC = '>'
     PTAGC = '/>'
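
To illustrate why the choice of delimiters matters for recognizing
EMPTY elements without the DTD, here is a minimal Python sketch.  It
is an illustration only, not part of any proposal:  the names DELIMS
and possible_kinds are hypothetical, and tags are simplified to a
bare alphanumeric name with no attributes.

```python
# Hypothetical sketch: which tag types could a string be, judged by
# its delimiters alone (no DTD)?  Tags are simplified to
# open-delimiter + name + close-delimiter, with no attributes.

DELIMS = {
    # (open, close) delimiter pair for each of the three tag types
    "ISO 8879": {"start": ("<", ">"), "end": ("</", ">"), "point": ("<", ">")},
    "XML":      {"start": ("<", ">"), "end": ("</", ">"), "point": ("<", "/>")},
}

def possible_kinds(tag, delims):
    """Return the set of tag types whose delimiters fit `tag`."""
    kinds = set()
    for kind, (opener, closer) in delims.items():
        if len(tag) <= len(opener) + len(closer):
            continue                    # no room left for a name
        if tag.startswith(opener) and tag.endswith(closer):
            name = tag[len(opener):len(tag) - len(closer)]
            if name.isalnum():          # simplified name check
                kinds.add(kind)
    return kinds

print(possible_kinds("<e>", DELIMS["ISO 8879"]))   # start or point: ambiguous
print(possible_kinds("<e/>", DELIMS["XML"]))       # point only
```

With the ISO 8879-86 delimiters, <e> fits both the start-tag and the
point-tag pattern, so only the DTD can settle which it is; with a
distinct PTAGC the delimiters alone decide.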

Conclusion:  three delimiter roles are not enough; each modification
so far proposed requires only four, but not the same four; six
delimiter roles suffice to express each modification so far proposed.
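
The counts in this conclusion can be checked mechanically.  The
following Python sketch encodes the five assignments listed above;
the names ROLES, PROPOSALS, groups, and sample are hypothetical, and
reading "requires four" as "uses four distinct delimiter strings" is
an assumption of the sketch.

```python
# Hypothetical sketch: encode each delimiter assignment listed above
# and count how many distinct delimiter strings it needs.

ROLES = ("STAGO", "ETAGO", "PTAGO", "STAGC", "ETAGC", "PTAGC")

def assign(stago, etago, ptago, stagc, etagc, ptagc):
    """Build a role -> delimiter-string mapping."""
    return dict(zip(ROLES, (stago, etago, ptago, stagc, etagc, ptagc)))

PROPOSALS = {
    "ISO 8879":     assign("<", "</", "<",  ">",  ">", ">"),
    "Peterson":     assign("<", "</", "<",  "/>", ">", ">"),
    "Ray/Goldfarb": assign("<", "}",  "<",  "{",  ">", ">"),
    "Quin":         assign("<", "</", "<@", ">",  ">", ">"),
    "XML":          assign("<", "</", "<",  ">",  ">", "/>"),
}

def groups(delims):
    """Partition the six roles by the delimiter string they share."""
    by_string = {}
    for role, string in delims.items():
        by_string.setdefault(string, set()).add(role)
    return frozenset(frozenset(g) for g in by_string.values())

def sample(delims, outer="x", empty="e"):
    """Render a start-tag, a point-tag, and an end-tag."""
    d = delims
    return (d["STAGO"] + outer + d["STAGC"]
            + ".." + d["PTAGO"] + empty + d["PTAGC"]
            + ".." + d["ETAGO"] + outer + d["ETAGC"])

for name, delims in PROPOSALS.items():
    print(f"{name:13} {len(groups(delims))} distinct  {sample(delims)}")
```

ISO 8879-86 comes out at three distinct delimiters and each of the
others at four; the two '/>'-style and '{'-style illustrations group
the roles identically, while the Quin and XML groupings each differ.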

With regard to the specific proposal Dave Peterson mentions:  while
much taken with the symmetry of the example given, the ERB is
concerned about the implications of requiring every start-tag in
existing SGML documents to be modified.  (It's bad enough requiring
all the EMPTY-element tags to be modified!)  So while we are happy
to contemplate the prospect of the SGML revision distinguishing six
delimiter roles instead of the current three, the ERB is more likely
to content itself with a proposal which is consistent both with a
six-delimiter version of SGML and with the existing SGML '86.

-C. M. Sperberg-McQueen

Received on Thursday, 31 October 1996 16:31:28 UTC