RE rules - summary of corrections

After some offline discussions with Charles Goldfarb and James Clark, I
have revised my attempt at a restatement of the RE rules of clause
7.6.1, and will re-post it, for the edification of the participants, in
a moment.  Mostly the changes make it clearer, I hope, but there was one
outright error, namely the claim that everything in an SGML document is
either markup or content -- since the start- and end-tags of subelements
are both markup and part of the content of the parent element, the
correct opposition is not markup and content, but markup and non-markup,
or markup, data, and separators.

In sum, the rules prescribed by 8879 are these.  RE is insignificant
(i.e. not passed to any downstream application, not part of the
XML grove plan) when it occurs in any of the following patterns:

  (a) start-tag nondata* RE
  (b) RE nondata* end-tag
  (c) RS nondata+ RE

where nondata is defined this way:

  nondata ::= comment declaration
             | processing instruction
             | character reference
             | entity reference
             | entity-end
             | marked section declaration
             | included subelement
             | short reference
             | shortref use declaration
             | link set use declaration
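
As a rough illustration only, the three patterns can be checked
mechanically over a stream of tokens.  The sketch below is in Python and
rests on assumptions that are mine, not 8879's: the token kinds (STAG,
ETAG, RS, RE, DATA, NONDATA) and the function name insignificant_res are
hypothetical, every kind of nondata is collapsed into the single token
NONDATA, and an RS that starts the next line is skipped when looking
ahead for an end-tag, since the patterns mention RS only in the third
case.

```python
# Hypothetical token kinds for this sketch; they are not names from ISO 8879.
STAG, ETAG, RS, RE, DATA, NONDATA = "STAG", "ETAG", "RS", "RE", "DATA", "NONDATA"


def insignificant_res(tokens):
    """Return the set of indices of RE tokens matched by one of the three patterns."""
    ignored = set()
    for i, kind in enumerate(tokens):
        if kind != RE:
            continue

        # Scan left over any run of nondata and remember how long it was.
        j = i - 1
        while j >= 0 and tokens[j] == NONDATA:
            j -= 1
        nondata_count = i - 1 - j
        left = tokens[j] if j >= 0 else None

        # Scan right over nondata; RS is skipped too (an assumption of this sketch).
        k = i + 1
        while k < len(tokens) and tokens[k] in (NONDATA, RS):
            k += 1
        right = tokens[k] if k < len(tokens) else None

        if left == STAG:                         # start-tag nondata* RE
            ignored.add(i)
        elif left == RS and nondata_count >= 1:  # RS nondata+ RE
            ignored.add(i)
        if right == ETAG:                        # RE nondata* end-tag
            ignored.add(i)
    return ignored


# <q> RE / RS data RE / RS </q>
simple = [STAG, RE, RS, DATA, RE, RS, ETAG]
print(sorted(insignificant_res(simple)))  # prints [1, 4]
```

Note that an empty line (RS followed directly by RE) is not matched by
the third pattern in this sketch, because nondata+ requires at least one
item between the RS and the RE.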

The element Q contains no REs in any of the following cases:

  <q>
  Listen to my heart beat.
  </q>

This is the simple case:  RE adjacent to a start-tag or end-tag.

  <q>
  <!-- sound track is silent -->
  Listen to my heart beat <!-- --
  ><?DIRECTOR begin: audio>
  and beat and beat and beat.
  </q>

Here rule (a) takes care of the RE on line 1, rule (c) of the one on
line 2, the comment itself of the one on line 3 (which falls inside the
comment declaration), rule (c) again of line 4, and rule (b) of line 5.

  <q><!-- sound track is silent -->
  Listen to my heart beat.
  </q>

This is the one case I can think of where the first RE is not
actually adjacent to the start-tag.

-C. M. Sperberg-McQueen

Received on Tuesday, 24 September 1996 19:27:13 UTC