The RE rules in 14 lines

James Clark and I have prepared this concise definitive specification
of the rule for determining insignificant REs in data, with both XML
and SGML variants. It is based on Michael Sperberg-McQueen's clever
"nondata" formalism, which replaces a great deal of confusing text in
8879. I intend to propose this to WG8 at the November meeting.
(Note that for XML, the rule is 14 lines long, 9 of them formal.)


For XML and SGML:

An RE in data is insignificant (i.e. not passed to an application,
which is to say, not part of the grove) when it occurs in any of the
following patterns:

  start-tag  nondata*  RE
  RE         nondata*  end-tag
  RS         nondata+  RE

In applying this rule, a reference is transparent; only its
replacement is considered.

For XML only:

  nondata ::=
               comment declaration
             | processing instruction

  reference ::=
               character reference
             | entity reference


For SGML only:

  nondata ::=
               comment declaration
             | processing instruction
             | marked section declaration start
             | marked section end
             | included subelement
             | shortref use declaration
             | link set use declaration

  reference ::=
               character reference
             | entity reference
             | short reference

  marked section declaration start ::=
               marked section start
             , status keyword specification
             , dso

The rule is applied recursively to the data of included subelements.

--
Charles F. Goldfarb * Information Management Consulting * +1(408)867-5553
           13075 Paramount Drive * Saratoga CA 95070 * USA
  International Standards Editor * ISO 8879 SGML * ISO/IEC 10744 HyTime
 Prentice-Hall Series Editor * CFG Series on Open Information Management
--

Received on Thursday, 26 September 1996 15:52:30 UTC