W3C home > Mailing lists > Public > w3c-sgml-wg@w3.org > September 1996

The RE rules in 14 lines

From: Charles F. Goldfarb <Charles@SGMLsource.com>
Date: Thu, 26 Sep 1996 19:54:25 GMT
To: Michael Sperberg-McQueen <U35395@UICVM.CC.UIC.EDU>
Cc: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Message-ID: <324a7091.15113107@mail.alink.net>
James Clark and I have prepared this concise definitive specification
of the rule for determining insignificant REs in data, with both XML
and SGML variants. It is based on Michael Sperberg-McQueen's clever
"nondata" formalism, which replaces a great deal of confusing text in
8879. I intend to propose this to WG8 at the November meeting.
(Note that for XML, the rule is 14 lines long, 9 of them formal.)


For XML and SGML:

An RE in data is insignificant (i.e. not passed to an application,
which is to say, not part of the grove) when it occurs in any of the
following patterns:

  start-tag  nondata*  RE
  RE         nondata*  end-tag
  RS         nondata+  RE

In applying this rule, a reference is transparent; only its
replacement is considered.

For XML only:

  nondata ::=
               comment declaration
             | processing instruction

  reference ::=
               character reference
             | entity reference


For SGML only:

  nondata ::=
               comment declaration
             | processing instruction
             | marked section declaration start
             | marked section end
             | included subelement
             | shortref use declaration
             | link set use declaration

  reference ::=
               character reference
             | entity reference
             | short reference

  marked section declaration start ::=
               marked section start
             , status keyword specification
             , dso

The rule is applied recursively to the data of included subelements.

--
Charles F. Goldfarb * Information Management Consulting * +1(408)867-5553
           13075 Paramount Drive * Saratoga CA 95070 * USA
  International Standards Editor * ISO 8879 SGML * ISO/IEC 10744 HyTime
 Prentice-Hall Series Editor * CFG Series on Open Information Management
--
Received on Thursday, 26 September 1996 15:52:30 EDT

This archive was generated by hypermail pre-2.1.9 : Wednesday, 24 September 2003 10:03:23 EDT