RE: SD1 - Short End Tags from Arjun Ray on 1997-05-17 (w3c-sgml-wg@w3.org from May 1997)

From: Arjun Ray <aray@q2.net>
Date: Sat, 17 May 1997 02:04:29 -0400 (EDT)
To: w3c-sgml-wg@w3.org
Message-ID: <Pine.LNX.3.95.970517014258.14234B-100000@q2.net>

On Fri, 16 May 1997, Andrew Layman wrote:

> Arjun Ray wrote, regarding the proposal for short end tags:
> 
> >Yup. Allowing both forms actually opens the door for OMITTAG to be
> >reconsidered also...
> 
> There is an important difference: The element name in an end tag is
> completely redundant; that is, the structure of a document can be
> discovered without the element name and without reference to any other
> documents. An omitted end tag requires access to the schema. If end tags
> per se were omitted, it would be impossible to parse the document
> without a schema.

Not quite. That is, the parse is not impossible, it is merely ambiguous
under the usual interpretation of SGML rules sans DTD, e.g. in

   <foo>123<bar>456<baz>789</foo>

there are two places at which an omitted </bar> could be infered. However,
this can also be read as the lack of a sufficiently strong rule to begin
with. In fact, for an XML parser, the issue of where to put the </bar>
doesn't arise *until* the </foo> is encountered -- at which point a WF
violation is detected under the current specJ. But if </foo> were
interpreted as a Super Right Parenthesis forcing the close of all
"revealed" open subelements, then the implied normalization

  <foo>123<bar>456<baz>789</baz></bar></foo>

would be a disambiguation *by rule*, not to mention doing away with the
redundancy you mention via a *syntactic* reason for endtags to have GIs.

The real problem here, however, is not the syntactic (in)tractability of
such a super-right-paren rule, but the ramifications of OMITTAG as per
SGML. 

Arjun

Received on Saturday, 17 May 1997 01:59:42 UTC