RE: Error handling: yes, I did mean it

At 8:29 PM 4/29/97, Jean Paoli wrote:

>Second possibility:
>If there is a strong agreement that the XML syntax is too rigid, let us
>change
>the XML syntax. This is what I understand when I hear people complaining
>about things like:
>"<a><b>xxxx </a>"
>by saying " this is obvious that it means <a><b>xxx</b></a>"
>So technically, this is not error recovery.
>
>For example, we can state that if:
>1/ A tag is not closed
>   <a> <b> xxxx </a>
>   <b> is automatically closed before <a> is closed
>2/ A tag is not closed and we hit EOF
>   <a> <b> xxxx </b>EOF
>  <a> is automatically closed
>3/ Extra end tags
>    <a><b> xxxx </b> </c> </a>
>   </c> are skipped
>4/ Etc etc .....  you see the problem here. It seems to me that it is
>very difficult to propose
>easy rules. But I am open to any suggestion.

It is indeed "very difficult to propose easy rules", especially if the
rules are to be useful.  You rapidly come up with SGML's OMITTAG or
something morally equivalent thereto.
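
To make the quoted rules 1/ to 3/ concrete, here is a minimal sketch of
them as a stack-based repair pass.  It is only an illustration under
stated assumptions: the function name repair and the toy tokenizer
(bare tags, no attributes, comments or entities) are invented for this
example and come from no real parser.

```python
import re

# Toy tokenizer (an assumption for this sketch): bare start/end tags or text.
TOKEN = re.compile(r"<(/?)([A-Za-z][\w.-]*)\s*>|([^<]+)")

def repair(text):
    """Apply the three quoted rules to a string of simple tags and text."""
    out = []      # pieces of the repaired output
    stack = []    # names of the elements currently open
    for m in TOKEN.finditer(text):
        closing, name, data = m.group(1), m.group(2), m.group(3)
        if data is not None:
            out.append(data)
        elif not closing:
            stack.append(name)
            out.append(f"<{name}>")
        elif name in stack:
            # Rule 1: close every element opened inside the one being closed.
            while stack[-1] != name:
                out.append(f"</{stack.pop()}>")
            stack.pop()
            out.append(f"</{name}>")
        # Rule 3: an end tag that matches no open element is skipped.
    # Rule 2: at end of input, close whatever is still open.
    while stack:
        out.append(f"</{stack.pop()}>")
    return "".join(out)

print(repair("<a><b>xxxx </a>"))   # <a><b>xxxx </b></a>
```

Note that nothing here consults a DTD, so the pass has no way to tell
whether closing <b> just before </a> is what the author intended; it
simply guesses, which is the difficulty under discussion.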

While not described this way, OMITTAG actually functions as rules of
the form:  "If you hit a tag or data character that is not permitted
by the content model of the element you are in, here is the sequence
of things to try to recover, before giving up and declaring an error."
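
A rule of that form can be sketched roughly as follows.  This is a
simplified illustration, not SGML's actual OMITTAG algorithm: it only
implies end tags, whereas real OMITTAG can also imply start tags and is
enabled per element in the DTD.  The MODEL table and the open_child
function are hypothetical names made up for the example.

```python
# Hypothetical content models: element name -> set of permitted children.
# "#PCDATA" stands for character data.  Not taken from any real DTD.
MODEL = {
    "list": {"item"},
    "item": {"para"},
    "para": {"#PCDATA"},
}

def open_child(stack, child, model=MODEL):
    """Try to fit `child` into the open elements, implying end tags as needed.

    Returns the names of the elements that were implicitly closed, innermost
    first.  Raises ValueError, leaving `stack` untouched, when no open
    element permits `child`.
    """
    trial = list(stack)
    closed = []
    while trial:
        if child in model.get(trial[-1], set()):
            stack[:] = trial              # commit the implied end tags
            if child != "#PCDATA":
                stack.append(child)       # the new element is now open
            return closed
        closed.append(trial.pop())        # recovery step: imply an end tag
    raise ValueError(f"{child!r} is not permitted in any open element")

open_elements = ["list", "item", "para"]
print(open_child(open_elements, "item"))  # ['para', 'item']
print(open_elements)                      # ['list', 'item']
```

The recovery sequence is driven entirely by the content models: the
sketch gives up and reports an error only after every open element has
been tried.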

OMITTAG is arguably just another form of legitimized error recovery.
And indeed, I've seen SGML parsers whose error recovery seems to be
just that:  Apply the OMITTAG rules even when the DTD says NO.

Dave Peterson
SGMLWorks!

davep@acm.org

Received on Wednesday, 30 April 1997 10:53:07 UTC