- From: Dave Peterson <davep@acm.org>
- Date: Wed, 30 Apr 1997 10:52:15 -0400
- To: <w3c-sgml-wg@w3.org>
At 8:29 PM 4/29/97, Jean Paoli wrote:

>Second possibility:
>If there is a strong agreement that the XML syntax is too rigid, let us
>change the XML syntax. This is what I understand when I hear people
>complaining about things like:
>"<a><b>xxxx </a>"
>by saying "this is obvious that it means <a><b>xxx</b></a>"
>So technically, this is not error recovery.
>
>For example, we can state that if:
>1/ A tag is not closed
>   <a> <b> xxxx </a>
>   <b> is automatically closed before <a> is closed
>2/ A tag is not closed and we hit EOF
>   <a> <b> xxxx </b>EOF
>   <a> is automatically closed
>3/ Extra end tags
>   <a><b> xxxx </b> </c> </a>
>   </c> are skipped
>4/ Etc etc ..... you see the problem here. It seems to me that it is
>very difficult to propose easy rules. But I am open to any suggestion.

It is indeed "very difficult to propose easy rules", especially if the
rules are to be useful. You rapidly come up with SGML's OMITTAG or
something morally equivalent thereto.

While not described this way, OMITTAG actually functions as rules of the
form: "If you hit a tag or data character that is not permitted by the
content model of the element you are in, here is the sequence of things
to try to recover, before giving up and declaring an error."

OMITTAG is arguably just another form of legitimized error recovery. And
indeed, I've seen SGML parsers whose error recovery seems to be just
that: apply the OMITTAG rules even when the DTD says NO.

Dave Peterson
SGMLWorks!

davep@acm.org
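
For concreteness, the three quoted rules can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function name `repair` and the toy syntax (bare start and end tags, no attributes, no DTD) are made up here, and this is not OMITTAG, which consults content models before inferring anything.

```python
import re

# Toy tokenizer (an assumption for this sketch): bare <name> and </name>
# tags with no attributes, plus runs of character data.
TOKEN = re.compile(r"<(/?)([A-Za-z][\w.-]*)>|([^<]+)")

def repair(text):
    """Apply the three quoted rules and return the repaired markup."""
    out = []     # pieces of the repaired output
    stack = []   # names of the elements currently open
    for m in TOKEN.finditer(text):
        closing, name, data = m.group(1), m.group(2), m.group(3)
        if data is not None:
            out.append(data)
        elif not closing:
            stack.append(name)
            out.append(f"<{name}>")
        elif name in stack:
            # Rule 1: close any unclosed inner elements before this one.
            while stack[-1] != name:
                out.append(f"</{stack.pop()}>")
            stack.pop()
            out.append(f"</{name}>")
        # Rule 3: an end tag matching no open element is skipped.
    # Rule 2: at EOF, close whatever is still open, innermost first.
    while stack:
        out.append(f"</{stack.pop()}>")
    return "".join(out)

print(repair("<a><b>xxxx </a>"))
```

Even this toy version has to make choices the rules do not state, for instance that an end tag closes the nearest open element of that name. Once a content model enters the picture, those choices multiply, which is the difficulty described above.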
Received on Wednesday, 30 April 1997 10:53:07 UTC