RE: Error handling: yes, I did mean it

> From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
> 
> ISO 8879 does require a left-to-right parse.  XML does not explicitly
> require LR parsing directly, and I'm not aware of anything in the
> spec which either (a) requires LR parsing to work right or (b)
> becomes nonsensical if parsing is non-directional.  If I'm right,
> it would be possible (though pointless, I think) to use a
> non-directional or right-to-left parsing algorithm on XML.
> 
> ><foo>
> ><bar>coasters, beer mugs, ashtrays</bar>
> ><bear>fur, growl</bear>
> ><porridge>oats, cream, brown sugar</porridge>
> ></foo>
> >
> >would it not be possible to construct a grove starting from
> ><bear> and exploring its context?  or by parsing <porridge>,
> >then <bear>, then <bar>?  or is it required that one start
> >with <foo>?  If not, the requirement that the parser
> >die at the first error does not ensure that various parsers
> >will die after having sprouted identical groves, which I take
> >to be the intended functional effect of the Draconian stance.
> 
> It is indeed possible to construct a parse starting at some
> point other than the left-most edge of the data stream.

It would be feasible to write a parser that scans left to right
for the tags (checking the element names only) and builds a tree
which it then processes in a second pass in a top-down recursive
(or potentially other-ordered) manner.  Only during this second
pass would errors within data content, entities, or attribute
specifications be discovered, and the order in which they are
discovered need not be (left-to-right) document order.
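
To make that concrete, here is a rough Python sketch of such a
two-pass processor.  Everything in it is my own illustration, not
anything the spec defines: the names (build_tree, second_pass), the
toy tag syntax (no attributes, comments, or CDATA), and the single
content check (an "&" that does not start an entity or character
reference) are all assumptions chosen to keep the example short.

```python
import re

# Toy tag syntax: <name> and </name> only (an assumption for brevity).
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)>")
# Stand-in for "an error within data content": an '&' that does not
# begin an entity reference or a character reference.
BAD_AMP = re.compile(r"&(?![A-Za-z][\w.-]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)")


class Node:
    def __init__(self, name, open_at, start):
        self.name = name
        self.open_at = open_at    # offset of the start tag
        self.start = start        # offset where the content begins
        self.end = None           # offset where the content ends
        self.close_end = None     # offset just past the end tag
        self.children = []


def build_tree(text):
    """Pass 1: left-to-right scan that looks at element names only."""
    root = Node("#document", 0, 0)
    stack = [root]
    for m in TAG.finditer(text):
        closing, name = m.group(1), m.group(2)
        if closing:
            if len(stack) == 1 or stack[-1].name != name:
                raise ValueError(f"mismatched </{name}> at offset {m.start()}")
            node = stack.pop()
            node.end, node.close_end = m.start(), m.end()
        else:
            node = Node(name, m.start(), m.end())
            stack[-1].children.append(node)
            stack.append(node)
    if len(stack) != 1:
        raise ValueError(f"unclosed element <{stack[-1].name}>")
    root.end = root.close_end = len(text)
    return root


def own_text_spans(node):
    """Yield the (lo, hi) offsets of the node's own character data."""
    pos = node.start
    for child in node.children:
        yield pos, child.open_at
        pos = child.close_end
    yield pos, node.end


def second_pass(text, node, order=reversed):
    """Pass 2: top-down, visiting children in whatever order `order` gives."""
    found = []
    for lo, hi in own_text_spans(node):
        found.extend((node.name, m.start()) for m in BAD_AMP.finditer(text, lo, hi))
    for child in order(node.children):
        found.extend(second_pass(text, child, order))
    return found


DOC = ("<foo><bar>coasters & mugs</bar><bear>fur, growl</bear>"
       "<porridge>oats & cream</porridge></foo>")
tree = build_tree(DOC)
print([name for name, _ in second_pass(DOC, tree)])              # ['porridge', 'bar']
print([name for name, _ in second_pass(DOC, tree, order=list)])  # ['bar', 'porridge']
```

Both runs accept the same tree from the first pass; they differ only
in which content error they would meet (and halt on) first.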

If, in fact, the processor knows it is reading this document entity
only to link to the first bar element in the fifth foo in the second
chapter of this document, then the second pass only needs to look at
this subtree (until, say, the user pages or links to another part of
the document).  In this case, an error in the first chapter (other
than one that is caught during the initial tree-formation pass) may
never be discovered.
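
A second small sketch, again only an illustration under my own
assumptions: the tree below stands in for the output of the initial
tree-formation pass and is written out by hand, and Elem, nth_child,
check_subtree, and resolve are hypothetical names.  The content check
is the same kind of stand-in (a stray "&"), not a real
well-formedness test.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Elem:
    name: str
    text: str = ""                      # raw character data, not yet checked
    children: list = field(default_factory=list)
    checked: bool = False               # set once the second pass has seen it


def nth_child(elem, name, n):
    """Return the n-th (1-based) child of elem whose name matches."""
    return [c for c in elem.children if c.name == name][n - 1]


def check_subtree(elem):
    """Second pass over one subtree; raises at the first error it meets."""
    if re.search(r"&(?!\w+;)", elem.text):
        raise ValueError(f"unescaped '&' in <{elem.name}>")
    elem.checked = True
    for child in elem.children:
        check_subtree(child)


def resolve(root, path):
    """Walk to the link target by name and position, then check only it."""
    target = root
    for name, n in path:
        target = nth_child(target, name, n)
    check_subtree(target)
    return target


# Chapter 1 contains a content error; chapter 2 is clean.
chapter1 = Elem("chapter", children=[Elem("foo", children=[Elem("bar", "AT&T")])])
chapter2 = Elem("chapter", children=[
    Elem("foo", children=[Elem("bar", f"item {i}")]) for i in range(1, 6)
])
doc = Elem("doc", children=[chapter1, chapter2])

# "The first bar element in the fifth foo in the second chapter."
target = resolve(doc, [("chapter", 2), ("foo", 5), ("bar", 1)])
print(target.text, target.checked, chapter1.checked)
```

The link resolves without complaint, and the error in the first
chapter surfaces only if something later calls
check_subtree(chapter1).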

paul

Received on Thursday, 1 May 1997 11:55:08 UTC