- From: Michael Sokolov <sokolov@falutin.net>
- Date: Mon, 17 Dec 2012 09:35:39 -0500
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: James Clark <jjc@jclark.com>, public-microxml@w3.org
John - for those of us not fully steeped in the mysteries of tag soup, would you mind providing an example where the ReStartable property is useful? It seems as if it would allow recovery when there is an intrusion of an otherwise-unacceptable child element - is this for HTML P, eg?

-Mike

On 12/17/2012 9:24 AM, John Cowan wrote:
> James Clark scripsit:
>
>> Comments are welcome.
> I should like to propose that the section "Start- and end-tag matching"
> be replaced with the following more complicated mechanism.  It is a
> stripped-down version of TagSoup's algorithm, and will take advantage of
> element relationships and properties derived from schemas or elsewhere,
> if they are available.  These relationships and properties are given
> BiCapitalized names here.  By default there are no relationships and
> no properties.
>
> 1) The start-tags and end-tags are given a single scan in document
> order, inserting and deleting as we go in the following ways.  A stack
> is maintained of currently open elements, and a queue is maintained of
> elements not currently open that are to be opened as soon as possible.
>
> 2) When the start-tag of an element that is a PossibleChild of the
> currently open element is seen, the element is pushed on the stack.
> Whenever the queue is non-empty, and the front element is a PossibleChild
> of the newly opened element, the front element is removed from the queue
> and a start-tag is generated for it.  This is iterated until the queue
> is empty or the front element is not a PossibleChild.
>
> 3) When the start-tag of an element that is not a PossibleChild of the
> currently open element is seen, an end-tag for the current element is
> inserted and it is removed from the stack.  This is done recursively
> until the start-tag is a PossibleChild, or all elements except the root
> element have been closed.  If an element being closed has the ReStartable
> property, its start-tag with all attributes is pushed on the front of
> the queue.  Then the element is pushed on the stack.
>
> 4) However, when the start-tag of an element that is not a PossibleChild
> of *any* currently open element is seen, then if the element has a
> PreferredParent, a start-tag for that element with no attributes is
> pushed on the stack.  This is done recursively until an element without
> a PreferredParent is found.  Then the element is pushed on the stack.
>
> 5) An end-tag with no corresponding open start-tag is deleted with no
> effect on the stack or queue.
>
> 6) An end-tag with a corresponding open start-tag inserts end-tags to
> close all currently open elements, removing them from the stack, until and
> including the corresponding start-tag.  However, if any generated end-tags
> are for elements that have the ReStartable property, those elements with
> all their attributes are pushed onto the front of the queue as well.
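
To make the question concrete, here is a minimal Python sketch of one possible reading of steps 1-6 above. It is not TagSoup's code and not part of the proposal. The names `match_tags` and `render`, the token tuples, and the way the relationships are passed in (`possible_child` as a dict of sets, `restartable` as a set, `preferred_parent` as a dict) are all made up for illustration. It also assumes three things the text does not spell out: a parent with no PossibleChild entry accepts any child, step 4 feeds the synthesized parent start-tag back through the same start-tag handling, and an element closed by its own end-tag is not re-queued. End-of-input handling is left out because the proposal does not describe it.

```python
from collections import deque

def match_tags(tokens, possible_child=None, restartable=(), preferred_parent=None):
    """Rewrite a stream of ("start", name, attrs) / ("end", name) tokens."""
    possible_child = possible_child or {}
    preferred_parent = preferred_parent or {}
    stack = []       # currently open elements as (name, attrs)
    queue = deque()  # closed ReStartable elements waiting to be reopened
    out = []

    def allowed(parent, child):
        # Assumption: a parent with no entry accepts any child.
        return parent not in possible_child or child in possible_child[parent]

    def open_element(name, attrs):
        stack.append((name, attrs))
        out.append(("start", name, attrs))

    def close_top(requeue):
        name, attrs = stack.pop()
        out.append(("end", name))
        if requeue and name in restartable:
            queue.appendleft((name, attrs))

    def start(name, attrs, seen=()):
        # Step 4: not a PossibleChild of any open element, so open the
        # PreferredParent first (no attributes); `seen` guards against cycles.
        if (name in preferred_parent and name not in seen
                and not any(allowed(p, name) for p, _ in stack)):
            start(preferred_parent[name], {}, seen + (name,))
        # Step 3: close elements until the new one fits or only the root is left.
        while len(stack) > 1 and not allowed(stack[-1][0], name):
            close_top(requeue=True)
        open_element(name, attrs)
        # Step 2: reopen queued elements while the front one fits.
        while queue and allowed(stack[-1][0], queue[0][0]):
            open_element(*queue.popleft())

    for tok in tokens:
        if tok[0] == "start":
            start(tok[1], tok[2])
        elif tok[0] == "end":
            if tok[1] not in [n for n, _ in stack]:
                continue  # step 5: stray end-tag is dropped
            while stack[-1][0] != tok[1]:
                close_top(requeue=True)   # step 6: generated end-tags
            close_top(requeue=False)      # the element actually being closed
        else:
            out.append(tok)  # anything else (e.g. text) passes through
    return out

def render(events):
    """Serialize events back to markup, for eyeballing results only."""
    parts = []
    for ev in events:
        if ev[0] == "start":
            attrs = "".join(f' {k}="{v}"' for k, v in ev[2].items())
            parts.append(f"<{ev[1]}{attrs}>")
        elif ev[0] == "end":
            parts.append(f"</{ev[1]}>")
        else:
            parts.append(ev[1])
    return "".join(parts)
```

Under those assumptions, with `body` accepting only `p`, `p` accepting only `b`, `b` accepting nothing, and `b` marked ReStartable, the input `<body><p><b class="x">one<p>two</b></p></body>` comes out as `<body><p><b class="x">one</b></p><p><b class="x">two</b></p></body>`: the second `p` forces `b` and the first `p` closed, and `b` is reopened with its attributes inside the new `p`.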
Received on Monday, 17 December 2012 14:36:39 UTC