- From: Elliotte Rusty Harold <elharo@ibiblio.org>
- Date: Tue, 2 Mar 2010 05:44:55 -0500
The handling of processing instructions in the XHTML syntax seems reasonably well-defined, but it feels a little off in the HTML syntax. Briefly, it seems that <? causes the parser to go into the Bogus comment state, which is fair enough. (I wouldn't really recommend that anyone use processing instructions in HTML syntax anyway.) However, the parser comes out of that state at the first >. Because processing instructions can contain > and terminate only at the two-character sequence ?>, this could cause PI processing to terminate early, leaving a lot more error handling and a confused parser state for the text yet to come.

It might be wise to add a separate processing instruction state that would consume all characters up to the first occurrence of ?> instead of reusing the Bogus comment state. The parser could still emit a comment token containing the processing instruction text.

The goal here is not to enable processing instructions in the HTML syntax. It's simply an effort to ensure that if one does slip in by mistake, we more accurately detect what the author or generator likely intended as the end of the processing instruction.

--
Elliotte Rusty Harold
elharo at ibiblio.org
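
To make the difference concrete, here is a minimal Python sketch of the two termination rules discussed above. It is not a tokenizer and is not taken from any specification text; the function names (`bogus_comment_end`, `pi_end`, `split_at_pi`) and the sample input are hypothetical, and falling back to end-of-input when no terminator is found is an assumption made only for illustration.

```python
def bogus_comment_end(text, start):
    """Rule as described for the Bogus comment state: stop at the first '>'."""
    end = text.find(">", start)
    # Assumption: with no '>' at all, everything up to end of input is consumed.
    return len(text) if end == -1 else end + 1


def pi_end(text, start):
    """Suggested rule: stop only at the first two-character sequence '?>'."""
    end = text.find("?>", start)
    # Assumption: with no '?>' at all, everything up to end of input is consumed.
    return len(text) if end == -1 else end + 2


def split_at_pi(text, find_end):
    """Return (consumed token text, remaining input) for the first '<?'."""
    start = text.index("<?")
    end = find_end(text, start + 2)
    return text[start:end], text[end:]


# Hypothetical input: a processing instruction that itself contains '>'.
sample = '<?php if ($a > $b) echo "x"; ?><p>body</p>'

current = split_at_pi(sample, bogus_comment_end)
proposed = split_at_pi(sample, pi_end)

print(current)
print(proposed)
```

With the first rule, the token ends inside the instruction and the remainder of the instruction is left in the input to be parsed as ordinary text. With the second rule, the whole instruction is consumed and parsing resumes at `<p>`.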
Received on Tuesday, 2 March 2010 02:44:55 UTC