
[Bug 9985] New: [parser] How to parse </foo </bar>

From: <bugzilla@jessica.w3.org>
Date: Tue, 22 Jun 2010 21:37:58 +0000
To: public-html-bugzilla@w3.org
Message-ID: <bug-9985-2486@http.www.w3.org/Bugs/Public/>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=9985

           Summary: [parser] How to parse </foo </bar>
           Product: HTML WG
           Version: unspecified
          Platform: All
               URL: http://www.macruby.org/
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: P1
         Component: HTML5 spec (editor: Ian Hickson)
        AssignedTo: ian@hixie.ch
        ReportedBy: w3c@adambarth.com
         QAContact: public-html-bugzilla@w3.org
                CC: hsivonen@iki.fi, mike@w3.org, public-html@w3.org


WebKit received a bug report [1] about a layout problem on
http://www.macruby.org/ due to the HTML5 parsing algorithm.  (You can
visit the site in a Firefox or WebKit nightly build to see the issue.)
The trouble boils down to this reduction:

Should say PASS:
<div>
 <div style="visibility:hidden">
   <p></p
 </div>
 PASS
</div>

Essentially, the missing ">" on the close tag of the <p> element
causes the tokenizer to consume the </div> characters as well,
resulting in the wrong DOM.  According to my tests, both the legacy
WebKit parser and the legacy Firefox parser terminate a tag token upon
encountering a "<" character.  The HTML5 spec recognizes that case as
a parse error too, but recovers differently: the tokenizer keeps
consuming characters as part of the same tag until it reaches a ">".
(This issue is on our "top five" list of behavioral differences likely
to cause compatibility problems.)
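
To make the difference concrete, here is a rough sketch of the two
recovery strategies applied to the reduction above.  It is not the
spec's state machine and not either browser's code: scan_end_tag and
its stop_at_lt flag are hypothetical names invented for illustration,
and attribute handling is collapsed into "skip to the end of the tag".

```python
def scan_end_tag(html, start, stop_at_lt):
    """Scan an end tag beginning with "</" at html[start].

    Returns (tag_name, next_index), where next_index is the position at
    which tokenizing would resume.  Hypothetical helper, heavily
    simplified: everything after the tag name is skipped rather than
    parsed as attributes.
    """
    assert html.startswith("</", start)
    i = start + 2
    name_start = i
    # The tag name ends at whitespace, "/", ">", or (legacy mode) "<".
    while i < len(html):
        ch = html[i]
        if ch.isspace() or ch in "/>" or (stop_at_lt and ch == "<"):
            break
        i += 1
    name = html[name_start:i].lower()
    # Skip the remainder of the tag.
    while i < len(html):
        ch = html[i]
        if ch == ">":
            return name, i + 1      # normal end of the tag
        if stop_at_lt and ch == "<":
            return name, i          # legacy recovery: leave "<" unconsumed
        i += 1
    return name, i                  # ran off the end of the input


markup = "<p></p\n </div>\n PASS"
pos = markup.index("</p")

# Recovery as described for the HTML5 spec: "</div>" is swallowed.
spec_name, spec_next = scan_end_tag(markup, pos, stop_at_lt=False)
# Legacy WebKit/Firefox recovery: the tag ends before "</div>".
legacy_name, legacy_next = scan_end_tag(markup, pos, stop_at_lt=True)

print(repr(markup[spec_next:]))     # '\n PASS'
print(repr(markup[legacy_next:]))   # '</div>\n PASS'
```

With stop_at_lt=False the inner </div> never becomes a token of its
own, so the hidden <div> stays open and PASS ends up inside it.  With
stop_at_lt=True the </div> is tokenized normally and PASS is visible.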

Is there a particular reason why we don't terminate start and end tag
tokens upon encountering a "<" character?

[1] https://bugs.webkit.org/show_bug.cgi?id=40961

Received on Tuesday, 22 June 2010 21:37:59 GMT
