[Bug 9985] [parser] How to parse </foo </bar>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=9985


Henri Sivonen <hsivonen@iki.fi> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bzbarsky@mit.edu




--- Comment #4 from Henri Sivonen <hsivonen@iki.fi>  2010-09-15 11:36:04 ---
Hixie, do you have quantitative data showing that doing this the IE way is
better than doing this the old WebKit way (or the old Gecko way, which was
slightly different)?

Now it appears that Apple is adding app-specific hacks to the system WebKit
because of this. It's really sad if any vendor has to add app-specific hacks
because of this, but at least Microsoft already has chosen to bear the engine
versioning burden independently of this issue.

Interesting bits from IRC (see
http://krijnhoetmer.nl/irc-logs/whatwg/20100915)

    # [11:27] <hsivonen> othermaciej: what HTML5 parsing difference from old
WebKit is breaking mail apps with system WebKit?
    # [11:27] <othermaciej> hsivonen: <foo<foo>
    # [11:28] <hsivonen> othermaciej: how does Outlook deal?

    # [11:28] <hsivonen> does Outlook use the Word engine these days? does Word
parse differently from Trident?

    # [11:28] <othermaciej> I believe Outlook uses the Word engine

    # [11:30] <hsivonen> othermaciej: it would be good to know if the emails
are generated by an email app in the wild or if they are hand-crafted
advertisements

    # [11:30] <othermaciej> I believe at least some of them were produced by an
automated reporting system of some kind

    # [11:33] <hsivonen> annevk: wouldn't it then make sense to pick the
solution that sucks less for editors?
    # [11:34] <hsivonen> now we've picked the solution that makes the tokenizer
code simpler

    # [11:50] <zcorpan_>
http://html5.org/tools/web-apps-tracker?from=899&to=902 even
    # [11:56] <zcorpan_>
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2007-June/011804.html

    # [11:57] <zcorpan_> iirc, it was a web compat requirement to not close
script for </script<div>
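
To make the </script<div> point concrete, here is a minimal sketch of the
rule as I understand it: "</script" only ends the script when the next
character is whitespace, "/" or ">". The function name and the regex are
illustrative assumptions, not anyone's tokenizer code, and the sketch ignores
the escaped "<!--" script states entirely.

```python
import re

# "</script" closes script data only when followed by whitespace, "/" or ">".
# Any other character (for example "<") leaves the sequence as script text.
SCRIPT_END = re.compile(r"</script(?=[\t\n\f\r />])", re.IGNORECASE)


def script_text_length(data):
    """Return how many leading characters of `data` remain script text.

    `data` is the input that follows a <script> start tag. If no valid end
    tag is found, the whole input is script text.
    """
    match = SCRIPT_END.search(data)
    return match.start() if match else len(data)


sample = "var x = 1;</script<div>more</script>"
# The first "</script" is followed by "<", so it does not close the element;
# only the second one does.
closed_at = script_text_length(sample)
```

With this rule the first "</script" in the sample stays inside the script and
the element is closed by the later "</script>".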

    # [11:58] <othermaciej> annevk: it's by far the top source of breakage for
us (other than just plain implementation bugs, which are mostly now fixed)

    # [12:06] <zcorpan_>
http://www.gearthblog.com/blog/archives/2006/06/more_detail_on.html has </ul
<div> (and looks broken with html5 parser)

    # [12:07] <Philip`> http://philip.html5.org/data/gt-in-tag.txt has
<foo<foo>s in case anyone is looking for those
    # [12:07] <Philip`> (Ignore the filename, it lies)


    # [12:20] <annevk> e.g. for http://pageranking.cbgw-lensahn-slh.de/ HTML5
is better

    # [12:24] <othermaciej> some bad content had this in it: <style
type='text/css'td{width='60%' cellpadding='20%'}</style>
    # [12:25] <othermaciej> which ate the rest of the page instead of making an
empty style element with some bogus attributes

    # [12:38] <jgraham> You need to weight by badness of the problem of course
    # [12:39] <jgraham> Like eating the whole page on a few pages is worse than
slight issues on more pages

    # [12:46] <zcorpan_> if we change this, we need to investigate carefully
what to change to. old webkit and gecko don't agree in all cases (iirc) and
they don't make </script<div> close the script, iirc
    # [12:46] <hsivonen> what did Opera do in 2006?
    # [12:47] <hsivonen> it would suck to use circular reasoning to make HTML5
do something because of Opera if Opera changed to match HTML5

    # [12:50] <zcorpan_> oh, previously we parsed <p<div> as <p <div=""> i.e.
with an attribute "<div"
    # [12:52] <zcorpan_> so we were still closer to ie than gecko and webkit
for both <p<div> and <p <div>
    # [12:54] <zcorpan_> we fixed that in 2008 to match ie and html5
    # [12:59] <hsivonen> https://bugzilla.mozilla.org/show_bug.cgi?id=507498
    # [13:00] <hsivonen> https://bugzilla.mozilla.org/show_bug.cgi?id=510252
    # [13:00] <hsivonen> https://bugzilla.mozilla.org/show_bug.cgi?id=523516
    # [13:00] <hsivonen> https://bugzilla.mozilla.org/show_bug.cgi?id=543652
    # [13:01] <hsivonen> https://bugzilla.mozilla.org/show_bug.cgi?id=590416
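
For readers following the <p<div> and <foo<foo> cases above, a rough sketch
of the two tag-name strategies under discussion. The first function follows
the HTML5 tag name state as I read it: only whitespace, "/" and ">" end the
name, so "<" becomes part of it. The second is a hypothetical alternative in
which "<" ends the current tag and starts a new one; it is an assumption for
illustration and not a description of what any engine actually did. Both
function names are made up.

```python
WHITESPACE = "\t\n\f\r "


def _ascii_lower(ch):
    # Tag names are ASCII-lowercased; other characters are kept as-is.
    return ch.lower() if ch.isascii() else ch


def html5_tag_name(markup):
    """Tag name produced when '<' is just another tag-name character.

    Assumes `markup` starts with '<' followed by an ASCII letter.
    """
    name = []
    for ch in markup[1:]:
        if ch in WHITESPACE or ch in "/>":
            break
        name.append(_ascii_lower(ch))
    return "".join(name)


def split_on_lt_tag_names(markup):
    """Hypothetical alternative: '<' ends the current tag and opens another."""
    names = []
    current = []
    for ch in markup[1:]:
        if ch in WHITESPACE or ch in "/>":
            break
        if ch == "<":
            names.append("".join(current))
            current = []
        else:
            current.append(_ascii_lower(ch))
    names.append("".join(current))
    return names


one_tag = html5_tag_name("<p<div>")          # a single tag named "p<div"
two_tags = split_on_lt_tag_names("<p<div>")  # two tags: "p" and "div"
```

The difference only shows up when "<" directly follows the tag name; for
<p <div> both functions stop at the space and report "p".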

    # [13:11] <zcorpan_> also see
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2007-June/011891.html


Received on Wednesday, 15 September 2010 11:36:08 UTC