[Bug 9985] [parser] How to parse </foo </bar>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=9985





--- Comment #7 from Adam Barth <w3c@adambarth.com>  2010-09-15 19:25:07 ---
My thinking on this topic has evolved a bit since I filed this bug report. 
Here are my current thoughts:

== New data ==

1) This authoring error is somewhat common (at least based on data from grepping
the web), but different pages expect browsers to tokenize these cases
differently based on their audience.  The majority of pages on the public web
(and likely in most intranets) expect browsers to tokenize these cases the IE
way.

2) The cases we've seen where pages expect us to tokenize these cases the
WebKit way have almost exclusively been content authored only for WebKit.  For
example, I filed this report in response to a bug on macruby.org, which is a
web site hosted by Apple for folks with Mac computers, who are more likely to
run WebKit than to run IE.

3) The biggest problems with the spec's current tokenization have been in Mac
applications that use WebKit internally to render internally generated content.
 Most notably, AIM on Mac appears to rely on WebKit's legacy tokenization
behavior to function.  Apple also has problems on its internal web sites
because those web sites are almost exclusively viewed using WebKit.

4) In the rare cases we've seen on the public web where the spec's tokenization
causes problems, it's been easy to evangelize.  I attribute this to two reasons:
  A) The busted markup looks dumb.  In one case the author even apologized for
making that mistake.
  B) The pages are busted in the same way in IE.  By and large, folks want
their site to work in IE.

5) The area I'm most concerned about is the mobile web, by which I mean web
sites designed and optimized for mobile browsers.  Until recently, mobile
browsing was a pretty serious WebKit monoculture, which means these pages are
likely to expect the legacy WebKit tokenization.

== Takeaways ==

A) It seems likely the spec's current tokenization in this case is more
compatible with both the public web and with private intranets than the legacy
WebKit behavior.

B) It seems likely the spec's current tokenization in this case is less
compatible with the mobile web than the legacy WebKit behavior.

C) In order to not break legacy Mac applications that use WebKit internally,
Apple will need to ship the legacy WebKit tokenization to these applications. 
(My understanding is that it will be limited to the specific applications and
versions that are problematic.)

== Conclusions ==

This is going to cause pain either way.  We should pick a path that minimizes
long-term pain here, even if it means more pain in the short term.  I believe
the best road to less long-term pain is to match IE's tokenization in this
case, in no small part because I think it's unlikely that IE will change its
tokenization in the foreseeable future.


Received on Wednesday, 15 September 2010 19:25:13 UTC