Re: What problem is this task force trying to solve and why?

From: Henri Sivonen <hsivonen@iki.fi>
Date: Fri, 7 Jan 2011 02:26:01 -0800 (PST)
To: public-html-xml@w3.org
Message-ID: <1721551045.163212.1294395961814.JavaMail.root@cm-mail03.mozilla.org>

John Cowan wrote:
> Henri Sivonen scripsit:
> 
> > On Dec 20, 2010, at 17:50, David Carlisle wrote:
> > It sure has. Hixie ran an analysis over a substantial quantity of
> > Web pages in Google's index and found existing text/html content that
> > contained an <svg> tag or a <math> tag. The justification is making
> > the algorithm not break a substantial quantity of pages like that.
> 
> A number would be nice. One person's "substantial" is another person's
> "trivial", unfortunately.

From http://krijnhoetmer.nl/irc-logs/whatwg/20110105#l-805:
# [20:50] <Hixie> hsivonen: O-of-billions, but it was a long time ago now
# [20:50] <Hixie> hsivonen: though I still have a framed printout of one of the pages I found stuck to my fridge
# [20:51] <Hixie> what's more interesting than the number of pages scanned is the fraction of pages that had issues
# [20:51] <Hixie> iirc the number was small, but non-zero
# [20:52] <Hixie> the list of elements in the spec has html comments in the source listing which elements were found to be problematic and which were added just for completeness and to make writing parsers easier
# [20:52] <Hixie> (e.g. iirc h1 was a problem but h6 probably wasn't, but someone asked that they be treated the same to make parsers simpler)

(I think I'm "someone" on the last line.)

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Friday, 7 January 2011 10:27:05 GMT
