- From: Simon Pieters <simonp@opera.com>
- Date: Wed, 07 Jan 2015 08:52:46 +0100
- To: "whatwg@whatwg.org" <whatwg@whatwg.org>, "Julian Reschke" <julian.reschke@gmx.de>
On Tue, 06 Jan 2015 08:35:54 +0100, Julian Reschke <julian.reschke@gmx.de> wrote:

> On 2014-12-11 09:09, Simon Pieters wrote:
>> The spec's parsing rules of meta refresh causes infinite reloading on
>> some pages. In particular, the spec requires the "url=" to be present,
>> but there are pages that omit it. IE9 also requires "url=" apparently.
>> Gecko/Blink/WebKit allow "url=" to be omitted.
>>
>> For example, there is http://www.only-for-winners.com/ which has
>>
>> <meta http-equiv="refresh"
>> content="0;http://www.aldanitinetwork.com" />
>>
>> Clearly this is intended to redirect, not reload the current page after
>> 0 seconds.
>>
>>
>> SELECT page, COUNT(*) AS num
>> FROM [httparchive:runs.2014_08_15_requests_body]
>> WHERE page = url
>> AND mimeType CONTAINS "html"
>> AND REGEXP_MATCH(LOWER(body),
>> r"<meta\s+[^>]*http-equiv\s*=\s*[\"']?refresh")
>> AND REGEXP_MATCH(LOWER(body),
>> r"<meta\s+[^>]*content\s*=\s*[\"']?\s*\d+\s*;\s*[^\"'>]")
>> AND NOT REGEXP_MATCH(LOWER(body),
>> r"<meta\s+[^>]*content\s*=\s*[\"']?\s*\d+\s*;\s*url=")
>> GROUP BY page
>>
>> 23 rows.
>>
>> I also noticed that Gecko allows the number to be omitted. I only found
>> one page doing that and it was using <meta http-equiv="refresh"
>> content=";URL="> so it seems we can fail parsing for that case.
>>
>
> I hear (a) these pages have been broken in IE for a long time, and (b)
> only 23 (?) pages in your DB are found.

Right.

> So why not just leave them broken?

It's a worse user experience and it's a shorter path to interop to change IE.

-- 
Simon Pieters
Opera Software
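
For illustration, the lenient behaviour discussed in this thread (accept a target with or without "url=", fail when the number is omitted) could be sketched roughly as below. This is a simplified sketch and not the spec's algorithm: the function name `parse_meta_refresh` is hypothetical, only whole-second delays and a `;` separator are handled, and a missing or empty target is treated as a plain reload.

```python
import re
from typing import Optional, Tuple


def parse_meta_refresh(content: str) -> Optional[Tuple[int, Optional[str]]]:
    """Return (delay_seconds, target) or None if parsing fails.

    Hypothetical helper; a target of None means "reload the current page".
    """
    # Require a leading integer delay; content such as ";URL=" fails here.
    m = re.match(r"\s*(\d+)\s*(?:;\s*(.*))?$", content, re.DOTALL)
    if m is None:
        return None

    delay = int(m.group(1))
    rest = (m.group(2) or "").strip()
    if not rest:
        return delay, None  # no target given: plain reload

    # "url=" is optional (assumption: matching Gecko/Blink/WebKit leniency).
    prefixed = re.match(r"url\s*=\s*(.*)$", rest, re.IGNORECASE | re.DOTALL)
    target = prefixed.group(1) if prefixed else rest

    # Drop surrounding whitespace and stray quotes around the target.
    target = target.strip().strip("\"'")
    return delay, (target or None)
```

With this sketch, the content value from the example page, `0;http://www.aldanitinetwork.com`, yields a redirect rather than a reload, while `;URL=` fails to parse.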
Received on Wednesday, 7 January 2015 07:51:58 UTC