
[whatwg] Inconsistent behavior for empty-string URLs

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 9 Mar 2010 00:40:31 +0000 (UTC)
Message-ID: <Pine.LNX.4.64.1003050242120.21376@ps20323.dreamhostps.com>

On Mon, 7 Dec 2009, Nicholas Zakas wrote:
> 
> [...] I found that there are several instances where the browser will 
> make a second [request] to the page based on resolving empty-string URLs 
> in several tags.
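
The second request follows from ordinary relative-URL resolution: an empty reference resolves to the URL it is resolved against. A minimal sketch using Python's standard `urllib.parse.urljoin` to stand in for the browser's resolver; the example.com URLs and the `resolve` helper are hypothetical, chosen only for illustration:

```python
from urllib.parse import urljoin

# Hypothetical URLs, not taken from the thread.
document_url = "http://example.com/foo.html"     # the page being viewed
base_url = "http://example.com/dir/bar.html"     # what a <base href> might set


def resolve(reference, base):
    """Resolve an attribute value against a base URL."""
    return urljoin(base, reference)


# An empty src/href resolves to the base itself, so fetching it
# requests the page (or the <base> target) a second time.
print(resolve("", document_url))
print(resolve("", base_url))

# A non-empty reference resolves to a distinct resource.
print(resolve("logo.png", document_url))
```

Whether a given browser actually sends the request is a separate matter.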

On Mon, 7 Dec 2009, Aryeh Gregor wrote:
> 
> This is clearly not a good idea for <iframe>, since otherwise <iframe 
> src=""> is an instant infinite loop on a typical page.

On Tue, 15 Dec 2009, Nicholas Zakas wrote:
>
> Here's what I would propose:
> 
> Empty string attributes for HTML elements specifying resources to 
> automatically download are considered invalid and don't cause a request 
> to be sent.

On Tue, 15 Dec 2009, Jonas Sicking wrote:
> 
> I'd prefer to explicitly enumerate the elements we're talking about,
> rather than giving rules which risk being interpreted differently by
> different people. [...]
> 
> So the specific list would then be:
> 
> <img>
> <link>
> <script>
> <iframe>
> <video>
> <audio>
> <object>
> <embed>
> <source>
> <input type=image>
> 
> All of these would never attempt to fetch a resource if the src/href 
> attribute is empty (even if the current baseuri is different from the 
> document uri). However it would not act as if the attribute was not set 
> (important for <script>).
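
The rule proposed above can be sketched as two separate checks: whether the attribute is present, and whether a fetch should happen. This is only an illustration of the proposal, not spec text; the `URL_ATTRIBUTE` table and both function names are hypothetical, and mapping `<object>` to its `data` attribute is an assumption (the proposal itself says "src/href"):

```python
# Hypothetical mapping from the enumerated elements to their URL attribute.
URL_ATTRIBUTE = {
    "img": "src", "link": "href", "script": "src", "iframe": "src",
    "video": "src", "audio": "src", "object": "data", "embed": "src",
    "source": "src", "input": "src",
}


def has_url_attribute(tag, attributes):
    """True if the element carries its URL attribute at all, even if empty."""
    name = URL_ATTRIBUTE.get(tag)
    return name is not None and name in attributes


def should_fetch(tag, attributes):
    """True only if the URL attribute is present and non-empty."""
    if not has_url_attribute(tag, attributes):
        return False
    if tag == "input" and attributes.get("type") != "image":
        return False  # only <input type=image> fetches a resource
    return attributes[URL_ATTRIBUTE[tag]] != ""


# <script src=""> fetches nothing, yet still counts as having a src,
# so it is not treated like an inline script.
print(should_fetch("script", {"src": ""}))       # False
print(has_url_attribute("script", {"src": ""}))  # True
```

Keeping the two checks apart is what preserves the <script> distinction: an empty src suppresses the fetch without turning the element into an inline script.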

On Tue, 15 Dec 2009, Aryeh Gregor wrote:
>
> I'd say the rule should be that if the type is text/html or unknown, "" 
> should work.  If it's known to be some other type, like text/css, then 
> it should fail.  Alternatively, it should work for everything that 
> doesn't actually fetch a resource automatically.  After all, the 
> rationale for this whole change is that "" as a source for images and 
> such 1) makes no sense and is almost certainly an authoring mistake, and 
> 2) causes extra HTTP requests -- but neither is true for all <link>s.  
> For instance, <link rel=first href=""> makes perfect sense and causes no 
> extra requests, so I don't think it should be prohibited.

On Tue, 15 Dec 2009, Jonas Sicking wrote:
> 
> Interesting. I don't think we want to base it on the type attribute, 
> since that should generally be possible to leave out. But I can 
> certainly see a use for <link rel=sitemap href="">.
> 
> So maybe just apply the don't-download rule to rel=stylesheet (and 
> rel="stylesheet alternate" etc).

On Wed, 16 Dec 2009, Simon Pieters wrote:
> 
> I think only icon, prefetch and stylesheet links.
> 
> The following element defines two links, one of which would be ignored:
> 
>   <link rel="icon index" href>
> 
> [<video poster>?]
> <command icon>?
> <html manifest>?
> <applet code>? (Maybe not, since it's more of a parameter to the Java plugin.)
> <frame src>?
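
The <link rel="icon index" href> case above can be sketched by treating each rel token as its own link. A rough illustration, assuming the set of fetching link types is exactly the three named in the quote; `FETCHING_REL` and `classify_links` are hypothetical names:

```python
# Link types that would trigger an automatic fetch, per the suggestion above.
FETCHING_REL = {"icon", "prefetch", "stylesheet"}


def classify_links(rel, href):
    """Map each rel token to 'ignored' or 'kept' for the given href value."""
    result = {}
    for token in rel.lower().split():
        if href == "" and token in FETCHING_REL:
            result[token] = "ignored"   # empty URL: no request is made
        else:
            result[token] = "kept"      # hyperlink types may point at ""
    return result


print(classify_links("icon index", ""))
# {'icon': 'ignored', 'index': 'kept'}
```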

On Thu, 17 Dec 2009, Simon Pieters wrote:
> 
> I asked Philip to provide some data about pages using empty attributes 
> for these:
> 
> <Philip`> zcorpan: http://philip.html5.org/data/empty-url-attributes.txt
> <Philip`> zcorpan: http://philip.html5.org/data/empty-url-link-attributes.txt

On Thu, 17 Dec 2009, Nicholas Zakas wrote:
> 
> <img src="">
> IE 8 and earlier: makes a request
> FF 3 and earlier: makes a request
> FF 3.5: does not make a request
> Safari 4 and earlier: makes a request
> Chrome 3 and earlier: makes a request
> Opera 10 and earlier: does not make a request
> 
> <link href="">
> IE 8 and earlier: does not make a request
> FF 3.5 and earlier: makes a request
> Safari 4 and earlier: makes a request
> Chrome 3 and earlier: makes a request
> Opera 10 and earlier: does not make a request
> 
> <script src="">
> IE 8 and earlier: does not make a request
> FF 3.5 and earlier: makes a request
> Safari 4 and earlier: makes a request
> Chrome 3 and earlier: makes a request
> Opera 10 and earlier: does not make a request
> 
> <iframe src="">
> IE 8 and earlier: does not make a request
> FF 3.5 and earlier: does not make a request
> Safari 4 and earlier: does not make a request
> Chrome 3 and earlier: does not make a request
> Opera 10 and earlier: does not make a request

On Thu, 17 Dec 2009, Simon Pieters wrote:
> 
> Is the result different if the base URL is different from the document's URL?
> Is the result different if the value is "#"?

On Fri, 18 Dec 2009, Simon Pieters wrote:
> 
> http://simon.html5.org/dump/empty-url-attributes.xml
> 
> <img src>, 3221 occurrences
> <iframe src>, 1862 occurrences
> <body background>, 1665 occurrences
> <script src>, 248 occurrences
> <embed src>, 74 occurrences
> <input src>, 55 occurrences
> <frame src>, 53 occurrences
> <video src>, 0 occurrences
> <video poster>, 0 occurrences
> <audio src>, 0 occurrences
> <object data>, 0 occurrences
> <source src>, 0 occurrences
> <command icon>, 0 occurrences
> <html manifest>, 0 occurrences
> <applet code>, 0 occurrences
> 
> http://simon.html5.org/dump/empty-url-link-attributes.xml
> 
> <link rel=icon>, 243 occurrences
> <link rel=stylesheet>, 115 occurrences
> <link rel=prefetch>, 0 occurrences

On Fri, 18 Dec 2009, Simon Pieters wrote:
> 
> I've now looked at a selection of random URLs.
> 
> Conclusion: None of these seem to need a request to be made. img should 
> fire an error event. iframe and frame should use about:blank.

On Mon, 21 Dec 2009, Nicholas Zakas wrote:
>
> Here are the results of testing various tags with empty URLs across 
> different browsers. The table below indicates how many requests are sent 
> when the given tag is encountered on the page (curiously, Firefox 3 
> sometimes sends two extra requests). Even though the <link> tags don't 
> show it in the table, they all had href="".
> 
>                              IE7  IE8  FF3  FF3.5  SF4  Ch3  Op10
> <img src="">                 1    1    1    0      1    1    0
> <input type="image" src="">  1    1    1    0      1    1    0
> <object data="">             0    0    1    1      0    0    0
> <script src="">              0    0    1    1      1    1    0
> <link rel="stylesheet">      0    0    1    1      1    1    0
> <link rel="icon">            0    0    2    1      1    1    0
> <link rel="shortcut icon">   0    0    2    1      1    1    0
> <link rel="prefetch">        0    0    2    0      0    0    0
> <iframe src="">              0    0    0    0      0    0    0
> <embed src="">               0    0    0    0      0    0    0
> <html manifest="">           0    0    0    0      1    0    0
> 
> For the most part, no two browsers act the same. Safari and Chrome are 
> the closest (not surprising).
> 
> Applying a base URL via <base> in all cases didn't change the results, 
> except in IE, where it prevented the extra image request from being 
> made.

On Tue, 22 Dec 2009, Simon Pieters wrote:
> 
> Thanks. IIRC, IE doesn't make a request when using minimized attribute 
> syntax, i.e. "<img src>" (because it drops the attribute during 
> parsing).

On Thu, 7 Jan 2010, Nicholas Zakas wrote:
> 
> Given the disparate browser implementations for dealing with empty 
> string URLs, it seems unlikely that anyone is relying upon the current 
> behaviors, so I'd like to suggest this change be added to HTML5:
> 
> Any <img>, <link>, <script>, <iframe>, <audio>, <video>, <object>, 
> <embed>, <input>, <html manifest>, or <frame> tag that would result in 
> an automatic download of an external resource must ignore any 
> empty-string URL and not download the external resource. This is true 
> even when a <base href> is applied to the page.

On Mon, 7 Dec 2009, Jonas Sicking wrote:
> 
> Given that the concern is sites that accidentally leave an attribute 
> empty, wouldn't you want to prevent a request from going out even if the 
> base-uri is set? I.e. wouldn't you want to prevent a request from going 
> out for the current document:
> 
> foo.html:
> <head><base href="bar.html">
> <body><img src="">
> 
> It seems to me equally unlikely that someone would do that
> intentionally expecting a request to be sent to "bar.html"?

On Tue, 8 Dec 2009, Nicholas Zakas wrote:
>
> I'd agree with that; I have yet to find an example of someone 
> intentionally including an empty-string URL in one of these tags.

Done.

Note that as a side-effect, <link rel=index href=""> is now 
non-conforming, although <a rel=index href=""></a> is still ok. I couldn't 
find a sane way to work around that.


On Mon, 7 Dec 2009, Aryeh Gregor wrote:
> 
> The same goes for a URL that consists only of a fragment.  In fact, a 
> quick test in the browsers I had handy (Firefox 3.5 and Opera 9.22) 
> suggests that there are more elaborate protections against recursion 
> here.  Try saving these two files in the same directory with the names 
> "test1.html" and "test2.html", and viewing test1.html in a web browser:
> 
> <!doctype html>
> <p>1</p>
> <iframe src=test2.html>
> 
> <!doctype html>
> <p>2</p>
> <iframe src=test1.html>
> 
> Neither browser I tested with has an infinite loop here, although they 
> terminate at different steps: Firefox displays each page only once 
> (visible text is 1 2), while Opera displays test1.html twice (1 2 1). Is 
> this covered by the spec anywhere?

This falls into the "hardware limitations" clause.
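
For illustration only, one way a browser could break the test1.html / test2.html cycle is to refuse to load a frame whose URL already appears among its ancestors. This sketch is an assumption, not what the spec requires or what either tested browser is known to implement; `PAGES` and `render` are hypothetical names modelling the two files quoted above:

```python
# Each page is modelled as (visible text, iframe src).
PAGES = {
    "test1.html": ("1", "test2.html"),
    "test2.html": ("2", "test1.html"),
}


def render(url, ancestors=()):
    """Return the visible text in document order, skipping recursive frames."""
    text, frame_src = PAGES[url]
    shown = [text]
    chain = ancestors + (url,)
    # Load the nested frame only if it is not already in the ancestor chain.
    if frame_src is not None and frame_src not in chain:
        shown += render(frame_src, chain)
    return shown


print(" ".join(render("test1.html")))  # 1 2
```

Under this particular guard the visible text is "1 2". A guard that tolerates one repeated ancestor before stopping would give "1 2 1" instead.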

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 8 March 2010 16:40:31 UTC
