- From: Nicholas Zakas <nzakas@yahoo-inc.com>
- Date: Mon, 7 Dec 2009 12:05:08 -0800
Thanks for the references, this helps my understanding a lot.

The reason I think this is important is because the "just fetch the resource again" behavior is inherently destructive and unexpected. When one of these appears on a page, page views double. This isn't a problem if it's your personal blog, but for high-volume web sites such as Yahoo!, Google, and Facebook, a 100% increase in traffic causes a lot of problems. From conversations with engineers at other companies, it seems that we've all fallen victim to this behavior at one time or another.

I think one would argue that <img src=""> is unlikely markup as well, yet the spec currently provides guidance around this case. Wouldn't it make sense to be consistent across tags that act in a similar fashion?

-Nicholas

______________________________________________
Commander Lock: "Damnit Morpheus, not everyone believes what you believe!"
Morpheus: "My beliefs do not require them to."

-----Original Message-----
From: simetrical@gmail.com [mailto:simetrical@gmail.com] On Behalf Of Aryeh Gregor
Sent: Monday, December 07, 2009 11:44 AM
To: Nicholas Zakas
Cc: whatwg at lists.whatwg.org
Subject: Re: [whatwg] Inconsistent behavior for empty-string URLs

On Mon, Dec 7, 2009 at 1:51 PM, Nicholas Zakas <nzakas at yahoo-inc.com> wrote:
> Presently, HTML5 does provide guidance on the correct behavior for <img
> src=""> in section 4.8.2, indicating that Firefox 3.5's and Opera 10's
> behavior in this regard is correct:
>
> "If the base URI of the element is the same as the document's address, then
> the src attribute's value must not be the empty string."

That says that if it's the empty string, the document is invalid. It doesn't say what the UA has to do. The relevant part is:

[[
Unless . . . the element's src attribute has a value that is an ignored self-reference, then, when an img is created with a src attribute, and whenever the src attribute is set subsequently, the user agent must resolve the value of that attribute, relative to the element, and if that is successful must then fetch that resource.

. . .

The src attribute's value is an ignored self-reference if its value is the empty string, and the base URI of the element is the same as the document's address.
]]

This implies user agents don't need to resolve the src or fetch the element if the src is empty (unless the base URI is non-default). I don't think they're prohibited from doing so, since there's no detectable difference to their user-visible output -- likewise they might fetch resources speculatively even if not explicitly required to. It's kind of pointless, though.

The other cases seem to make no specific exception for an empty URL, so as far as I can tell, the UA must fetch them as usual -- although of course it might have a valid copy in the cache. This is clearly not a good idea for <iframe>, since otherwise <iframe src=""> is an instant infinite loop on a typical page. The same goes for a URL that consists only of a fragment.

In fact, a quick test in the browsers I had handy (Firefox 3.5 and Opera 9.22) suggests that there are more elaborate protections against recursion here. Try saving these two files in the same directory with the names "test1.html" and "test2.html", and viewing test1.html in a web browser:

<!doctype html>
<p>1</p>
<iframe src=test2.html>

<!doctype html>
<p>2</p>
<iframe src=test1.html>

Neither browser I tested with has an infinite loop here, although they terminate at different steps: Firefox displays each page only once (visible text is 1 2), while Opera displays test1.html twice (1 2 1). Is this covered by the spec anywhere?

I'm not sure it makes a difference whether <script src=""></script> or <link rel=stylesheet href=""> does anything special. It seems simpler to just leave them as-is, so they fetch the resource again (or retrieve it from cache if possible) and then probably throw it out as invalid (since it's HTML and not CSS/JS/etc.).

> I'm interested in what others' opinions on this may be, as this seems like
> an important area in which to gain consistency.

Why? It seems like fairly unlikely markup. Consistency is good, but I wouldn't call this point "important".
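
For readers following the thread, here is a rough Python sketch of the "ignored self-reference" rule quoted above, and of why an empty URL otherwise points back at the page itself. It is only an illustration, not how any browser implements this. The names PAGE, is_ignored_self_reference and resource_to_fetch, and the example.com addresses, are made up for the example; urljoin and urldefrag come from Python's standard urllib.parse module.

```python
from urllib.parse import urljoin, urldefrag

# Hypothetical document address, used only for illustration.
PAGE = "http://example.com/page.html"


def is_ignored_self_reference(src, base_uri, document_address):
    # Quoted rule: the value is the empty string AND the element's
    # base URI is the same as the document's address.
    return src == "" and base_uri == document_address


def resource_to_fetch(src, base_uri, document_address):
    # Returns the absolute URL that would be fetched, or None when the
    # fetch is skipped because src is an ignored self-reference.
    if is_ignored_self_reference(src, base_uri, document_address):
        return None
    return urljoin(base_uri, src)


# Resolving an empty string against the page's own address gives the
# page's address back, which is where the second page view comes from.
empty_resolved = urljoin(PAGE, "")

# A fragment-only URL names the same resource once the fragment is dropped.
fragment_resolved = urldefrag(urljoin(PAGE, "#top")).url

# With a non-default base URI the empty string is not an ignored
# self-reference, so it resolves to the base URI and would be fetched.
other_base = resource_to_fetch("", "http://example.com/assets/", PAGE)
```

Under these assumptions, resource_to_fetch("", PAGE, PAGE) returns None, while the same empty string resolved without the exception comes out as PAGE itself. That is the case the quoted text carves out for <img>, and that the other elements apparently do not get.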
Received on Monday, 7 December 2009 12:05:08 UTC