Re: parsing URI (references) according to RFC 3986

On 6/23/11 1:23 PM, Julian Reschke wrote:
> "When a same-document reference is dereferenced for a retrieval action,
> the target of that reference is defined to be within the same entity
> (representation, document, or message) as the reference; therefore, a
> dereference should not result in a new retrieval action."
>
> Test case: <http://greenbytes.de/tech/tc/uris/imgsame.html>
>
> You are right, Firefox does something bizarre; it indeed sends a GET
> request for http://greenbytes.de/tech/tc/uris/imgsame.html#foo (note
> that fragment identifier in the request URI).

Can't reproduce that part, and I don't know how you got that result (for 
example, which tool you used to record what Firefox sends, or which 
Firefox version you used).  In my case, using Firefox 4, the image is 
just read from cache.  But that's not the point.

> Now is the concern the fragment identifier (which seems to be a specific
> problem of Firefox), or the fact that browsers do a refetch?

Your testcase is not testing what really needs testing here.

What needs testing is a document at 
<http://greenbytes.de/tech/tc/uris/imgsame.html> with a <base 
href="http://greenbytes.de/tech/tc/uris/imgother.html"> and an <img 
src="#foo">.

Per section 4.4 as I read it, and per your comments just now, "#foo" is 
a same-document reference there (it resolves to the base URI plus a 
fragment), so that should load 
http://greenbytes.de/tech/tc/uris/imgsame.html as an image.  Does anyone 
do that?
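
A minimal JavaScript sketch of that reading of section 4.4: resolve the 
reference against the base URI, then compare the result with the base 
URI, ignoring fragments.  The helper names are made up for illustration, 
and the WHATWG URL class is used as a stand-in resolver; it is not an 
RFC 3986 parser, but it is assumed to resolve these simple references 
the same way.

```javascript
// Hypothetical helpers for illustration; not taken from any spec or library.

// Return an absolute URI with its fragment removed.
function withoutFragment(uri) {
  const u = new URL(uri);
  u.hash = "";
  return u.href;
}

// Section 4.4 as read above: a reference is a same-document reference if,
// once resolved, it is identical to the base URI aside from any fragment.
function isSameDocumentReference(reference, baseUri) {
  const resolved = new URL(reference, baseUri).href;
  return withoutFragment(resolved) === withoutFragment(baseUri);
}

// Base URI as set by <base href> in the proposed test document.
const baseUri = "http://greenbytes.de/tech/tc/uris/imgother.html";

const resolved = new URL("#foo", baseUri).href;
console.log(resolved);                                 // .../imgother.html#foo
console.log(isSameDocumentReference("#foo", baseUri)); // true
```

So the reference qualifies as same-document under that reading even 
though its resolved form names imgother.html, not the document that 
contains the <img>.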

> Or, leave it in (I'm pretty sure there's a good reason for it), but
> state that there may be cases where the context may require refetching
> the resource.

See above; the refetching is not the issue.

> (we do agree that the advice holds for <a> links inside HTML, right?)

Do we?

Consider the following HTML documents:

main.html:

   <!DOCTYPE html>
   <iframe src="subframe1.html"></iframe>
   <iframe name="subframe2" src="subframe2.html"></iframe>

subframe1.html:

   <!DOCTYPE html>
   <base href="subframe2.html">
   <a href="subframe2.html#foo" target="subframe2">Click me</a>

subframe2.html:

   <!DOCTYPE html>
   <base href="something-else.html">
   <script>alert('loaded')</script>
   Will I scroll?
   <div style="height: 5000px"></div>
   <div id="foo">Can you see me now?</div>

Now the user clicks that link in the first iframe.  What happens?  How 
is one supposed to even apply 4.4 here?  Which URI is "the base URI"? 
Which entity is the reference "in"?  How can the URI spec even define this?
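
The candidate answers can at least be enumerated.  A sketch, assuming 
the three files live under a hypothetical http://example.org/frames/ and 
using the WHATWG URL class for resolution; every name here is for 
illustration only.

```javascript
const dir = "http://example.org/frames/"; // hypothetical location

// URIs involved in the click, all derived from the documents above.
const linkDocUri = new URL("subframe1.html", dir).href;        // holds the <a>
const linkBaseUri = new URL("subframe2.html", linkDocUri).href; // its <base href>
const targetDocUri = new URL("subframe2.html", dir).href;      // document in the target frame
const targetBaseUri = new URL("something-else.html", targetDocUri).href; // its <base href>

// Return an absolute URI with its fragment removed.
function stripFragment(uri) {
  const u = new URL(uri);
  u.hash = "";
  return u.href;
}

// The href is resolved against the base URI of the document holding the link.
const resolved = new URL("subframe2.html#foo", linkBaseUri).href;
const target = stripFragment(resolved);

const candidates = {
  "base URI of the linking document": linkBaseUri,
  "URI of the linking document": linkDocUri,
  "URI of the target frame's document": targetDocUri,
  "base URI of the target frame's document": targetBaseUri,
};

// Report which candidates equal the resolved reference, ignoring the fragment.
for (const [label, uri] of Object.entries(candidates)) {
  console.log(label + ":", uri === target);
}
```

Two of the four candidates match, and they point at different entities: 
the base URI of the linking document would put the target inside 
subframe1.html, while the URI of the target frame's document would put 
it inside subframe2.html.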

For what it's worth, browsers seem to interoperably scroll to the <div 
id="foo"> without triggering a new alert, but I can't see any 
justification in the spec for this.  And if I change subframe1.html as 
follows:

   <!DOCTYPE html>
   <base href="subframe2.html">
   <a href="subframe2.html#foo"
      onclick="parent.subframe2.location = this.href; return false;">
     Click me
   </a>

I get the same behavior.  And if I do this:

   <!DOCTYPE html>
   <base href="subframe2.html">
   <a href="subframe2.html#foo"
      onclick="parent.subframe2.location = 'subframe2.html#foo';
               return false;">
     Click me
   </a>

I still get the same behavior.

On the other hand, if I create an HTML document like this (located at 
"test.html" at some path):

   <!DOCTYPE html>
   <base href="not-test.html">
   <script>alert('loaded')</script>
   <a href="#foo">Will I scroll?</a>
   <div style="height: 5000px"></div>
   <div id="foo">Can you see me now?</div>

then in all browsers clicking "Will I scroll?" loads not-test.html 
instead of scrolling.
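
That result is consistent with a check against the document's own URI 
and not the base URI.  A sketch contrasting the two comparisons, 
assuming a hypothetical location http://example.org/path/test.html; it 
models the observation above and is not any browser's actual code.

```javascript
// Return an absolute URI with its fragment removed.
function stripFragment(uri) {
  const u = new URL(uri);
  u.hash = "";
  return u.href;
}

// Section 4.4 reading: compare the resolved reference with the base URI.
function sameDocumentPerBaseUri(reference, baseUri) {
  const resolved = new URL(reference, baseUri).href;
  return stripFragment(resolved) === stripFragment(baseUri);
}

// Model of the observed behavior: compare with the document's own URI.
function sameDocumentPerDocumentUri(reference, baseUri, documentUri) {
  const resolved = new URL(reference, baseUri).href;
  return stripFragment(resolved) === stripFragment(documentUri);
}

const documentUri = "http://example.org/path/test.html";    // hypothetical
const baseUri = new URL("not-test.html", documentUri).href; // from <base href>

console.log(sameDocumentPerBaseUri("#foo", baseUri));                  // true
console.log(sameDocumentPerDocumentUri("#foo", baseUri, documentUri)); // false
```

Under the first comparison the click would only scroll; under the second 
it is a navigation to not-test.html#foo, which matches what the browsers 
do.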

So as far as I can tell, section 4.4 is more or less ignored by browsers.

-Boris

Received on Thursday, 23 June 2011 18:05:42 UTC