[whatwg] Document's base URI should use the document's *current* address from Ian Hickson on 2012-02-15 (public-whatwg-archive@w3.org from February 2012)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 15 Feb 2012 20:50:28 +0000 (UTC)
Message-ID: <Pine.LNX.4.64.1202151957350.11170@ps20323.dreamhostps.com>
On Wed, 20 Jul 2011, Justin Lebar wrote:
> >
> > The spec as written decides whether a link is a same-resource 
> > reference or not based on comparing the URLs to what you're calling 
> > the original address, not comparing it to the current address. See the 
> > navigation algorithm, step 7 /Fragment identifiers/.
> 
> Maybe I'm misunderstanding, but this might not be the case in the 
> history traversal algorithm.

In history traversal, the URLs compared are those of the entries involved. 
However, clicking a link is primarily navigation, not session history 
traversal (though it can involve the latter).


> > Step 6: If the specified entry has a URL whose fragment identifier 
> > differs from that of the current entry's when compared in a 
> > case-sensitive manner, and the two share the same Document object, 
> > then let hash changed be true.
> 
> It's not clear to me what the current/specified entry's URL is, or where 
> this is properly defined, but earlier, we say:

Hm, yes, the spec doesn't quite clearly define the URL in all cases. 
Fixed.


> > The current entry is usually an entry for the location of the 
> > Document.

That's a non-normative statement. I've made it more explicitly so.


> and the document's location changes when we call push/replaceState.

The current entry is whatever the algorithms last set the current entry 
to. I've made that clearer in the spec.


> >> As currently specified, we'll resolve #foo relative to the document's 
> >> original URL; that is, clicking the link will take the user to 
> >> page.html#foo, not page2.html#foo.  But the intent of a link with 
> >> href #foo is clearly to navigate within the current page, not to go 
> >> somewhere else.
> 
> Were you saying that this isn't the right interpretation of the spec? 
> Because #foo is resolved relative to the document's base URI, which is 
> the same as the document's original URI, so we decide that #foo is a 
> same-document link?  That's comforting, if it's true.  :)

When you click a link to "#foo" on a document whose "current address" is 
"page2.html" but whose "document's address" is "page.html", you go 
through these steps:

 - Start the "Follow a hyperlink" algorithm.
 - "Resolve" href relative to the <a> element.
 - This uses XML Base, with the fallback base URL being "the document's 
   address", which is what you were calling "the original URL".
 - This results in ".../page.html#foo".
 - "Navigate" to that URL.
 - Step "Fragment identifiers" then compares this URL to "the document's 
   address" (page.html, not page2.html), and finds a match.
 - "Navigating to a fragment identifier" is invoked and creates a new 
   session history entry with the URL "page.html#foo".
 - "Traverse the history" is then invoked.
 - It sets "the document's current address" to ".../page.html#foo".
 - Scrolling happens.
 - The "current entry"'s URL is ".../page2.html" and the specified entry's 
   URL is ".../page.html#foo" so the fragids differ and hashchange fires.
 - The "current entry" becomes the new specified entry.
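
The steps above can be sketched roughly as follows. This is only an 
illustration using Python's urllib.parse, not the spec's algorithm; the 
example.com URLs and the follow_fragment_link name are hypothetical, and 
it assumes both history entries share the same Document object.

```python
from urllib.parse import urljoin, urldefrag

# Hypothetical URLs; example.com and the /dir/ path are assumptions.
DOCUMENT_ADDRESS = "http://example.com/dir/page.html"    # "the document's address"
CURRENT_ENTRY_URL = "http://example.com/dir/page2.html"  # current entry after pushState()


def follow_fragment_link(href):
    """Return (resolved URL, treated as same-document?, hashchange fires?)."""
    # Resolve href against the fallback base URL, i.e. the document's address.
    resolved = urljoin(DOCUMENT_ADDRESS, href)

    # "Fragment identifiers" step: compare the URLs with fragments removed.
    same_document = urldefrag(resolved).url == urldefrag(DOCUMENT_ADDRESS).url

    # History traversal: compare the fragment of the specified entry with
    # that of the current entry (assumed here to share one Document).
    specified_fragment = urldefrag(resolved).fragment
    current_fragment = urldefrag(CURRENT_ENTRY_URL).fragment
    hash_changed = same_document and specified_fragment != current_fragment

    return resolved, same_document, hash_changed


if __name__ == "__main__":
    print(follow_fragment_link("#foo"))
```

Note that the resolved URL ends in page.html#foo rather than 
page2.html#foo, because the current entry's URL never takes part in 
resolving the link.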


> > Note that there are problems with what you describe: what if the new 
> > URL has a different path, and there are <img> elements whose URLs are 
> > relative, and after pushState() you clone one? Or what about relative 
> > links in the original markup? I don't think we can change the base URL 
> > on the fly, all kinds of problems could result.
> 
> I agree there are problems with changing the base URI.  But it seems 
> much less intuitive for common use-cases not to change it.  We can 
> change my example above to use ?foo instead of #foo, and I think the 
> same argument applies.  Should a link with href ?foo always resolve 
> relative to the document's original URI (unless the base is explicitly 
> changed)?

Yes, I'd say so. Otherwise cloning images would break.


> Similarly, if for some bizarre reason the page pushState's to a new 
> directory, shouldn't all the links point relative to that new directory?

That would break all existing images, stylesheets, scripts, etc, if their 
URLs are reused somehow.
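
To make the breakage concrete, here is a small sketch (again with 
Python's urllib.parse, and with hypothetical example.com URLs and file 
names) of how the same relative references would resolve if the base 
URL followed a pushState() into another directory.

```python
from urllib.parse import urljoin

# Hypothetical addresses: the page as loaded, and a pushState() target
# in a different directory.
ORIGINAL = "http://example.com/a/page.html"
PUSHED = "http://example.com/b/c/page2.html"

# Relative references as they might appear in the original markup.
REFS = ["?foo", "img/logo.png", "style.css"]


def resolve_all(base, refs):
    """Resolve each relative reference against the given base URL."""
    return {ref: urljoin(base, ref) for ref in refs}


before = resolve_all(ORIGINAL, REFS)  # base stays at the original address
after = resolve_all(PUSHED, REFS)     # base changed on the fly

if __name__ == "__main__":
    for ref in REFS:
        print(ref, "|", before[ref], "|", after[ref])
```

With the changed base, img/logo.png and style.css point into /b/c/, so 
reusing those relative URLs after the pushState() (for example when 
cloning an <img>) would fetch from the wrong place.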


> I kind of think this ship has sailed wrt implementations.  Chrome and 
> Firefox both have the same behavior in this respect.  See 
> http://people.mozilla.org/~jlebar/whatwg/test_pushstate_resolve.html 
> (source included below, since I have a bad habit of deleting these test 
> files right before someone else wants to look at them).
> 
> Ian, how hard do you think it would be to spec changing the base and 
> resolve the issues with that?

Changing the base URL would be trivial, but I think it would cause all 
kinds of bad things and isn't what we should do. Consider:

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1342

It doesn't make sense that the second image is broken.

(For some reason in Firefox I get an exception. Not sure if I'm misusing 
the API or if it's a bug in Firefox.)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 15 February 2012 12:50:28 UTC