- From: Jonas Sicking <jonas@sicking.cc>
- Date: Tue, 01 May 2007 17:08:45 -0700
Ian Hickson wrote:
> On Sun, 11 Feb 2007, Geoffrey Sneddon wrote:
>> Safari 2.0.4/419.3: (1) Inserted in DOM (in the innerHTML location).
>> Firefox 2.0.0.1: (3) Inserted in DOM (in the innerHTML location).
>> IE/Mac 5.2.3: (2) (anyway to view the DOM tree?)
>> Opera 9.10: (1) DOM Snapshot for some reason isn't working.
>> IE6/Win: (2) The new <base> never appears in DOM, but the full absolute URLs
>> are in the DOM.
>> IE7/Win: (3) The new <base> never appears in DOM, but the full absolute URLs
>> are in the DOM.
>>
>> In conclusion, Safari and Opera change all the links, IE5/Mac and
>> IE6/Win both change links within the fragment, and Firefox and IE7/Win
>> don't change any links.
>
> The latter is the option I'm following for now. Note that browsers all do
> _different_ things for target="" than for href="". The spec has made them
> act the same for now. I'm not sure this is workable, we'll have to see
> when the browser vendors try to get this interoperable. I can't imagine
> that it's a huge issue given that the browsers are so far from each other
> in terms of what they do here. I'm going to do a study of some subset of
> the Web to see how common this is (at least the static case; I can't
> really do much about the scripted case).

I don't think this is a good solution, actually. In general, I think it's
good to always make the DOM reflect the behavior of the document. I.e., it
shouldn't matter how you arrived at a specific DOM, be it through parsing
of an incoming HTML stream or by using DOM-Core calls. Whenever we make an
exception to that rule, I think we need to have a good reason for it.

For quirky <base> behavior, it is my experience that what matters most is
which URI things in a static page are resolved against. Most modern pages
that use scripting and the DOM usually have only zero or one <base>
element, and it lives in the head.

What I suggest is that we make the first or last <base> element in the
<head> be the one that sets both the base target and the base href for the
document (modulo all the special handling needed when <base>s appear in the
body, described below). While this is not what IE or Firefox does today, I
doubt that it'll break enough pages to justify straying from the
act-like-the-DOM-looks principle.

Currently Mozilla uses the last <base> that appears in <head>. There
doesn't appear to be a reason for using the last rather than the first;
it's just what we've always done. However, it would be interesting to know
what IE uses here, since it might matter. Did Safari or Opera run into any
issues here?

One thing we unfortunately will have to deal with is <base> elements
appearing in the middle of the body of the document. What Mozilla had to
do was this: once we find a <base> element in the body of the document, we
tell the parser to remember the resolved href and/or target of that <base>
element. Then, for any element that uses base URIs (full list at [1]), we
set an internal member on the element that hardcodes the element's base URI
and/or base target.

For elements that don't get this property set on them, base href and target
resolution works as normal. For elements that do have it set, base href and
target resolution uses only the set properties.

Note that you only set the saved href and target in the parser if the
corresponding attribute is set on the <base> element. So if a document
contains <base target="foo"> in the middle of the body, that does not set a
saved href in the parser.

This algorithm is something we had to add to Firefox in order to support
many pages out there. I think IE7 changed how they dealt with this, though
I don't know the specifics of how it changed. It would be interesting to
get their feedback on this.
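
The body-<base> handling described above can be sketched roughly as follows. This is only an illustration in Python, not Mozilla's actual code: the names Element, ParserSketch, saved_href, saved_target, resolve_href and base_target are invented for the example. It also assumes that a body <base href> is resolved against the document's own URI, and, for simplicity, it stamps the saved values onto every element rather than only the elements in the list at [1].

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urljoin


@dataclass
class Element:
    tag: str
    attrs: dict = field(default_factory=dict)
    # Hypothetical "internal members" that the parser hardcodes per element.
    saved_href: Optional[str] = None
    saved_target: Optional[str] = None


class ParserSketch:
    """Remembers the href/target of <base> elements seen in the body."""

    def __init__(self, document_uri: str):
        self.document_uri = document_uri
        self.saved_href: Optional[str] = None
        self.saved_target: Optional[str] = None

    def see_body_element(self, el: Element) -> Element:
        if el.tag == "base":
            # Only remember an attribute that is actually present.
            if "href" in el.attrs:
                # Assumption: resolve against the document's own URI.
                self.saved_href = urljoin(self.document_uri, el.attrs["href"])
            if "target" in el.attrs:
                self.saved_target = el.attrs["target"]
            return el
        # Stamp whatever has been saved so far onto the element.
        el.saved_href = self.saved_href
        el.saved_target = self.saved_target
        return el


def resolve_href(el: Element, document_base: str) -> str:
    # A hardcoded base wins; otherwise resolution works as normal.
    base = el.saved_href if el.saved_href is not None else document_base
    return urljoin(base, el.attrs["href"])


def base_target(el: Element, document_target: Optional[str]) -> Optional[str]:
    return el.saved_target if el.saved_target is not None else document_target


DOC_URI = "http://example.com/dir/page.html"
parser = ParserSketch(DOC_URI)
a1 = parser.see_body_element(Element("a", {"href": "x.html"}))
parser.see_body_element(Element("base", {"target": "foo"}))  # no href saved
a2 = parser.see_body_element(Element("a", {"href": "x.html"}))
parser.see_body_element(Element("base", {"href": "/other/"}))
a3 = parser.see_body_element(Element("a", {"href": "x.html"}))
```

In this sketch a1 resolves as normal against the document base, a2 picks up only the saved target, and a3 picks up both the saved href and the earlier saved target.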

[1] http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/content/html/document/src/nsHTMLContentSink.cpp&rev=3.787#799

> On Tue, 10 Apr 2007, Jonas Sicking wrote:
>> Note that the current text isn't implementable since it says that
>> relative uris in <base> should be resolved against the base uri
>> document, but the <base> element modifies that base uri so there is a
>> circular dependency.
>
> No, the <base> element sets the "document entity's base URI", and is
> resolved relative to the "base URI from the encapsulating entity" or the
> "URI used to retrieve the entity". See RFC2396.

Ah, the "base" part of "base URI from the encapsulating entity" confused
me. Any chance we can remove that or is that the language RFC2396 uses?

/ Jonas
Received on Tuesday, 1 May 2007 17:08:45 UTC