Re: parsing URI (references) according to RFC 3986

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Sun, 03 Jul 2011 20:53:35 +0200
To: Boris Zbarsky <bzbarsky@MIT.EDU>
Cc: public-iri@w3.org
Message-ID: <ds811797s766brue30c5gifkbis8ag0ogv@hive.bjoern.hoehrmann.de>

* Boris Zbarsky wrote:
>The problem is that per 4.4 as I understand it this HTML:
>
>   <!DOCTYPE html>
>   <base href="http://greenbytes.de/tech/tc/uris/imgother.html">
>   <img src="#foo">
>
>located at <http://greenbytes.de/tech/tc/uris/img.html> should treat the 
>image load as a load of <http://greenbytes.de/tech/tc/uris/img.html> 
>(because this is a same-document URI reference) whereas this is not what 
>any browser does.  Browsers treat it as a load of 
><http://greenbytes.de/tech/tc/uris/imgother.html>.

There is also the interpretation that the browser should render the foo
part of the current document as it is currently loaded in place of the
img element (that in turn might require already having rendered the img
element without a way to break the circular dependency). Let's try this:

  <!-- document at http://org.example.org -->
  <base href='http://com.example.com'>
  <p><a href='http://org.example.org#x'>...</a>
  <p><a href='http://com.example.com#x'>...</a>
  <p><a href='#x'>...</a>

If you click the second link so the browser dereferences the URL for a
retrieval action, and uses the address in the base element as base for
that purpose, then it would find this to be a same-document reference.
The first link is different from the base, so it's not a same-document
reference. The third link would be made absolute using the base element
so it's like the second link.

Browsers do it the other way around. The first link is a same-document
reference as it matches the document's address, the second does not, so
it goes elsewhere, and the third is resolved using the base element and
thus handled like the second link. So quite obviously there are two base
addresses that are considered, it's not like the base element changes
"the" base address.
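
To make the two readings concrete, here is a minimal Python sketch using
the standard library's urllib.parse. The function name classify and the
plain string comparison of defragmented URLs are assumptions made for
illustration; no browser or specification defines the check this way, and
no path normalization is attempted.

```python
from urllib.parse import urljoin, urldefrag

# Addresses taken from the example above.
DOC = "http://org.example.org"    # where the document was loaded from
BASE = "http://com.example.com"   # value of the base element

def classify(ref, doc_url=DOC, base_url=BASE):
    """Resolve ref against the base element, then test it two ways.

    Hypothetical helper: compares URLs with the fragment stripped,
    as plain strings, without any normalization.
    """
    absolute = urljoin(base_url, ref)
    target = urldefrag(absolute)[0]
    return {
        "absolute": absolute,
        # Reading 1: the base element is "the" base for the same-document test.
        "same_document_vs_base": target == urldefrag(base_url)[0],
        # Reading 2: the test uses the document's own address (browsers).
        "same_document_vs_document": target == urldefrag(doc_url)[0],
    }

if __name__ == "__main__":
    for ref in ("http://org.example.org#x", "http://com.example.com#x", "#x"):
        print(ref, classify(ref))
```

Under the first reading the second and third links come out as
same-document references; under the second reading only the first link
does, which is the browser behaviour described above.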

For this particular example we could say that you decide whether a
reference is a same-document reference using the document's base
address, and that the base element modifies something like each
element's base: when resolving something, you first make the reference
absolute with the element's base and then resolve it using the
document's base.

The main problem with such a model is that it breaks XInclude and
similar technologies, as I understand it anyway. There the idea is very
much that you can make fragments that have both same-document
references and resource-relative references. Think SVG fragments:

  ...
  <defs>
    <linearGradient id='...' ...
  ...
  <rect fill='url(#...)' ...
  ...
  <image xlink:href='example.png' ...

Now you have a couple of those and use SVG+XInclude to put them all in
one file. XInclude attempts to address this by having an "xml:base
fixup" step that adds xml:base attributes so "example.png" is found in
the right location, but, as I recall it anyway, that fixup is not meant
to affect the same-document reference in the fill attribute.

If you had

  <svg xml:base='...' ...
    ...
    <rect fill='url(#...)' ...

You would not want that to result in a request to the xml:base address.
With that in mind you get, for the initial <img> example, the
interpretation I offered above. As I said elsewhere, this is a conflict
between people who want easy copy and paste and people who want to
shorten URIs. You can't use the simple element base / document base
distinction here, as there is more than one document you make
same-document references to.
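
The fill example can be sketched in Python. The function below is
hypothetical: it encodes only the wish stated above, that a fragment-only
reference never causes a request to the xml:base address, and it takes
the containing document as a given, which is exactly the part that stays
unclear once several documents are merged into one. The URLs are made up
for the illustration.

```python
from urllib.parse import urljoin, urldefrag

def resolve_reference(ref, containing_doc, element_base):
    """Return (absolute_url, is_same_document).

    Assumption for illustration: a reference consisting only of a
    fragment ignores the element's base (e.g. xml:base) entirely.
    """
    if ref.startswith("#"):
        # Stay within the containing document; no retrieval is implied.
        return urldefrag(containing_doc)[0] + ref, True
    # Anything else is resolved against the element's base as usual.
    return urljoin(element_base, ref), False

if __name__ == "__main__":
    doc = "http://example.org/all.svg"       # merged file (made up)
    base = "http://example.org/parts/a.svg"  # xml:base added by fixup (made up)
    print(resolve_reference("#grad", doc, base))
    print(resolve_reference("example.png", doc, base))
```

With these inputs the gradient reference stays in the merged file while
"example.png" is looked up next to the included part.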

Where would that leave us then, though? What's desired is that we know
for each address that may be relative how to make it absolute and how to
tell, keeping that model, whether it's a same-document reference (maybe
we need to have same-fragment references instead?) and we want to spend
as little effort as possible on providing such definitions explicitly.
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Sunday, 3 July 2011 18:54:09 UTC
