Re: same-document references

On Jun 23, 2011, at 9:24 AM, Boris Zbarsky wrote:
> 
> My concern is RFC 3986 section 4.4, which says:
> 
>   When a URI reference refers to a URI that is, aside from its fragment
>   component (if any), identical to the base URI (Section 5.1), that
>   reference is called a "same-document" reference.
> 
> and then says:
> 
>   When a same-document reference is dereferenced for a retrieval
>   action, the target of that reference is defined to be within the same
>   entity (representation, document, or message) as the reference
> 
> What that means is that it's impossible to implement section 4.4 by canonicalizing all URIs into absolute URIs.  All URIs used by a system have to remember whether they were "same-document" references (which you only know at URI creation time) and if so need to know what document they're associated with to be properly retrieved.

No.  In fact, what it means is the exact opposite of your conclusion
and I have no idea how you managed that.

It is specifically defined in terms of comparison of absolute (and
possibly canonicalized) URIs.  It is written that way specifically
because folks (including all of the major browser vendors at the time)
wanted fragment-only references to remain same-document, regardless of
whether the base URI is changed and *after* being transformed to absolute.
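
To make the comparison concrete, here is a minimal sketch of the test
described in the quoted 4.4 text: resolve the reference against the base
URI, then compare the two with fragments removed.  The function name
is_same_document and the example.com URIs are hypothetical, chosen only
for illustration.  The sketch uses Python's urllib.parse and assumes
plain string equality; it does no canonicalization (case or
percent-encoding normalization), which a real processor might apply
before comparing.

```python
from urllib.parse import urljoin, urldefrag


def is_same_document(reference: str, base: str) -> bool:
    """Hypothetical helper: True if `reference`, once resolved to absolute
    form against `base`, equals `base` aside from the fragment component."""
    # Transform the reference to absolute form first.
    target = urljoin(base, reference)
    # Compare both URIs with their fragments stripped.
    # Assumption: no normalization beyond what urljoin performs.
    return urldefrag(target).url == urldefrag(base).url


if __name__ == "__main__":
    base = "http://example.com/dir/doc"
    for ref in ["#sec", "", "doc#sec", "http://example.com/dir/doc#x", "other"]:
        print(repr(ref), is_same_document(ref, base))
```

Note that the check runs on the absolute form, so a fragment-only
reference and a fully spelled-out URI naming the same resource are
treated alike.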

Julian pointed out the past issue tracker summary.  Here is some
of the discussion:

  http://lists.w3.org/Archives/Public/uri/2003Jun/0070.html

The text changed from RFC 2396 to RFC 3986 because the fragment
was returned to being "part of the URI" and because the prior
(Mar 1998) text was specific to HTML href handling.  The other
document formats that also use the URI standard had different
interpretations of what same-document meant.  I had to resolve the
issues accordingly, and so I resolved them in favor of the
then-existing browser implementations.

Unfortunately, the W3C archive search doesn't work for anything
prior to 1999.

> This is not interoperably implemented by UAs, last I checked, and you were asking about things that cause interop problems.

Then let's test it and find out what people implement.  I don't
like the idea of changing full standards based on vague recollections.
Please be specific with regard to what versions of browsers (or
other URI processors) have been tested.

On Jun 26, 2011, at 2:05 AM, Anne van Kesteren wrote:

> On Sat, 25 Jun 2011 21:48:19 +0200, Larry Masinter <masinter@adobe.com> wrote:
>> (sending HTML attachment with hopes it won't get mangled by mail distribution).
>> 
>> A same-document reference should work within HTML content that itself doesn't have a stable URI. That was the whole point of introducing the special case for same-document references.
> 
> And it would always work fine if the base URL was not set in the document. The question is whether it should still work if the base URL is different from the document URL and what the expected behavior is for all the contexts that accept links that are not <a>.

Indeed.  The process for making changes to an internet standard starts
with describing why the change is being requested, how the change will
impact known deployed practice (not just browsers), and an Errata
request in the form of a diff to the RFC.

....Roy

Received on Sunday, 3 July 2011 00:36:29 UTC