same-document references

Hi there,

Boris and I took our conversation yesterday about same-document 
references offline, and I *think* we came to some common conclusions. 
Note, however, this write-up is entirely mine.

The text we are discussing is in 
<http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.4.4>:

-- snip --
4.4. Same-Document Reference

When a URI reference refers to a URI that is, aside from its fragment 
component (if any), identical to the base URI (Section 5.1), that 
reference is called a "same-document" reference. The most frequent 
examples of same-document references are relative references that are 
empty or include only the number sign ("#") separator followed by a 
fragment identifier.

When a same-document reference is dereferenced for a retrieval action, 
the target of that reference is defined to be within the same entity 
(representation, document, or message) as the reference; therefore, a 
dereference should not result in a new retrieval action.

Normalization of the base and target URIs prior to their comparison, as 
described in Sections 6.2.2 and 6.2.3, is allowed but rarely performed 
in practice. Normalization may increase the set of same-document 
references, which may be of benefit to some caching applications. As 
such, reference authors should not assume that a slightly different, 
though equivalent, reference URI will (or will not) be interpreted as a 
same-document reference by any given application.
-- snip --
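
To make the definition above concrete, here is a minimal sketch in 
Python of the comparison it describes: resolve the reference against the 
base URI, drop the fragment from both, and compare the results. The 
function name is_same_document and the example.com URIs are assumptions 
made for illustration only. No normalization (Sections 6.2.2 and 6.2.3) 
is applied, which 4.4 permits but does not require.

```python
from urllib.parse import urljoin, urldefrag


def is_same_document(reference: str, base_uri: str) -> bool:
    """Return True if `reference` is a same-document reference per 4.4.

    Hypothetical helper: resolves the reference against the base URI,
    then compares the two URIs with their fragments removed. The
    comparison is a plain string comparison without normalization.
    """
    target = urljoin(base_uri, reference)
    return urldefrag(target).url == urldefrag(base_uri).url


base = "http://example.com/a/b"
print(is_same_document("", base))       # empty reference
print(is_same_document("#frag", base))  # fragment-only reference
print(is_same_document("c", base))      # resolves to a different URI
```

Note that the comparison is against the base URI, not against the URI 
the document was retrieved from; the two differ whenever the document 
overrides its base.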

Observations:

a) Some readers and implementations get this wrong

People seem to have a hard time understanding how the text in 4.4 (and 
the referenced 5.1 about base URIs) applies to certain media types, such 
as HTML (<base> element) or SVG (xml:base attribute).

It would probably be helpful if definitions of media types made it clear 
how to compute the base URI for any given reference in the document, and 
also how changes to the document after load affect that (for instance, 
when the DOM is modified).
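
As an illustration of what "computing the base URI for any given 
reference" could look like for an XML-based format, the following rough 
sketch walks a parsed document and tracks the base URI in effect at each 
element by applying xml:base attributes on the way down. The function 
name effective_bases, the sample markup, and the example.com/example.org 
URIs are all assumptions; this is not taken from any media type 
definition, and it ignores DOM changes after load.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def effective_bases(element, inherited_base):
    """Yield (element, base URI in effect) for element and descendants.

    Hypothetical helper: an xml:base attribute is resolved against the
    base inherited from the parent; elements without one inherit as-is.
    """
    base = urljoin(inherited_base, element.get(XML_BASE, ""))
    yield element, base
    for child in element:
        yield from effective_bases(child, base)


doc = ET.fromstring(
    '<svg xml:base="http://example.org/img/">'
    '<g xml:base="sub/"><rect fill="url(#grad)"/></g>'
    "</svg>"
)
for el, base in effective_bases(doc, "http://example.com/doc.svg"):
    print(el.tag, base)
```

Under this reading, the fill reference on the rect element would be 
resolved against the base in effect at that element, not against the 
URI the document was loaded from.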

Boris brought up two examples where WebKit gets this wrong:

i) the base URI is properly computed for references like a/@href or 
img/@src, but not for SVG fill styles.

ii) it appears that when processing references from SVG content, WebKit 
*only* inspects the fragment identifier and completely ignores the rest 
of the reference.

(I believe Boris has raised WebKit bug reports for those.)

So these are issues with the quality of implementations, but they might 
be caused by the media type definitions not being clear enough about 
what's supposed to happen.


b) The second paragraph of 4.4 can be read as if it's not necessary to 
retrieve the referenced resource

"When a same-document reference is dereferenced for a retrieval action, 
the target of that reference is defined to be within the same entity 
(representation, document, or message) as the reference; therefore, a 
dereference should not result in a new retrieval action."

This is nice for cases where the URI of the entity actually is the same 
as the base URI. However, if the base URI was actually *changed*, a new 
retrieval operation *is* necessary; at least, this seems to be what HTML 
UAs do here (note that even if the retrieved entity is octet-for-octet 
identical, reloading it discards DOM changes and alters observable 
behavior in the UA, such as events being fired).
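
The case described above can be sketched as follows: a fragment-only 
reference is a same-document reference with respect to the base URI, yet 
its target is not the entity that was actually retrieved. The function 
name compare_reference, the dictionary keys, and the URIs are 
assumptions for illustration; the changed base stands in for something 
like an HTML <base> element.

```python
from urllib.parse import urljoin, urldefrag


def compare_reference(reference, base_uri, retrieval_uri):
    """Compare a reference's target with the base and retrieval URIs.

    Hypothetical helper: fragments are stripped before comparing, and
    no normalization is applied.
    """
    target = urldefrag(urljoin(base_uri, reference)).url
    return {
        # Same-document in the sense of section 4.4 (relative to base).
        "same_document_per_4_4": target == urldefrag(base_uri).url,
        # Whether the target is the entity the UA already holds.
        "matches_retrieved_entity": target == urldefrag(retrieval_uri).url,
    }


result = compare_reference(
    "#frag",
    base_uri="http://example.org/other.html",     # changed base
    retrieval_uri="http://example.com/doc.html",  # what was fetched
)
print(result)
```

When the two results disagree, skipping the retrieval would leave the 
UA showing an entity other than the one the reference identifies.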

I'm not entirely sure whether section 4.4 tries to rule out this 
behavior, but if it does, that does appear to be a problem. Maybe 
this needs to be phrased as "may skip a new retrieval action" instead 
of "should not result in..."?

Best regards, Julian

Received on Friday, 24 June 2011 15:43:53 UTC