Re: xml:base irrelevant to same document reference

The problematic "same document reference" wording in RFC2396 is proposed 
to be revised in the forthcoming revision of that document [1] [2].  With 
this change, I think that using xml:base is entirely consistent with RFC2396.

I have highlighted the relevant changes in citations [3][4].  Note in 
particular that the following:
[[
                 -- An empty reference refers to the current document
                 return (current-document, fragment);
]]
no longer appears in the description of the algorithm for mapping a 
relative URI to a full URI.

#g
--

[1] http://www.apache.org/~fielding/uri/rev-2002/rfc2396bis.html

[2] http://www.apache.org/~fielding/uri/rev-2002/issues.html#017-rdf-fragment
specifically:
[[
action: Roy T. Fielding, 23 May 2003, draft 02:

Removed the special-case treatment of same-document references in favor of a
section that explains that a new retrieval action should not be made if the
target URI and base URI, excluding fragments, match.
]]

[3] Changes to same-document reference text:
Old text:
[[
4.2  Same-document References

A URI reference that does not contain a URI is a reference to the current 
document. In other words, an empty URI reference within a document is 
interpreted as a reference to the start of that document, and a reference 
containing only a fragment identifier is a reference to the identified 
fragment of that document. Traversal of such a reference should not result 
in an additional retrieval action. However, if the URI reference occurs in 
a context that is always intended to result in a new request, as in the 
case of HTML's FORM element, then an empty URI reference represents the 
base URI of the current document and should be replaced by that URI when 
transformed into a request.
]]
-- 
http://www.apache.org/~fielding/uri/rev-2002/draft-fielding-uri-rfc2396bis-00.html#rfc.section.4.2

New text:
[[
4.4  Same-document Reference

When a URI reference occurring within a document or message refers to a URI 
that is, aside from its fragment component (if any), identical to the base 
URI (section 5.1), that reference is called a "same-document" reference. 
The most frequent examples of same-document references are relative 
references that are empty or include only the number-sign ("#") separator 
followed by a fragment identifier.

When a same-document reference is dereferenced for the purpose of a 
retrieval action, the target of that reference is defined to be within that 
current document or message; the dereference should not result in a new 
retrieval.
]]
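
As a rough illustration of the new definition (my own sketch, not text from 
the draft), the same-document test can be written in Python using the 
standard urllib.parse module. The function name is_same_document and the 
example.org URIs are assumptions made up for this example; the base passed 
in is whatever base URI is in effect, for instance one set by xml:base.

```python
from urllib.parse import urldefrag, urljoin


def is_same_document(reference: str, base: str) -> bool:
    """Return True if `reference`, resolved against `base`, is identical
    to `base` aside from any fragment component.

    Hypothetical helper: it follows the wording of the new section 4.4,
    but is not taken from the draft itself.
    """
    target_without_fragment, _ = urldefrag(urljoin(base, reference))
    base_without_fragment, _ = urldefrag(base)
    return target_without_fragment == base_without_fragment


# The base URI here stands in for one established by xml:base.
base = "http://example.org/menu"
print(is_same_document("#snack", base))   # fragment-only reference
print(is_same_document("", base))         # empty reference
print(is_same_document("other", base))    # names a different resource
```

Under this reading, "same document" is decided by comparing against the 
base URI, not by the syntactic form of the reference.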

[4] Changes to URI resolution text:
Old text:
[[
  For each URI reference (R), the following pseudocode describes an 
algorithm for transforming R into its target (T), which is either an 
absolute URI or the current document, and R's optional fragment:

    (R.scheme, R.authority, R.path, R.query, fragment) = parse(R);
       -- The URI reference is parsed into the four components and
       -- fragment identifier, as described in Section 4.3.

    if ((not validating) and (R.scheme == Base.scheme)) then
       -- A non-validating parser may ignore a scheme in the
       -- reference if it is identical to the base URI's scheme.
       undefine(R.scheme);
    endif;

    if defined(R.scheme) then
       T.scheme    = R.scheme;
       T.authority = R.authority;
       T.path      = R.path;
       T.query     = R.query;
    else
       if defined(R.authority) then
          T.authority = R.authority;
          T.path      = R.path;
          T.query     = R.query;
       else
          if (R.path == "") then
             if defined(R.query) then
                T.path  = Base.path;
                T.query = R.query;
             else
                -- An empty reference refers to the current document
                return (current-document, fragment);
             endif;
          else
             if (R.path starts-with "/") then
                T.path = R.path;
             else
                T.path = merge(Base.path, R.path);
             endif;
             T.query = R.query;
          endif;
          T.authority = Base.authority;
       endif;
       T.scheme = Base.scheme;
    endif;

    return (T, fragment);
]]
-- 
http://www.apache.org/~fielding/uri/rev-2002/draft-fielding-uri-rfc2396bis-00.html#rfc.section.5.2

New text:
[[
For each URI reference (R), the following pseudocode describes an algorithm 
for transforming R into its target URI (T):

    -- The URI reference is parsed into the five URI components
    --
    (R.scheme, R.authority, R.path, R.query, R.fragment) = parse(R);

    -- A non-strict parser may ignore a scheme in the reference
    -- if it is identical to the base URI's scheme.
    --
    if ((not strict) and (R.scheme == Base.scheme)) then
       undefine(R.scheme);
    endif;

    if defined(R.scheme) then
       T.scheme    = R.scheme;
       T.authority = R.authority;
       T.path      = remove_dot_segments(R.path);
       T.query     = R.query;
    else
       if defined(R.authority) then
          T.authority = R.authority;
          T.path      = remove_dot_segments(R.path);
          T.query     = R.query;
       else
          if (R.path == "") then
             T.path = Base.path;
             if defined(R.query) then
                T.query = R.query;
             else
                T.query = Base.query;
             endif;
          else
             if (R.path starts-with "/") then
                T.path = remove_dot_segments(R.path);
             else
                T.path = merge(Base.path, R.path);
                T.path = remove_dot_segments(T.path);
             endif;
             T.query = R.query;
          endif;
          T.authority = Base.authority;
       endif;
       T.scheme = Base.scheme;
    endif;

    T.fragment = R.fragment;
]]
-- 
http://www.apache.org/~fielding/uri/rev-2002/draft-fielding-uri-rfc2396bis-03.html#absolutize
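
To make the effect of the new text concrete, here is a minimal Python 
sketch of the revised pseudocode. It is my own transcription, not an 
official implementation: parse() uses the regular expression from RFC 2396 
Appendix B, while merge() and remove_dot_segments() are simplified helpers 
that I am assuming behave as the draft intends. The example.org base URI is 
made up and stands in for one set by xml:base.

```python
import re
from typing import NamedTuple, Optional

# Generic URI-splitting regular expression from RFC 2396, Appendix B.
_URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


class Parts(NamedTuple):
    scheme: Optional[str]
    authority: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


def parse(ref: str) -> Parts:
    """Split a URI reference into its five components (None = undefined)."""
    m = _URI_RE.match(ref)
    return Parts(m.group(2), m.group(4), m.group(5), m.group(7), m.group(9))


def merge(base_path: str, ref_path: str) -> str:
    """Simplified merge: replace the last segment of the base path."""
    if not base_path:
        return "/" + ref_path
    return base_path[: base_path.rfind("/") + 1] + ref_path


def remove_dot_segments(path: str) -> str:
    """Simplified removal of "." and ".." segments from a path."""
    segments = path.split("/")
    out = []
    for seg in segments:
        if seg == ".":
            continue
        if seg == "..":
            # Drop the previous segment, but never the leading "" that
            # represents the root "/".
            if out and not (len(out) == 1 and out[0] == ""):
                out.pop()
            continue
        out.append(seg)
    if segments[-1] in (".", ".."):
        out.append("")  # keep the trailing slash
    return "/".join(out)


def resolve(ref: str, base: str, strict: bool = True) -> str:
    """Transform reference R into target URI T, following the new text."""
    r, b = parse(ref), parse(base)
    scheme = r.scheme
    if not strict and scheme == b.scheme:
        scheme = None  # undefine(R.scheme)
    if scheme is not None:
        authority, path, query = r.authority, remove_dot_segments(r.path), r.query
    else:
        if r.authority is not None:
            authority, path, query = r.authority, remove_dot_segments(r.path), r.query
        else:
            if r.path == "":
                # No special case for the empty reference any more:
                # the base path (and possibly query) is simply reused.
                path = b.path
                query = r.query if r.query is not None else b.query
            elif r.path.startswith("/"):
                path, query = remove_dot_segments(r.path), r.query
            else:
                path, query = remove_dot_segments(merge(b.path, r.path)), r.query
            authority = b.authority
        scheme = b.scheme
    # Recompose T, with T.fragment = R.fragment.
    target = ""
    if scheme is not None:
        target += scheme + ":"
    if authority is not None:
        target += "//" + authority
    target += path
    if query is not None:
        target += "?" + query
    if r.fragment is not None:
        target += "#" + r.fragment
    return target


print(resolve("#snack", "http://example.org/menu"))
print(resolve("", "http://example.org/menu?x=1"))
```

With the old text, both calls would have hit the "return (current-document, 
fragment)" branch; in this sketch a fragment-only reference is resolved 
against the base URI like any other relative reference.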


At 08:50 15/06/03 +0700, James Clark wrote:

>Section 2.14 of the syntax spec says that xml:base is used in resolving a 
>URI reference such as "#snack".  I believe this in violation of RFC 2396. 
>A URI reference such as "#snack" is not in fact a relative URI reference 
>(see the BNF in section 11 of RFC 2396); rather it is a "same document 
>reference". See RFC 2396 section 4.2. This says that a URI reference 
>containing only a fragment identifier is a reference to the identified 
>fragment of the *current* document (i.e. the document in which the 
>fragment identifier occurs). I believe that for consistency with RFC 2396 
>xml:base should not apply to same document references, and also should not 
>apply to rdf:ID.  It seems totally bizarre to me that xml:base should 
>apply to rdf:ID when rdf:ID is not a relative URI.  It also seems bizarre 
>that rdf:ID should define a fragment identifier in a resource other than 
>the one in which it occurs: surely a resource only gets to define its own 
>fragment identifiers.
>
>James

-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9  A131 01B9 1C7A DBCA CB5E

Received on Sunday, 15 June 2003 12:32:03 UTC