Re: Canonicalization xml:base processing

From: Richard Tobin <richard@inf.ed.ac.uk>
Date: Tue, 30 May 2006 17:56:09 +0100 (BST)
To: Konrad Lanz <Konrad.Lanz@iaik.tugraz.at>, Richard Tobin <richard@inf.ed.ac.uk>
Cc: "Grosso, Paul" <pgrosso@ptc.com>, public-xml-core-wg@w3.org
Message-Id: <20060530165609.CA3876D6CB4@macintosh.inf.ed.ac.uk>

I started looking at this and realised that I was likely to make
mistakes just in determining what the base URIs should be in the usual
case where the absolute base URI is available, so I thought I would
check we agree.

Here is your example:

>   <a xml:base="one/two">
>     <b xml:base="//three/four/./five/./../file.xsd">
>       <c xml:base="a.file"/>
>       <d>
>         <e xml:base="#bare-name">
>           <f xml:base=""/>
>           <f1/>
>         </e>
>         <g xml:base="//six/"/>
>       </d>
>       <h xml:base="http://www.iaik.tugraz.at">
>         <i xml:base="/aboutus/people/index.php">
>           <j xml:base="lanz/index.php"/>
>         </i>
>       </h>
>     </b>
>   </a>

Here it is with the base URIs added as an attribute on each element,
assuming that the document's URI is file:///tmp/foo.xml:

  <a base-uri="file:///tmp/one/two" xml:base="one/two">
    <b base-uri="file://three/four/./five/./../file.xsd"
       xml:base="//three/four/./five/./../file.xsd">
      <c base-uri="file://three/four/a.file"
         xml:base="a.file"/>
      <d base-uri="file://three/four/./five/./../file.xsd">
        <e base-uri="file://three/four/#bare-name"
           xml:base="#bare-name">
          <f base-uri="file://three/four/"
             xml:base=""/>
          <f1 base-uri="file://three/four/#bare-name"/>
        </e>
        <g base-uri="file://six/"
           xml:base="//six/"/>
      </d>
      <h base-uri="http://www.iaik.tugraz.at"
         xml:base="http://www.iaik.tugraz.at">
        <i base-uri="http://www.iaik.tugraz.at/aboutus/people/index.php"
           xml:base="/aboutus/people/index.php">
          <j base-uri="http://www.iaik.tugraz.at/aboutus/people/lanz/index.php"
             xml:base="lanz/index.php"/>
        </i>
      </h>
    </b>
  </a>

I was surprised that <b> still has the . and .., but that is right
according to RFC 2396: because the reference starts with // and so
supplies its own authority and absolute path, the dot-segment removal
in step 6 of section 5.2 is not applied to it.  The interpretations of
xml:base="" and xml:base="#fragment" are still in doubt.
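
For anyone wanting to check the undisputed values mechanically, here is
a minimal Python sketch of the RFC 2396 section 5.2 steps. The names
resolve_2396, _split and _merge are made up for illustration. It assumes
the base is an absolute hierarchical URI, and it deliberately refuses
same-document references (xml:base="" and xml:base="#bare-name") rather
than picking one of the disputed interpretations.

```python
import re

# Generic URI-splitting regex from RFC 2396, Appendix B.
_URI = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def _split(uri):
    """Return (scheme, authority, path, query, fragment); None means undefined."""
    m = _URI.match(uri)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def _merge(base_path, ref_path):
    """Step 6 of RFC 2396 section 5.2, sub-steps a to f."""
    # 6a, 6b: base path up to and including its last slash, plus the reference.
    segs = (base_path[: base_path.rfind("/") + 1] + ref_path).split("/")
    last = len(segs) - 1
    # 6c: remove "." segments that are followed by a slash.
    segs = [s for i, s in enumerate(segs) if s != "." or i == last]
    # 6d: a trailing "." segment is removed, keeping the slash before it.
    if segs[-1] == ".":
        segs[-1] = ""
    # 6e: repeatedly remove the leftmost "<segment>/../".
    first = 1 if segs[0] == "" else 0  # skip the empty string before a leading "/"
    i = first
    while i < len(segs) - 2:
        if segs[i] != ".." and segs[i + 1] == "..":
            del segs[i : i + 2]
            i = first
        else:
            i += 1
    # 6f: a trailing "<segment>/.." is removed, keeping the slash before it.
    if len(segs) >= first + 2 and segs[-1] == ".." and segs[-2] != "..":
        segs[-2:] = [""]
    # 6g: any ".." still present is left in place.
    return "/".join(segs)


def resolve_2396(base, ref):
    """Resolve ref against an absolute base URI (sketch, not a full implementation)."""
    b_scheme, b_auth, b_path, _, _ = _split(base)
    scheme, auth, path, query, frag = _split(ref)
    # Step 2: same-document reference; its meaning for xml:base is disputed.
    if scheme is None and auth is None and query is None and path == "":
        raise ValueError("same-document reference: not handled in this sketch")
    if scheme is not None:  # step 3: the reference is already absolute
        return ref
    if auth is None:  # step 4: a network-path reference skips straight to step 7
        auth = b_auth
        if not path.startswith("/"):  # step 5: so does an absolute path
            path = _merge(b_path, path)  # step 6: relative path
    # Step 7: recombine the components.
    out = b_scheme + ":"
    if auth is not None:
        out += "//" + auth
    out += path
    if query is not None:
        out += "?" + query
    if frag is not None:
        out += "#" + frag
    return out


doc = "file:///tmp/foo.xml"
a = resolve_2396(doc, "one/two")
b = resolve_2396(a, "//three/four/./five/./../file.xsd")
c = resolve_2396(b, "a.file")
print(a, b, c, sep="\n")
```

The three printed values correspond to the base-uri attributes on <a>,
<b> and <c> above: <b> keeps its . and .. because step 6 is never
reached, while <c> loses them because its relative path is merged with
<b>'s path and then normalised.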

Do you agree with the above base URIs?

-- Richard
Received on Tuesday, 30 May 2006 16:56:34 GMT
