RE: XML Base and XPath absolutizing of URIs

Hi Jonathan,

-----Original Message-----
From: Jonathan Borden [mailto:jborden@mediaone.net]
Sent: Thursday, June 08, 2000 4:49 PM
To: John Boyer
Cc: xml-uri@w3.org
Subject: Re: XML Base and XPath absolutizing of URIs


John Boyer wrote:
> ... Since XPath is in violation of the
> namespaces spec anyway for trying to absolutize URIs, the feature should be
> removed by an erratum.  Alternately, XPath could be modified by an erratum
> to indicate either that the base URL is provided by XBase or as an
> additional component of the evaluation context.
>
> One way or the other, something about XPath needs to be changed.

Is this because the XPath function namespace-uri() is defined to return the
"expanded name" which is the absolutized URI?

<john>
Actually, no, the return value of the namespace-uri() function is not what is
prompting my statements.  Check out XPath Section 5, the Data Model, and look
at the paragraph directly below the first 'NOTE:'.
</john>

Why is this a violation of the XML Namespace REC? All it means is that the
literal string namespace name is not always what the namespace-uri()
function returns.

<john>
As you can see from the aforementioned paragraph, it also means that the
string-value of a namespace node is the absolutized URI, not the string
literal namespace name.  That is what I think is in violation of the
namespaces REC.  It is alright to provide the absolute URI as peripheral
information, but I would prefer the string-value to be the literal string
read in from the document because that's what I would prefer to write back
out.
</john>

Under the current situation, as I see it, the infoset (and DOM2 ought to as well, no?)
'contains' both the baseURI and namespace name, from which an implementation
of the namespace-uri() function could construct an absolutized URI. (Perhaps
under the DOM2 definition the base URI is retrievable via the "xml:base"
attribute).

<john>
According to the latest infoset draft (20 Dec. 1999), the namespace
information item is not required to include the literal string appearing in
input.
</john>

The fact that XPath specifies URI absolutization may indeed affect XPath
processing, yet this does not invalidate the processing, indeed this is the
expected behavior.

<john>
I am defining a c14n spec based on XPath.  However, take XPath out of the
picture and consider c14n based solely on infoset.  The reason for this is
that this is not an XPath problem, it is an infoset problem. The XPath data
model is a subset of infoset that is sufficient for the purposes of c14n,
but if infoset is broken, then this naturally affects XPath too.

So, given a c14n based on an absolutizing infoset, here's what breaks:

Consider a document containing relative URIs in its namespace declarations.
Furthermore, suppose your application is capable of retaining the original
relative URIs in this document.  Furthermore, suppose you intend to have a
user digitally sign the document.  As part of this, you must canonicalize
the document so it can be fed to a digest algorithm, but suppose that your
c14n uses an infoset that absolutizes the relative namespace URIs.

The signed digest contains absolute URIs that may, depending on the base
URI, refer to physical files on your computer.  Now you send the signed
document to me.  Because your overarching application retained the relative
URIs, when I go to verify your signature, it breaks because, in the c14n
that I calculate, the relative URIs are now converted to absolutes based on
MY computer, so the digest I compute will not be equal to the one stored in
the SignatureValue.
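
A minimal sketch of this failure, in Python.  Everything named here is an
assumption made for illustration: the relative namespace name
"schemas/order", the two file: base URIs, and the toy_c14n() function, which
is a stand-in for a real canonicalization and only serializes one element.
SHA-1 is used simply as an example digest algorithm.

```python
import hashlib
from urllib.parse import urljoin

# Hypothetical relative namespace name as it appears in the document.
RELATIVE_NS = "schemas/order"

# Hypothetical base URIs: where the same document lives on each machine.
SIGNER_BASE = "file:///home/signer/docs/form.xml"
VERIFIER_BASE = "file:///home/verifier/inbox/form.xml"


def toy_c14n(namespace_name: str) -> bytes:
    """Stand-in for a canonical form: one element with one namespace declaration."""
    return f'<doc xmlns="{namespace_name}"></doc>'.encode("utf-8")


def digest(data: bytes) -> str:
    """Hex SHA-1 digest of the canonical bytes."""
    return hashlib.sha1(data).hexdigest()


# Case 1: the c14n absolutizes the namespace name against the local base URI.
signer_abs = digest(toy_c14n(urljoin(SIGNER_BASE, RELATIVE_NS)))
verifier_abs = digest(toy_c14n(urljoin(VERIFIER_BASE, RELATIVE_NS)))

# Case 2: the c14n keeps the literal string read from the document.
signer_lit = digest(toy_c14n(RELATIVE_NS))
verifier_lit = digest(toy_c14n(RELATIVE_NS))

print("absolutized digests match:", signer_abs == verifier_abs)
print("literal digests match:    ", signer_lit == verifier_lit)
```

Under these assumptions only the literal form gives both parties the same
digest; the absolutized form depends on where each copy of the document
happens to be stored.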

On the other hand, if the infoset used by c14n guaranteed relative URIs,
then the signature wouldn't break.  Moreover, if the document indicated by
the relative namespace URI does in fact have information value w.r.t.
interpreting the document containing the relative namespace URI, then the
signature should include the digest of the document indicated by the
relative namespace URI, which would cause the signature to break only if the
document at the indicated location actually changed.
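
As a rough sketch of that idea (not the actual XML Signature processing
model), the signed value could cover both the canonical document, with its
relative namespace name left as written, and a digest of the content the
namespace name points at.  The function signed_digest() and the sample
byte strings are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-1 digest of some bytes."""
    return hashlib.sha1(data).hexdigest()


def signed_digest(canonical_doc: bytes, namespace_doc: bytes) -> str:
    """Digest covering the document and the content its namespace URI indicates."""
    combined = digest(canonical_doc) + digest(namespace_doc)
    return digest(combined.encode("ascii"))


# Canonical document with the relative namespace name kept literally.
doc = b'<doc xmlns="schemas/order"></doc>'

# Hypothetical content of the namespace-qualifying document.
ns_doc_original = b"<schema>v1</schema>"
ns_doc_modified = b"<schema>v2</schema>"

signed = signed_digest(doc, ns_doc_original)

# Same bytes retrieved from a different location: nothing location-specific
# enters the digest, so the result is unchanged.
verified_elsewhere = signed_digest(doc, ns_doc_original)

# The namespace-qualifying document itself changed: the digest changes.
verified_after_change = signed_digest(doc, ns_doc_modified)

print("same content, other location:", signed == verified_elsewhere)
print("content changed:             ", signed == verified_after_change)
```

The point of the design is that the digest depends on what the referenced
document says, not on the form of the URI used to reach it.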

In other words, if a document that qualifies a namespace is given by a
relative namespace URI in a second document, what matters to interpreting
the second document is the namespace qualifying document, not the form of
its URI.  Absolutizing is therefore not solving the problem of nailing down
the additional data needed to interpret the document because the data at the
absolute location could change.  Since absolutizing also causes the signature
breakage described above, it
should be eliminated.  Alternately, dsig could recommend that relative
namespace URIs not be used in documents that are to be signed.

In conclusion, though, I prefer XPath's choice to always absolutize versus
the current infoset's choice to leave it open to be either way.  Every
choice to leave it open to the underlying application-specific
implementation makes it that much harder to create a canonical form that
can, for example, be fed to a digest algorithm.

John Boyer
Software Development Manager
PureEdge Solutions Inc. (formerly UWI.Com)
Creating Binding E-Commerce
jboyer@PureEdge.com

</john>

Jonathan Borden

Received on Thursday, 8 June 2000 21:39:46 UTC