Possible solution for XML Base problem from John Boyer on 2000-07-28 (www-xml-linking-comments@w3.org from July to September 2000)

From: John Boyer <jboyer@PureEdge.com>
Date: Fri, 28 Jul 2000 13:04:53 -0700
To: "Kevin Regan" <kevinr@valicert.com>, "Jonathan Marsh" <jmarsh@microsoft.com>, <www-xml-linking-comments@w3.org>
Cc: <w3c-ietf-xmldsig@w3.org>
Message-ID: <BFEDKCINEPLBDLODCODKEEAICEAA.jboyer@PureEdge.com>

Thanks, Kevin.

I had suspected that it is impossible for most XML processors to distinguish
content obtained from external entities due to section 4.4.2 of XML 1.0.

However, I've also given some additional thought to this whole problem of
having c14n change the base URL when it pulls external entities into the
document, and I now feel that I need further convincing that this is a bad
thing.

I think that c14n does the right thing with respect to the XML 1.0 default
base URI.  Moreover, this consideration leads me to believe that
preservation of xml:base in canonicalized documents is faulty.  Obviously
this is not good for XML Base.

It seems to me that relative URIs related to XML syntax get used and
discarded (e.g. SystemLiteral) or replaced by absolute URIs (e.g. XPath
handling of namespaces).  As long as XPath and hence XSLT implementations
respect the XML 1.0 requirement to give a different base URI to content
derived from external entities, then everything should work fine.  Based on
Kevin's feedback, I suspect there may be implementations that do not
conform, but at least the specifications are consistent, so hopefully the
implementations will be fixed.

As for content that is defined to be a relative URI by an application, the
content is treated like opaque character data.  No generic XML specification
(like C14N or XSLT) can possibly know that the data is supposed to be a
relative URI.  Therefore, if a generic XML application (e.g. C14N) generates
internal content from an external entity reference, then having the default
base URI for that internal content become that of the new document is
actually the correct behavior.  In the case of C14N, if the output is
consumed by an application that can process the canonicalized XML, the base
URI is in fact set properly for that application to be able to use its
relative URIs.
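
As a rough illustration of that shift in default base URI, here is a small
Python sketch.  The URIs and variable names are made up for the example and
do not come from any specification; it only shows ordinary relative URI
resolution against two different base URIs.

```python
from urllib.parse import urljoin

# Hypothetical locations, chosen only for illustration.
ENTITY_URI = "http://example.com/parts/chapter.xml"        # external entity
CANONICAL_URI = "http://example.com/signed/doc-c14n.xml"   # canonical output

# Application data that happens to be a relative URI.  To a generic
# processor such as C14N this is just opaque character data.
relative = "figures/one.png"

# Before canonicalization the content lives in the external entity, so the
# entity's own URI is the default base.
before = urljoin(ENTITY_URI, relative)

# After canonicalization the content is internal to the canonical document,
# so that document's URI is the default base.
after = urljoin(CANONICAL_URI, relative)

print(before)
print(after)
```

An application that consumes the canonical form would resolve against the
second base, which is the behavior argued for above.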

The only place where we run into trouble is with the actual use of xml:base.
If I have an element E containing a relative URI that was formerly obtained
by replacing an external entity reference, and if the replacement occurs
within an element A that has an xml:base, then the xml:base will apply to E.
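
To make the scoping concrete, here is a minimal Python sketch using the
standard library.  The document, the element names A and E, the href
attribute, and the helper effective_base are all assumptions made for
illustration; this is not how any particular c14n implementation works.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def effective_base(root, target, doc_base):
    """Return the base URI in scope at `target`, honoring xml:base on the
    target and its ancestors (hypothetical helper for this example)."""
    def walk(node, base):
        # An absent xml:base leaves the inherited base unchanged.
        base = urljoin(base, node.get(XML_BASE, ""))
        if node is target:
            return base
        for child in node:
            found = walk(child, base)
            if found is not None:
                return found
        return None
    return walk(root, doc_base)

# E stands for content that replaced an external entity reference inside A.
DOC = '<doc><A xml:base="http://example.org/a/"><E href="img.png"/></A></doc>'
DOC_BASE = "http://example.com/docs/main.xml"  # assumed document location

root = ET.fromstring(DOC)
a = root.find("A")
e = root.find("A/E")

# With xml:base on A, the relative URI in E resolves against A's base.
with_base = urljoin(effective_base(root, e, DOC_BASE), e.get("href"))

# With xml:base omitted, E falls back to the document's own base URI.
del a.attrib[XML_BASE]
without_base = urljoin(effective_base(root, e, DOC_BASE), e.get("href"))

print(with_base)
print(without_base)
```

The two results differ only because of the xml:base on A, which is the
interaction described in the preceding paragraph.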

The xml:base supposedly indicates the original source document.  If it does
not, then it is incorrectly set.  Once the document has been canonicalized,
though, the source document is the canonical form, so the xml:base should be
removed until a permanent home for the canonical form is found.

In short, XML Base is used for input, whereas c14n is about output.  Please
give some thought to a counterexample to the idea of omitting xml:base from
the c14n output.

Thanks,
John Boyer
Development Team Leader,
Distributed Processing and XML
PureEdge Solutions Inc.
Creating Binding E-Commerce
v: 250-479-8334, ext. 143  f: 250-479-3772
1-888-517-2675   http://www.PureEdge.com <http://www.pureedge.com/>




-----Original Message-----
From: w3c-ietf-xmldsig-request@w3.org
[mailto:w3c-ietf-xmldsig-request@w3.org]On Behalf Of Kevin Regan
Sent: Thursday, July 27, 2000 2:31 PM
To: John Boyer; Jonathan Marsh; www-xml-linking-comments@w3.org
Cc: w3c-ietf-xmldsig@w3.org
Subject: RE: DSig comments on XML Base



I have no access to the XML processor.  My library receives DOM Document
and Element objects when creating the signature.  When verifying the
signature, I delegate to the user to find and parse the URI.  I'm not
sure I totally understand the discussion that has taken place, but
I would say that I have no way of distinguishing which parts of
the source came from external entities, and forcing the user to
structure the DOM subtree that I am signing in a particular way is
a big no-no.  Currently, the only thing that my library requires
is that the user hand me a DOM Document or Element that has been
parsed with a validating parser (with ignorable whitespace not
included).

--Kevin

-----Original Message-----
From: John Boyer [mailto:jboyer@PureEdge.com]
Sent: Thursday, July 27, 2000 1:59 PM
To: Jonathan Marsh; www-xml-linking-comments@w3.org
Cc: w3c-ietf-xmldsig@w3.org
Subject: RE: DSig comments on XML Base


Hi Jonathan,

OK, that makes some sense.  What you're saying is that we should have c14n
extend the XPath data model by adding an xml:base to the top-level element
of external entities.

This must be done by modifying the XML processor that generates the
node-set.  I wonder how easy this is for implementers.

I agree with you that trying to read between the lines on XML 1.0 is a waste
of time, but I disagree with the implication that this is what I'm doing.
There are quite specific lines that tell an XML processor developer that
they need not distinguish between content derived within the document versus
content derived externally.

So, TAMURA Kent, Kevin Regan and others: could you please let us know if you
can do this?  If so, then I'd like to do what you suggest, Jonathan, then
place a note about the residual problem with base URI for top-level PIs.

Thanks,
John Boyer
Development Team Leader,
Distributed Processing and XML
PureEdge Solutions Inc.
Creating Binding E-Commerce
v: 250-479-8334, ext. 143  f: 250-479-3772
1-888-517-2675   http://www.PureEdge.com <http://www.pureedge.com/>
Received on Friday, 28 July 2000 16:05:03 UTC