xml:base as a PI

Recently I have been considering the so-called canonicalization problem, in
which an XML document is read and a new 'canonical' representative of the
document is created.

There is a problem with base URIs for content derived from external
entity references.  The canonical form, which is the output of the c14n
algorithm, is a single XML document that does not contain any external
entity references.  If the canonical form is then read as input, the
content that came from external entities in the originating document is
now simple internal content.  Thus, whatever base URI is associated with
the canonical form is also associated with that content.  This is likely
to be inappropriate, since relative URIs in that content would then
resolve against the wrong base.
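
To make the problem concrete, here is a small sketch in Python.  The URIs
are made up for illustration (they are not from any real document), and
the sketch only shows ordinary relative URI resolution, not c14n itself:

```python
from urllib.parse import urljoin

# Hypothetical locations, chosen only for illustration.
entity_base = "http://example.org/entities/chapter.xml"   # the external entity
canonical_base = "http://example.com/out/canonical.xml"   # the canonical form

# A relative reference that appears inside the external entity's content.
relative_ref = "images/fig1.png"

# What the author of the entity meant.
intended = urljoin(entity_base, relative_ref)

# What a reader of the canonical form would compute instead.
after_c14n = urljoin(canonical_base, relative_ref)

print(intended)
print(after_c14n)
```

The two results differ, which is the sense in which the base URI of the
canonical form is inappropriate for the inlined content.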

The dsig group recently opted to make no changes to address this problem,
in part because we have no solutions that are free from problems.  We are
also more concerned with making sure that no unauthorized changes have been
made to the originating document than with whether the canonical form is
actually operational.

However, from the c14n purist perspective, it would be helpful if some
solution to this problem existed.  I believe that part of the solution is to
provide xml:base as a PI, or rather as two PIs.  I'm sure there will be
technical details that will need to be worked out, but hopefully there are
no impossibly tall buildings to leap.

Those wishing to create reusable external entities could consider wrapping
the entity content in a pair of xml:base PIs, e.g.

<?xml-base-start uri="http://www.w3.org"?>

the content

<?xml-base-end?>

In the long run, I think that the infoset could reflect these PIs even if
they were not declared explicitly.  This would be far superior, but
naturally implies that the PIs must be able to nest, since an external
entity can itself reference other external entities.
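
As a rough sketch of how a consumer might honor such PIs, including
nesting, the Python below keeps a stack of base URIs while parsing with
the standard xml.sax module.  The PI names follow the example above; the
class name BaseTracker, the function effective_bases, and the sample
URIs are hypothetical, and nothing here is part of any specification:

```python
import re
import xml.sax


class BaseTracker(xml.sax.ContentHandler):
    """Records the base URI in effect at each element start tag."""

    def __init__(self, document_base):
        super().__init__()
        self.stack = [document_base]   # innermost base URI is last
        self.seen = []                 # (element name, base URI) pairs

    def processingInstruction(self, target, data):
        if target == "xml-base-start":
            # Assumed pseudo-attribute syntax: uri="..."
            match = re.search(r'uri="([^"]*)"', data)
            # Push even without a uri so start/end PIs stay balanced.
            self.stack.append(match.group(1) if match else self.stack[-1])
        elif target == "xml-base-end":
            if len(self.stack) > 1:    # never pop the document's own base
                self.stack.pop()

    def startElement(self, name, attrs):
        self.seen.append((name, self.stack[-1]))


def effective_bases(xml_text, document_base):
    handler = BaseTracker(document_base)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.seen


sample = (
    '<doc><a/>'
    '<?xml-base-start uri="http://www.w3.org"?><b/>'
    '<?xml-base-start uri="http://example.org/inner"?><c/><?xml-base-end?>'
    '<d/><?xml-base-end?>'
    '<e/></doc>'
)
bases = effective_bases(sample, "http://example.com/canonical.xml")
```

In this sketch, elements outside any PI pair keep the document's own base,
while elements inside a pair get the innermost enclosing uri.  A stack is
used precisely because the PIs need to nest.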

Finally, once the infoset reflected the proper XML base, a version of
XPath based on this infoset would have to come out, at which point c14n
would work a whole lot better.


     John Boyer
      Development Team Leader,
      Distributed Processing and XML
      PureEdge Solutions Inc.
      Creating Binding E-Commerce
      v: 250-479-8334, ext. 143  f: 250-479-3772
      1-888-517-2675   http://www.PureEdge.com

Received on Thursday, 10 August 2000 20:18:36 UTC