Difficulties with URI="" and IDREF

It would seem that without canonicalization, the interpretation of IDREF and
of URI="" can change from application to application.

When an application reads an XML document, it will store different material
based on whether or not the XML processor is validating, whether it
understands namespaces, and how it decides to store attributes (whose order
is application-dependent).

When one uses an IDREF, I suppose this immediately implies a validating
processor since ID's aren't supposed to work for non-validating processors.
However, the other problems still exist.  Within the identified element, how
should the text be rendered for the digest algorithm?  One change of
attribute order destroys a signature.

Also, IDREF is usually used in conjunction with URI="".  URI="" is supposed
to indicate the root of *this* document, but there is still not enough
information to tell us how to generate the byte stream that will be
digested.  Fortunately, URI="" cannot be used alone since such a signature
would break as soon as the signature value is added to the document.  URI=""
must be used in conjunction with either IDREF or an XPath transform.

The XPath transform has already been fixed by requiring c14n as part of its
processing.  IDREF must either be similarly fixed, or simply throw out IDREF
as being a redundant way of doing a fairly simple XPath transform.

Finally, note that URI="" could be used in conjunction with a transform
other than XPath, such as XSLT.  The XSLT transform has not been modified
yet to require canonicalization.  Thus, URI="" may feed slightly different
data to an XSLT depending on the application.  We need to fix this.

John Boyer
Software Development Manager
UWI.Com -- The Internet Forms Company

Received on Monday, 3 January 2000 13:54:47 UTC