- From: Elliotte Harold <elharo@metalab.unc.edu>
- Date: Mon, 24 Jan 2005 11:51:37 -0500
- To: public-xml-id@w3.org
I just noticed a major conceptual mismatch between canonical XML and xml:id. The problem occurs when calculating the canonical form of a document subset. The issue is that the nearest attributes in the XML namespace are added to elements in the subset when the ancestor elements that originally provided those attributes are not themselves part of the subset. For instance, consider this document:

    <root xml:id="p1">
      <child />
    </root>

Now suppose we canonicalize this document with the XPath expression //child to select a subset. Then the resulting canonical form is:

    <child xml:id="p1"></child>

Worse yet, suppose we start with this input document and use the same XPath expression:

    <root xml:id="p1">
      <child />
      <child />
      <child />
    </root>

What comes out is:

    <child xml:id="p1"></child>
    <child xml:id="p1"></child>
    <child xml:id="p1"></child>

Duplicate IDs!

I think the canonical XML spec clearly intended that all attributes in the XML namespace have scope over their descendants, but that's not really true for xml:id. An ID identifies exactly one element, so it makes no sense for descendants to inherit it. This probably has downstream implications for XML digital signatures and XML encryption, both of which depend on canonicalization. Exclusive XML canonicalization does not inherit xml: attributes, and so does not have this problem.

I am not sure what to suggest as a fix. It is still possible to canonicalize a document that uses xml:id. However, the results could be quite unexpected and perhaps dangerous. I wish I had a good answer here. I don't. I do think this should be discussed, and whatever resolution is reached needs to be called out in the spec to warn people about this.

--
Elliotte Rusty Harold  elharo@metalab.unc.edu
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
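
For anyone who wants to reproduce the effect without a full canonicalizer, here is a minimal Python sketch. It is not a Canonical XML implementation: it only simulates the one rule at issue, copying the nearest xml: attributes from ancestors onto each selected element. It assumes that every ancestor of a selected element is omitted from the subset, as is the case for //child in the examples above. The helper names inherited_xml_attrs and subset_ids are hypothetical, invented for this illustration.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_ID = "{" + XML_NS + "}id"


def inherited_xml_attrs(elem, parent_of):
    """Collect xml:* attributes from the ancestors of elem, nearest first.

    Simulation only: assumes every ancestor is omitted from the subset.
    """
    inherited = {}
    ancestor = parent_of.get(elem)
    while ancestor is not None:
        for name, value in ancestor.attrib.items():
            if name.startswith("{" + XML_NS + "}"):
                # setdefault keeps the value from the nearest ancestor
                inherited.setdefault(name, value)
        ancestor = parent_of.get(ancestor)
    return inherited


def subset_ids(doc_text, tag):
    """Return the xml:id each selected element would carry after inheritance."""
    root = ET.fromstring(doc_text)
    # ElementTree has no parent pointers, so build a child -> parent map
    parent_of = {child: parent for parent in root.iter() for child in parent}
    ids = []
    for elem in root.iter(tag):
        attrs = inherited_xml_attrs(elem, parent_of)
        attrs.update(elem.attrib)  # the element's own attributes take precedence
        ids.append(attrs.get(XML_ID))
    return ids


doc = '<root xml:id="p1"><child/><child/><child/></root>'
ids = subset_ids(doc, "child")
print(ids)
print("duplicate IDs:", len(ids) != len(set(ids)))
```

With the three-child document, every selected element ends up with the ID that belonged to the omitted root, which is the duplication described above. An element that declares its own xml:id keeps it, since its own attributes override inherited ones.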
Received on Monday, 24 January 2005 16:51:40 UTC