- From: Elliotte Harold <elharo@metalab.unc.edu>
- Date: Sat, 12 Feb 2005 16:58:49 -0500
- To: veillard@redhat.com
- CC: www-tag@w3.org, public-xml-core-wg@w3.org, public-xml-id@w3.org
Daniel Veillard wrote:

> Now where is the problem exactly ?
> From an XML-1.0 + Namespace point of view the serialized fragment obtained
> that way is still perfectly okay (i.e. a well balanced chunk), the only
> problem which may arise are:
> 1/ layers implementing xml:id will raise an error, however this is
>    not a fatal error (see http://www.w3.org/TR/xml-id/#errors)
>    xml:id processors are just instructed to report the duplicate ID
>    error to the application using it
> 2/ XPath pointers to that fragment can be disrupted

The issue for me is a little different. It's that someone can deliberately place an ID on an element, and the process of canonicalization can move that ID to a different element. That it may move the ID to several elements is even funkier, but even if it could only move it to a single different element, it would be a problem.

"id" stands for identifier. The value of this attribute is supposed to uniquely identify not just any element, but a particular element. This identification can be used in many contexts: XPath, XPointer, XSLT, DOM, all sorts of custom written programs, and more. I claim that any process that, as an unintended side effect, moves IDs from one element to a different element, is deeply flawed.

For instance, somebody may use the sequential IDs cc1, cc2, cc3 and so forth to find all the credit_card elements in a document. Canonicalization of xml:id could move those IDs onto person elements or expiration date elements, or something else. I can't begin to imagine all the different ways this could cause trouble.

The problem is simply that IDs can unexpectedly move from the element they are intended for to an element that they were not intended for. This will cause applications to choose the wrong elements.
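The movement described above can be sketched in a few lines of Python. This is only a simplified simulation, not a real canonicalizer: the helper `inherit_xml_attrs` and the sample document are hypothetical, and the sketch assumes the inclusive-canonicalization behavior under discussion, in which an element serialized without its ancestors picks up their attributes from the `xml:` namespace.

```python
import copy
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_ID = "{%s}id" % XML_NS


def inherit_xml_attrs(root, apex):
    """Return a copy of `apex` carrying the xml:* attributes of its ancestors.

    Hypothetical helper: it imitates only the attribute inheritance step,
    not the rest of canonicalization.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    inherited = {}
    node = apex
    while node in parent_of:
        node = parent_of[node]
        for name, value in node.attrib.items():
            if name.startswith("{%s}" % XML_NS):
                # The nearest ancestor wins, so never overwrite.
                inherited.setdefault(name, value)

    result = copy.deepcopy(apex)
    result.tail = None
    for name, value in inherited.items():
        # An xml:* attribute already on the element itself is kept.
        result.attrib.setdefault(name, value)
    return result


# Sample document (made up for illustration): the ID is on person only.
doc = ET.fromstring(
    '<person xml:id="p1">'
    "<credit_card><expires>2006-01</expires></credit_card>"
    "<address>Somewhere</address>"
    "</person>"
)

card = inherit_xml_attrs(doc, doc.find("credit_card"))
address = inherit_xml_attrs(doc, doc.find("address"))
print(card.get(XML_ID), address.get(XML_ID))
```

In this sketch the ID that was placed on person ends up on both the credit_card and the address copies, so anything that later looks up that ID in those fragments finds an element the author never labeled.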
How that affects any given application will vary from one application to the next, of course; but I can't help but think that some of the applications will have really major, potentially disastrous problems as a result of IDs unexpectedly moving following the process of canonicalization.

While we can call out the potential problems in the spec, and warn people who use xml:id to only use exclusive canonicalization, I fear that someone is going to be receiving documents they did not write that use xml:id, and processing them with tool chains that have not been updated. In other words, they may never have even looked at the xml:id spec but nonetheless be affected by this problem.

I really think we need to eliminate the problem at the source by replacing xml:id with some attribute that does not have this unintended interaction with canonicalization.

--
Elliotte Rusty Harold  elharo@metalab.unc.edu
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
Received on Saturday, 12 February 2005 21:58:52 UTC