- From: John Boyer <boyerj@ca.ibm.com>
- Date: Mon, 6 Mar 2006 07:58:18 -0800
- To: daniel@veillard.com
- Cc: "Henry S. Thompson" <ht@inf.ed.ac.uk>, public-xml-core-wg@w3.org, public-xml-core-wg-request@w3.org
- Message-ID: <OF46EEFA53.8AB8F323-ON88257129.0055B3A8-88257129.0057BCE7@ca.ibm.com>
Hi Daniel and Henry,

What I meant to clear up in the prior email was that the inheritance rule is a good thing that *never* produces the wrong result when it is actually invoked. So, hopefully this will assuage your concerns about retaining the rule for xml:base.

The problem is that the core team observed that there are cases involving the use of relative URIs in xml:base in which the omission of an intervening xml:base causes the expressed xml:base attributes in the portion of the document being retained to have altered meaning.

The fact is that you simply cannot save a document author from himself when it comes to omission. I personally believe that although it is *possible* to use an xpath filter to orphan an element, it should just never be done in practice because too much semantics are typically associated with the ancestors of an element. The loss of fragments of relative URI paths in xml:base is but one example of this.

From the example of the prior email, it is easy to see that c as a child of a may mean something completely different than c as a child of b. There is nothing that can be done to protect authors from this kind of information loss if they don't understand this aspect of their schema.

But again, it's not a security problem that arises *because* of the inheritance rule. It is an orthogonal security problem, and an extreme edge case, that authors could experience if they *express* an xml:base (non-inherited) on a node *and* it is orphaned by a filter *and* the xml:base contains a relative URI.

While the inheritance rule has nothing to do with addressing this problem (whether it should be addressed notwithstanding), the inheritance rule does remove a certain number of other security issues, so there is certainly no harm in retaining it.

As to whether it should be addressed or not, the issue remains that had base not been added to the xml namespace, we wouldn't even be having this discussion.
In other words, there are lots of relative URIs used in XML vocabularies (e.g. the src attribute), and even if we were to attempt a fix for xml:base, it does not protect the document author from loss of ancestor information.

Best regards,
John M. Boyer, Ph.D.
Senior Product Architect/Research Scientist
Co-Chair, W3C XForms Working Group
Workplace, Portal and Collaboration Software
IBM Victoria Software Lab
E-Mail: boyerj@ca.ibm.com
http://www.ibm.com/software/
Blog: http://www.ibm.com/developerworks/blogs/boyer

Daniel Veillard <daniel@veillard.com>
Sent by: public-xml-core-wg-request@w3.org
03/06/2006 01:36 AM
Please respond to daniel

To: "Henry S. Thompson" <ht@inf.ed.ac.uk>
cc: John Boyer/CanWest/IBM@IBMCA, public-xml-core-wg@w3.org
Subject: Re: Appling inheritance rule to xml:base, was Re: FINAL minutes for the XML

On Mon, Mar 06, 2006 at 02:39:48AM +0000, Henry S. Thompson wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> I wasn't at the f2f, for which apologies, but I find myself made
> uneasy by the proposal to retain 'inheritance' of xml:base. As you

Same here. It means silent breakage of the document at canonicalization time, this must be avoided.

> say, this doesn't always give the 'right' results. What I find
> frustrating is that it's easy to state a strategy which _would_ always
> give the 'right' answer, namely:
>
>  "Use the name *EII* for an element information item to be
>   canonicalized, and *EIIC* for the element information item
>   corresponding to *EII* in the result of parsing the canonical
>   serialization of the node-set containing *EII*.
>
>  "Synthesize an xml:base attribute for *EII* iff the *EIIC*'s [base
>   URI] would otherwise be different from *EII*'s [base URI]."
>
> This has the advantage that not only does it correctly produce
>
>  <a xml:base="http://example.org">
>   <c xml:base="test"/>
>  </a>
>
> from
>
>  <a xml:base="http://example.org">
>   <b xml:base="test">
>    <c/>
>   </b>
>  </a>
>
> when <b>...</b> is filtered out, but it will _also_ correctly produce
>
>  <a xml:base="http://example.org">
>   <c xml:base="http://example.org/test/test"/>
>  </a>
>
>
> from
>
>  <a xml:base="http://example.org">
>   <b xml:base="test">
>    <c xml:base="test"/>
>   </b>
>  </a>
>
> when <b>...</b> is filtered out.

Note: I'm not sure the examples really convey what they should. If in the example we used <b xml:base="test/a"> then the composition would lead to a test/test base on c, but as written I do think the composition is still 'test' in the result. I assume John didn't try to apply the computation from RFC2396 [*] manually, or maybe the examples we had at the f2f were misleading or broken. If all xml:base values reference resources in the same "directory", basically the composition problem doesn't appear in practice.

[*] or later

Hum, assume you don't have a fixed base on a, then you force generating a base depending on the document base, which means suddenly canonicalization of a document depends on how you retrieved it (e.g. a file access would end up with file:///localpath/test/test while from a web access you would get http://example.org/test/test). I don't think it's acceptable either.

> Can't we come up with a way to get this effect?

I definitely prefer a solution leading to a false negative, i.e. we fail to canonicalize in the same way, over a situation leading to a false positive, where the canonicalization yields a broken result. We already discussed in the past, especially with Richard, generating relative xml:base when possible; maybe we need to formalize this and put it as the algorithm to compute the canonicalized result.
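The composition discussed above can be checked mechanically. What follows is only a rough sketch in Python, using the standard library's urllib.parse.urljoin, whose resolution rules should agree with RFC 2396 for inputs this simple. The helper name effective_base and the two document locations (file:///localpath/doc.xml and http://example.org/doc.xml) are assumptions made up for illustration; they do not come from the thread.

```python
from urllib.parse import urljoin


def effective_base(chain, document_base=""):
    """Resolve xml:base values, outermost element first.

    `chain` lists the xml:base attribute values met while walking from
    the root down to the element of interest. `document_base` is the URI
    the document was retrieved from (hypothetical, may be empty).
    """
    base = document_base
    for value in chain:
        base = urljoin(base, value)
    return base


# Second example as written: a, b and c all carry an xml:base.
full = effective_base(["http://example.org", "test", "test"])
# Same example after <b> is filtered out: only a and c remain.
orphaned = effective_base(["http://example.org", "test"])

# Variant from the note: b carries "test/a" instead of "test".
full_variant = effective_base(["http://example.org", "test/a", "test"])

# No absolute xml:base on a: the result follows the retrieval URI.
from_file = effective_base(["test/a", "test"], "file:///localpath/doc.xml")
from_web = effective_base(["test/a", "test"], "http://example.org/doc.xml")

print(full)          # http://example.org/test
print(orphaned)      # http://example.org/test
print(full_variant)  # http://example.org/test/test
print(from_file)
print(from_web)      # http://example.org/test/test
```

With the example as written, c has the same base URI before and after b is removed, so the rule quoted above would have nothing to synthesize. Only the "test/a" variant changes the base of c when b is dropped, and the last two calls show how the outcome then depends on where the document was loaded from.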
Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ |
Received on Monday, 6 March 2006 15:58:47 UTC