- From: Aleksey Sanin <aleksey@aleksey.com>
- Date: Wed, 05 Jun 2002 01:42:05 -0700
- To: Christian Geuer-Pollmann <geuer-pollmann@nue.et-inf.uni-siegen.de>
- Cc: Joseph Reagle <reagle@w3.org>, John Boyer <jboyer@PureEdge.com>, "Donald E. Eastlake 3rd" <Donald.Eastlake@Motorola.com>, w3c-ietf-xmldsig@w3.org
Yes, I agree with this. But I do not like the case
when we have following document
<foo:Something xmlns:foo="http://example.org/foo">
<bar:Something xmlns:bar="http://example.org/bar" />
</foo:Something>
the XPath expression looks like
self::bar:Something
and the canonicalized output looks like
xmlns:foo="http://example.org/foo"
<bar:Something xmlns:bar="http://example.org/bar"
xmlns:foo="http://example.org/foo" />
From my point of view it should look have *only* <bar:Somehting />
<bar:Something xmlns:bar="http://example.org/bar" />
So the suggestion is to include all visibly utilized namespace nodes and
simply
do not check namespace nodes for presence in the input nodes set.
Aleksey.
Christian Geuer-Pollmann wrote:
>
>
> Right. What I suggest is to include ALL namespace nodes (not the ones
> from the DOM model but the ones from the XPath model). Every namespace
> node in an ancestor should effect the current node, not only the
> namespaces nodes whose owner element is in the document subset.
>
> Thanks,
> Christian
>
> --On Mittwoch, 5. Juni 2002 01:25 -0700 Aleksey Sanin
> <aleksey@aleksey.com> wrote:
>
>> I like this proposal for its simplicity but I have one small correction:
>>
>> "If a document subset is to be canonicalized using 'Exclusive C14n',
>> all namespace nodes for all element nodes in the document subset
>> are included in the document subset prior the serialization process."
>>
>> (the difference is that we include only namespace nodes for the
>> nodes from the input nodeset, not all nodes from the document).
>> This is similar to the current XPath filter 2 subtrees proposal and
>> it should simplify implementations and potentially improve performance.
>>
>>
>> Aleksey.
>>
>>
>> Christian Geuer-Pollmann wrote:
>>
>>>
>>> Hi all,
>>>
>>> first a big thank you to Merlin who made the very cool edge-cases for
>>> c14n and exclC14n to understand how these standards handle the
>>> namespace stuff. Till a few weeks ago, I did not understood that a
>>> properly choosen document subset (in c14n) can exclude namespaces from
>>> the documents subset. For me, namespaces were not 'regular' nodes but
>>> they were inseparable twisted with the document.
>>>
>>> For "Canonical XML", I see that the possibility to include only
>>> particular namespaces to a document subset is really cool if a
>>> transfroms author wants to create context-independent document subsets.
>>>
>>> For "Exclusive Canonical XML", I don't see why we have to inherit the
>>> (complicated) namespace handling from "Canonical XML".
>>>
>>> Provokant proposal: If the PR-Status of exclC14n allows this
>>> (substantial) change, I want to propagate to canonicalize document
>>> subsets as follows:
>>>
>>> "If a document subset is to be canonicalized using 'Exclusive C14n',
>>> all namespace nodes in the original document are included in the
>>> document subset prior the serialization process; this inclusion is
>>> done regardless whether a namespace node is already in the subset
>>> or if it's excluded from the subset."
>>>
>>> After that 'pre-processing', the exclusive c14n process is started
>>> with the following change: All passages in the text which refer to
>>> namespace nodes which are not in the document subset can be omitted.
>>>
>>> Why do I suggest that: For standard c14n, it was necessary to be able
>>> to omit namespace nodes from the document subset. For exclusive c14n,
>>> we have (1) the mechanism of the "InclusiveNamespaces PrefixList" and
>>> (2) the visibly-utilizes mechanism. I think that such a change will
>>> make exclusive c14n reliable and consistent (not consistent to the
>>> c14n REC but consistent to what c14n should really do).
>>>
>>> I think canonicalization should serve two purposes:
>>>
>>> (1) create a bit-accurate representation of a document
>>> or document subset for use in cryptographic algorithms
>>> like a message digest
>>>
>>> (2) allow the verifier of a signature to take these signed
>>> octets and re-parse the octets to get back a
>>> "trusted" XML structure which can be reliably used in
>>> the application. This goes to "process-what-is-signed".
>>> But with the current processing model where namespaces
>>> can be excluded from the document subset, it's possible
>>> that a "reparse signed contents" step does encounter
>>> 'illegal' XML.
>>>
>>> I had no better word as 'illegal'. I know that it's possible that the
>>> signed contents are not well-formed, e.g. like this:
>>>
>>> <A /><B />
>>>
>>> or like this
>>>
>>> foo text <A />
>>>
>>> but these are problems which can be handled easily by "wrapping" the
>>> octets into a dummy root element. But if a namespace is used e.g. by
>>> an element but the namespace decl does not appear, this can't be
>>> handled in any way, and from the semantics point, it's even completely
>>> meaningless:
>>>
>>> <foo:A>
>>> <foo:B xmlns:foo="http://foo" />
>>> </foo:A>
>>>
>>> In this case, the namespace is (maybe accidently?) omitted from the
>>> foo:A element, but what happens if we have such an input document:
>>>
>>> <foo:Contract xmlns:foo="http://companyA.com">
>>> <foo:Detail xmlns:foo="http://companyB.com" />
>>> </foo:Contract>
>>>
>>> and I choose a rogue document subset which results in
>>>
>>> <foo:Contract xmlns:foo="http://companyA.com">
>>> <foo:Detail />
>>> </foo:Contract>
>>>
>>> That's so bad; I think that the above proposal will stop that kind of
>>> cheating: foo:Detail visibly utilizes foo and so
>>> xmlns:foo="http://companyB.com" is output in the exclusive canonical
>>> form, regardless whether the XPath transform author did include it or
>>> not.
>>>
>>>
>>>
>>> Kind regards,
>>> hope that you all don't eat me alive for this ;-)
>>>
>>> Christian
>>
Received on Wednesday, 5 June 2002 04:40:16 UTC