RE: Updated c14n Spec from John Boyer on 2000-06-16 (w3c-ietf-xmldsig@w3.org from April to June 2000)

From: John Boyer <jboyer@PureEdge.com>
Date: Fri, 16 Jun 2000 14:13:03 -0700
To: "Petteri Stenius" <Petteri.Stenius@remtec.fi>, "'David Blondeau'" <blondeau@intalio.com>, <w3c-ietf-xmldsig@w3.org>
Cc: "XML DSig" <w3c-ietf-xmldsig@w3.org>
Message-ID: <BFEDKCINEPLBDLODCODKGEFECDAA.jboyer@PureEdge.com>
Hi Petteri and David,

To be honest, my current acceptance of the namespace propagation as done by
XPath was not based on simply accepting default behavior.  I gave it a very
great deal of thought.  Moreover, in accepting the current behavior, I was
mindful of the desire to perform whole-document serialization in a one-pass
fashion, so I would be quite interested to know why you (Petteri) think the
XPath version of namespace context identification cannot be serialized in a
single pass.  To me, this seems to be as easy as following the
namespace rules in the January c14n spec, which you say can be serialized in
a single pass.  But before I get into that any further...

I would point out that, in David's statement about the January spec being
easier because it didn't care about prefixes in attribute values, I believe
he means easier than Petteri's suggestion, not easier than the current c14n.
The January spec did not care about prefixes in attribute values and did
things that sometimes broke them.  The new spec does not break them, but it
also does not really care about them, i.e. attempt to detect their
existence, which would be quite impossible without an application context.
In both the January spec and the current specs, namespace prefix references
that appear in attribute values or element character content are simply more
character data to be written out.

Also, Petteri, in the example you gave below, yes it is true that the expr
value both is valid XML and does make reference to an undefined namespace
prefix.  However, an XPath with an undefined namespace prefix would generate
an error within the application attempting to use it.  More to the point,
though, the example does not seem to be a counterexample of David's point.
David's point is about namespaces that are in scope, yet there would be no
way to identify that its declaration is needed.  In other words, the point
is about attribute values that carry XPath expressions that used to work
before canonicalization and don't work after canonicalization.

It is a non-goal of the current c14n to say d1 and d2 are logically
equivalent if and only if c14n(d1)==c14n(d2).  However, the intent of
the current c14n is to say:  if c14n(d1)==c14n(d2), then d1 is logically
equivalent to d2.  Otherwise, there would be no point to c14n, especially
for dsig.  Let d1 be a document containing a working XPath that no longer
works in c14n(d1) because we omitted a namespace declaration that did not
appear to be used.  Let d2=c14n(d1).  Now, clearly c14n(d1) == c14n(d2), so
we expect that d1 and d2 are logically equivalent, but they aren't because
the XPath works in d1 but not d2. This is the same argument against
namespace prefix rewriting.  I will add a section to the appendix to explain
why this change was made, too.
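
The argument above can be sketched in a few lines of Python.  This is only
an illustration under stated assumptions: resolve_qname and drop_unused are
hypothetical helpers, a namespace context is modelled as a plain dict from
prefix to URI, and drop_unused stands in for a canonicalizer that keeps only
the prefixes it can see in element and attribute names.

```python
def resolve_qname(qname, in_scope):
    """Resolve a possibly prefixed name against a prefix -> URI context."""
    if ":" not in qname:
        return (None, qname)
    prefix, local = qname.split(":", 1)
    if prefix not in in_scope:
        # This is the error an application would hit when evaluating the XPath.
        raise LookupError("undeclared namespace prefix: " + prefix)
    return (in_scope[prefix], local)


def drop_unused(in_scope, names_in_markup):
    """Hypothetical step: keep only prefixes used in element/attribute names."""
    used = {name.split(":", 1)[0] for name in names_in_markup if ":" in name}
    return {p: u for p, u in in_scope.items() if p in used}


# d1: <reference expr="foo:bar"/> with xmlns:foo="uri" in scope.
d1_context = {"foo": "uri"}
# d2 = c14n(d1) under the hypothetical rule; "foo" appears only in a value.
d2_context = drop_unused(d1_context, ["reference", "expr"])

print(resolve_qname("foo:bar", d1_context))  # resolves in d1
try:
    resolve_qname("foo:bar", d2_context)
except LookupError as err:
    print("fails in d2:", err)
```

The prefix inside the expr value is invisible to drop_unused, so the
expression that resolved in d1 no longer resolves in d2 even though the two
canonical forms are identical.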

Despite this problem with your particular proposal, I am quite sympathetic
to your cause, Petteri, and thought about other alternatives that would
work.  Let's begin with just doing something that works for a whole document
(sans comments of course).  So, ignore the notion of document subsets for a
moment.  Also, ignore the default namespace declaration for the moment, and
let's focus on actual namespace declarations.  If a given element and its
parent both have the same namespace declared to be equal to the same URI,
then the namespace declaration could be omitted from the child.  Since we
perform a standard depth first descent of the parse tree, this means that we
could retain all relevant namespace declarations, but still only use local
operations-- we need only consider the namespace context of an element and
its parent.

The problem is not much harder when you add the complexity of the default
namespace declaration.  Whether it's empty or not, the namespace context
indicates its value in some way, so if the default namespace of an element
differs from its parent (whether empty or not), then render a default
namespace declaration for the element.
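
A minimal Python sketch of this whole-document rule, covering both the
prefixed declarations and the default namespace.  The function name
declarations_to_emit and the dict representation (prefix -> URI, with the
key "" for the default namespace and the value "" for an empty default) are
assumptions made for illustration, not anything defined by the spec.

```python
def declarations_to_emit(parent_ctx, ctx):
    """Return the (prefix, uri) pairs an element must declare itself.

    parent_ctx and ctx are the namespace contexts of the parent and of the
    element.  Only these two contexts are consulted, so the check is local.
    """
    emit = []
    # Default namespace: compare values, treating "absent" and "empty" alike.
    child_default = ctx.get("", "")
    if child_default != parent_ctx.get("", ""):
        emit.append(("", child_default))
    # Prefixed namespaces: emit when the parent lacks the prefix or maps it
    # to a different URI.
    for prefix in sorted(p for p in ctx if p != ""):
        if parent_ctx.get(prefix) != ctx[prefix]:
            emit.append((prefix, ctx[prefix]))
    return emit


parent = {"": "urn:default", "a": "urn:a"}
child = {"": "", "a": "urn:a", "b": "urn:b"}
print(declarations_to_emit(parent, child))
```

For the document element, an empty dict can be passed as parent_ctx, in
which case every declaration in scope is emitted except an empty default.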

The case for document subsets is not much harder, except we would replace
the notion of parent with the notion of ancestor *in the node set*.
Finally, when dealing with document subsets, one is certainly using XPath,
and the problem is not really too hard.  For example, given a namespace node
N in the resultant node-set, we could do the following:

1) Find the element E that owns N (even if it is not in the node-set,
however weird that might be).

2) Find the nearest ancestor A of E that is in the node-set.  If A doesn't
exist, then output N.  If A exists in the node-set and it has a namespace
node N(A) *that is in the node-set* which declares the same namespace AND
assigns it to the same URI, then omit N from the output.  Otherwise, output
N.
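
Steps 1 and 2 can be sketched in Python as follows.  The data model is an
assumption made for illustration: Element is a hypothetical class holding a
parent link and the full XPath-style namespace context, and the node-set is
a Python set containing Element objects plus (element, prefix) tuples that
stand for namespace nodes.

```python
class Element:
    """Hypothetical element: parent link plus its full namespace context."""

    def __init__(self, name, parent=None, namespaces=None):
        self.name = name
        self.parent = parent
        self.namespaces = dict(namespaces or {})  # prefix -> URI


def should_output(owner, prefix, node_set):
    """Decide whether namespace node (owner, prefix) is written out."""
    # Step 1: 'owner' is the element E that owns N, in the node-set or not.
    # Step 2: find the nearest ancestor A of E that is in the node-set.
    ancestor = owner.parent
    while ancestor is not None and ancestor not in node_set:
        ancestor = ancestor.parent
    if ancestor is None:
        return True  # no such A: output N
    # Omit N only if A has a namespace node in the node-set that declares
    # the same prefix with the same URI.
    same = ((ancestor, prefix) in node_set
            and ancestor.namespaces.get(prefix) == owner.namespaces[prefix])
    return not same


root = Element("doc", namespaces={"foo": "uri"})
child = Element("item", parent=root, namespaces={"foo": "uri"})
node_set = {root, child, (root, "foo"), (child, "foo")}
print(should_output(root, "foo", node_set))   # root has no ancestor
print(should_output(child, "foo", node_set))  # already declared on root
```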

So, as you can see, I've been trying to think about getting rid of
unnecessary namespace declarations.  However, there seems to be enough work
involved in trying to figure out whether to print a namespace node (esp. in
the document subset case) that it did not seem worthwhile to complicate
the spec.  In particular, one must still maintain the whole namespace
context for each element as one passes through a document.  I will reassert
that this can be done in a one-pass fashion given space linear in the size
of the namespace context, which should not be a problem even for the most
rudimentary of devices capable of processing XML.  This is why I've retained
the default XPath namespace propagation feature.
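
As a rough sketch of what maintaining the namespace context in one pass
might look like, the following Python uses the standard library's
xml.etree.ElementTree.iterparse with "start-ns" events.  The function name
namespace_contexts is hypothetical, and for simplicity the sketch copies the
context for each open element, so its space use grows with nesting depth
times context size; a more frugal variant would record only the changes.

```python
import io
import xml.etree.ElementTree as ET


def namespace_contexts(xml_text):
    """Yield (tag, context) for each element in document order, one pass."""
    stack = [{}]   # one prefix -> URI context per open element
    pending = {}   # declarations seen since the last start tag
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start", "end", "start-ns")
    for event, value in ET.iterparse(source, events=events):
        if event == "start-ns":
            prefix, uri = value      # reported before the element's start
            pending[prefix] = uri
        elif event == "start":
            context = dict(stack[-1])
            context.update(pending)  # local declarations override inherited
            pending = {}
            stack.append(context)
            yield value.tag, context
        else:                        # "end": leave the element's scope
            stack.pop()


doc = '<doc xmlns:a="u1"><x xmlns="d"><y/></x><z/></doc>'
for tag, context in namespace_contexts(doc):
    print(tag, context)
```

ElementTree reports the default namespace under the empty prefix and
expands element names to the {uri}local form.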

This is not actually a complete account of my thinking on this issue, but
I'll end this now unless there is an expressed need for me to continue.

John Boyer
Software Development Manager
PureEdge Solutions Inc. (formerly UWI.Com)
Creating Binding E-Commerce
jboyer@PureEdge.com


-----Original Message-----
From: w3c-ietf-xmldsig-request@w3.org
[mailto:w3c-ietf-xmldsig-request@w3.org]On Behalf Of Petteri Stenius
Sent: Thursday, June 15, 2000 11:19 PM
To: 'David Blondeau'; w3c-ietf-xmldsig@w3.org
Cc: XML DSig
Subject: RE: Updated c14n Spec



You assume that the XML document constructor has properly declared all
namespaces that appear in XML attribute values. Of course there is nothing
an XML processor can do to verify this. A short sample:

<doc>
	<reference expr="foo:bar"/>
	<foo:bar xmlns:foo="uri"/>
</doc>

This is completely valid XML and the namespace axis of the 'reference'
element is empty even if the attribute value refers to a namespace. One does
not have to declare the namespace at the 'doc' element level, and if this
document was constructed using DOM then the above XML representation would
most likely be the result.


Petteri


> -----Original Message-----
> From: David Blondeau [mailto:blondeau@intalio.com]
> Sent: Thursday, June 15, 2000 12:06 AM
> To: w3c-ietf-xmldsig@w3.org
> Cc: XML DSig
> Subject: Re: Updated c14n Spec
>
>
> > I need to read the attributes of an element anyway, and I
> also need to sort
> > them using the attribute name and the namespace uri as sort
> keys. This is
> > the minimum requirement in all cases.
> I just wanted to show that your suggestion was worse than the
> one in the
> draft because you have to be careful about namespace prefixes used in
> attributes values. Your suggestion was to put  namespaces
> only when they are
> used, my question is then: how do you know a prefix is used
> in an attribute
> value?
> For that, you need to know all the prefixes in scope so you
> need to walk on
> the tree to get the namespaces prefixes, and then do a really
> difficult
> parsing job...
> No matter how you are doing it, you need all the namespace
> declarations in
> scope for each element.
>
> The January draft of C14n was easier on this point since it
> didn't care
> about prefixes in attribute values.
>
> David
>
>
Received on Friday, 16 June 2000 17:13:26 UTC