Re: Interop meeting report from Glenn Marcy on 2007-10-31 (public-xml-core-wg@w3.org from October 2007)

From: Glenn Marcy <gmarcy@us.ibm.com>
Date: Wed, 31 Oct 2007 10:50:49 -0400
To: public-xmlsec-maintwg@w3.org
Cc: public-xml-core-wg@w3.org
Message-ID: <OF7A0E12C0.DADF6934-ON85257385.004D571F-85257385.00518E1C@us.ibm.com>

I have been considering how to integrate the feedback from Issue 2
identified in the note below into a new revision of the C14N 1.1
specification.  Since the main motivation is that the current
language is unclear to xmlsec implementers, I am sensitive to the
possibility that I might simply replace existing unclear language
with new prose that is not sufficiently clear.

I would like to solicit feedback on how to best improve the wording
to address the concerns of the participants in this working group.

Regards,
Glenn Marcy
C14N 1.1 Editor

[1] http://lists.w3.org/Archives/Public/public-xmlsec-maintwg/2007Oct/0000.html

----- Forwarded message from Thomas Roessler <tlr@w3.org> -----

From: Thomas Roessler <tlr@w3.org>
To: www-xml-canonicalization-comments@w3.org
Date: Thu, 4 Oct 2007 05:50:24 +0200
Subject: Interop meeting report

The XML Security Specifications Maintenance Working Group held an
interoperability testing meeting for the XML Digital Signatures and
Canonical XML 1.1 specifications in Mountain View, California, on 27
September 2007. The meeting was hosted by VeriSign.

The participating implementers were IBM, Oracle, UPC, Sun, and IAIK.

A full interoperability report is not available at this time.


The following three issues with the Canonical XML 1.1 specification
were identified.



1. The change back to language from C14N 1.0 that is suggested in
[1] should be applied, as it matches implementation behavior.



2. The fix-up for the xml:base attribute that is specified in
section 2.4 [2] was not implemented interoperably.
 
A single implementation was found to have implemented the
specification's normative text correctly.  Four implementations were
found to be consistent with the example in section 3.8 [3]. The
example in section 3.8 was found to be inconsistent with the
normative text.

After discussion, there was consensus that the normative text is
correct (but in need of clarification), and that the example
provided in the specification is indeed incorrect. 

The issue at hand can best be seen by considering a slight variant
of the example in section 3.8.  Instead of using the following input
document:

| <!DOCTYPE doc [
| <!ATTLIST e2 xml:space (default|preserve) 'preserve'>
| <!ATTLIST e3 id ID #IMPLIED>
| ]>
| <doc xmlns="http://www.ietf.org" xmlns:w3c="http://www.w3.org"
| xml:base="http://www.example.com/something/else">
|    <e1>
|       <e2 xmlns="" xml:id="abc" xml:base="../bar/">
|          <e3 id="E3" xml:base="foo"/>
|       </e2>
|    </e1>
| </doc>

... consider this:

| <!DOCTYPE doc [
| <!ATTLIST e2 xml:space (default|preserve) 'preserve'>
| <!ATTLIST e3 id ID #IMPLIED>
| ]>
| <doc xmlns="http://www.ietf.org" xmlns:w3c="http://www.w3.org"
| xml:base="something/else">
|    <e1>
|       <e2 xmlns="" xml:id="abc" xml:base="bar/">
|          <e3 id="E3" xml:base="foo"/>
|       </e2>
|    </e1>
| </doc>

It is the participants' reading of the normative language that,
since e1 is preserved in the document subset, the fix-up for e3 will
only take e2 into account, but not e1 or doc. Canonicalization
consistent with this reading of the specification text will lead to
the following output (line breaks for convenience):

| <e1 xmlns="http://www.ietf.org" xmlns:w3c="http://www.w3.org"
| xml:base="something/else"><e3 xmlns=""
| id="E3" xml:base="bar/foo"
| xml:space="preserve"></e3></e1>

Canonicalization consistent with the current material in example
3.8, however, will lead to this output:

| <e1 xmlns="http://www.ietf.org" xmlns:w3c="http://www.w3.org"
| xml:base="something/else"><e3 xmlns=""
| id="E3" xml:base="something/bar/foo"
| xml:space="preserve"></e3></e1>

When base URI resolution is performed on this output, the string
"something" would be duplicated in e3's base URI.  That is not
consistent with e3's base URI in the input document.
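
The duplication can be checked with a short Python sketch. It is only
an illustration: the absolute document URI below is a hypothetical
value that does not appear in the specification or in this report,
and urljoin from the standard library is used as a stand-in for
ordinary base URI resolution.

```python
from functools import reduce
from urllib.parse import urljoin

# Hypothetical absolute URI of the document; an assumption for illustration.
DOCUMENT_URI = "http://example.org/dir/doc.xml"


def base_uri(xml_base_chain):
    """Resolve xml:base values, outermost element first, against DOCUMENT_URI."""
    return reduce(urljoin, xml_base_chain, DOCUMENT_URI)


# e3 in the variant input document: doc, e2 and e3 each carry xml:base.
input_e3 = base_uri(["something/else", "bar/", "foo"])

# e3 in the output that follows the normative text (e1 carries "something/else").
normative_e3 = base_uri(["something/else", "bar/foo"])

# e3 in the output that follows the current example 3.8.
example_e3 = base_uri(["something/else", "something/bar/foo"])

print(input_e3)
print(normative_e3)
print(example_e3)
```

Under that assumed document URI, the first two values are the same
URI, while the third contains "something/something/".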

In the normative specification text, the key phrase is this one:

                 Let E be an element in the node set whose ancestor axis
                 contains successive elements E_n ... E_1 (in reverse
                 document order) that are omitted and E=E_n+1 is included.
                 ...

The crucial word for a correct reading of this language is
"successive"; in the example given, it causes the sequence E_n ...
E_1 of omitted elements for E = e3 to consist of the single element e2.

The experience gathered suggests that this aspect needs to be called
out much more prominently and clearly.
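
The reading described above can be sketched in Python as follows.
This is a minimal sketch, not the specification's algorithm: the
names Ancestor, join_uri_references and fixed_up_xml_base are
hypothetical, and the join helper is a deliberately simplified
stand-in for the "join URI" function (it handles neither dot
segments nor the other cases covered by Appendix A).

```python
from typing import List, NamedTuple, Optional


class Ancestor(NamedTuple):
    name: str
    included: bool           # is the element node in the document subset?
    xml_base: Optional[str]  # string value of its xml:base attribute, if any


def join_uri_references(base: str, ref: str) -> str:
    """Simplified stand-in for "join URI"; no dot-segment handling."""
    if "://" in ref or ref.startswith("/"):
        return ref
    # Drop the last path segment of base, then append ref.
    return base[: base.rfind("/") + 1] + ref


def fixed_up_xml_base(own: Optional[str],
                      ancestors: List[Ancestor]) -> Optional[str]:
    """ancestors are listed nearest first (reverse document order)."""
    run = []
    for ancestor in ancestors:
        if ancestor.included:
            break  # the first included ancestor ends the run E_n ... E_1
        run.append(ancestor)
    # Join the omitted ancestors' values, outermost first, then E's own value.
    values = [a.xml_base for a in reversed(run) if a.xml_base is not None]
    if own is not None:
        values.append(own)
    if not values:
        return None
    result = values[0]
    for value in values[1:]:
        result = join_uri_references(result, value)
    return result


# Ancestors of e3 in the variant document, nearest first.
e3_ancestors = [
    Ancestor("e2", False, "bar/"),
    Ancestor("e1", True, None),
    Ancestor("doc", False, "something/else"),
]
print(fixed_up_xml_base("foo", e3_ancestors))
print(fixed_up_xml_base(None, [Ancestor("doc", False, "something/else")]))
```

Because the loop stops at e1, only e2 contributes to e3's value;
doc contributes to e1's value instead. An ancestor that stays in
the subset ends the run whether or not its own xml:base attribute
is kept.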


Additionally, the introductory paragraph ("The xml:base
attribute...") was found to be confusing, since it can be misread as
a (redundant) description of where the "join URI" function is to be
applied. We recommend shortening this paragraph to a simple
statement to introduce the "join URI" function.  We also recommend
renaming the "join URI" function to a "join URI references"
function, as that is what it does.

Further, that paragraph, the bullet list, and the subsequent
paragraph cause confusion by talking about "base URIs": The objects
of the canonicalization process are the string values of the various
xml:base attributes (which might be relative URI references).  They
are *not* the base URI properties of the element nodes in question
(which are always absolute URIs, and can depend upon the document's
context). We recommend clarifying this point and the terminology
used.

In the paragraph that starts with the words "Given this 'join URI'
function...", the following phrase causes further confusion:

                 The element nodes along E's ancestor axis are now
                 examined for all occurences of xml:base, that have been
                 omitted.

This might be read to suggest that the fix-up might also be
applicable if the document subset includes an ancestor element F,
but lacks an xml:base attribute that was present on F's attribute
axis in the input document.  We recommend clarifying that this
phrase only deals with the removal of element nodes.

We further recommend including a general remark to note that the
various fix-up steps must be performed IF AND ONLY IF relevant
*element* nodes are removed, and that fix-up MUST NOT occur if an
element node is preserved in the document subset, but loses a
relevant attribute node.



3. Appendix A was found to be complex to the point of being
unimplementable.

While all participants were able to implement some algorithm with
the desired effect, that implementation was typically based on
analysis of test cases and reading of the overall specification, as
opposed to being a faithful implementation of the text in Appendix
A.  A characteristic remark by one implementer (which resonated with
the rest of the group) was that it was "easier to produce the
desired code than to attempt understanding Appendix A."

We recommend rewriting Appendix A in a clear and simple fashion.
Where the (commendable!) aim of staying close to RFC 3986's language
gets in the way of clarity or simplicity, the latter should be
given priority.



1. http://lists.w3.org/Archives/Public/public-xml-core-wg/2007Aug/0018.html
2. http://www.w3.org/TR/xml-c14n11/#DocSubsets
3. http://www.w3.org/TR/xml-c14n11/#Example-DocSubsetsXMLAttrs
Received on Wednesday, 31 October 2007 14:51:04 UTC