Re: serialization and xml wrapping from merlin on 2002-10-31 (xml-encryption@w3.org from October 2002)

From: merlin <merlin@baltimore.ie>
Date: Thu, 31 Oct 2002 17:27:45 +0000
To: reagle@w3.org, arik@phaos.com
Cc: xml-encryption@w3.org
Message-Id: <20021031172745.242F543C14@yog-sothoth.ie.baltimore.com>
Forgot to cc the list:

------- Forwarded Message

Date:    Thu, 31 Oct 2002 04:26:36 +0000
From:    merlin <merlin@baltimore.ie>
To:      reagle@w3.org
Subject: Re: Fwd: Re: Re: serialization and xml wrapping 


Hi Joseph,

Sorry about being slow; just buried with stuff..

2:

>> > >Further: [3] indicates, for Step 2 of  decryptXML(N, E), that: "A
>> > > namespace declaration xmlns="" MUST be emitted with every apex
>> > > element that has no namespace prefix and URI as described in
>> > > Serializing XML [XML-Encryption, section 4.3.3]". Firstly, we're
>> > > talking about the apex
>> > > elements in a node-set, which might include namespace nodes for the
>> > > default namespace inherited from the dummy element in prior
>> > > wrapping/parsing -- this means that an element without a namespace
>> > > prefix is not necessarily without a namespace, and emitting xmlns=""
>> > > would conflict with emission of the namespace node in the
>> > > node-set.

Perhaps change the language to something like this:

  A namespace declaration xmlns="" MUST be emitted with every apex
  element that has no namespace node declaring a value for the
  default namespace.

Consider:

  <Foo xmlns="bar"><ToBeReplaced /></Foo>
  {dummy}<Replacement />{/dummy}

If I used straight c14n for the replacement, we'd get:

  <Foo xmlns="bar"><Replacement /></Foo>

This is wrong. The Replacement element is an apex element of
a replacement node set with no default namespace URI, so it
should be augmented:

  <Foo xmlns="bar"><Replacement xmlns="" /></Foo>

The language in the spec currently is too weak; it will fail
the case of:

  {dummy}<foo:Replacement xmlns:foo="bar"><Bar /></foo:Replacement>{/dummy}

This isn't the case that Ari suggests; I think the case that
Ari suggests is handled.
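
A minimal sketch of the two cases above, assuming Python's
xml.etree.ElementTree as a stand-in for the wrap-and-parse step. The
helper name parse_in_context is hypothetical, and the dummy element
here simply carries the default namespace of the target document:

```python
import xml.etree.ElementTree as ET

def parse_in_context(fragment, default_ns="bar"):
    """Parse a serialized fragment inside a dummy element that declares
    default_ns, and return the first apex element."""
    wrapped = '<dummy xmlns="%s">%s</dummy>' % (default_ns, fragment)
    return ET.fromstring(wrapped)[0]

# Straight c14n-style output: the apex element picks up the context's
# default namespace.
captured = parse_in_context('<Replacement />')
print(captured.tag)      # {bar}Replacement

# With xmlns="" on the apex element it stays in no namespace.
augmented = parse_in_context('<Replacement xmlns="" />')
print(augmented.tag)     # Replacement

# Prefixed apex element: a rule keyed on "no namespace prefix and URI"
# emits nothing here, so the unprefixed child is captured instead.
prefixed = parse_in_context(
    '<foo:Replacement xmlns:foo="bar"><Bar /></foo:Replacement>')
print(prefixed[0].tag)   # {bar}Bar

# A rule keyed on "no default namespace node" puts xmlns="" on the apex
# element, which protects the child as well.
repaired = parse_in_context(
    '<foo:Replacement xmlns="" xmlns:foo="bar"><Bar /></foo:Replacement>')
print(repaired[0].tag)   # Bar
```

The third case is the one the current wording misses: the apex element
itself is fine, but its unprefixed descendant is not.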


3.3#6 needs to be changed similarly; ``xmlns="" is emitted on any apex
elements that do not declare a value for the default namespace'' or
somesuch.

1:

>> 1. We should make the text less confusing... 4.3.3 Starts off by saying,
>> "When serializing an XML fragment for subsequent wrapping and parsing,
>> serialization of the namespace axis should be changed as follows". This
>> makes it sound like it's part of the decryption process. If so, exactly
>> which step of 4.3.{1,2} is this relevant? (Text Wrapping is explicitly
>> called in 4.3.1 and 4.3.2)  There's probably no hook in 4.3.2 because
>> it's a nodeset sorta-based replacement mechanism. Whereas the one in the
>> Decryption transform is a much more specific octet based canonicalization
>> and replace based mechanism. Is that right? If so, how do we make that more
>> clear? Be more explicit that 4.3.2 is a node-set based mechanism, but 4.3.3
>> only applies to mechanisms like those in the Decryption Transform?

Actually, this is an *encryption* process. This is a non-normative
suggestion for how to serialize XML into an octet stream for
subsequent encryption.

The encryption step, 4.1#3.1 states 'obtain the octets'. 4.3.3
is a recommendation for how to do this by augmenting a c14n
algorithm. We leave the process wide open; however, if people
naïvely use c14n then they leave themselves open to the
problem described and solved by 4.3.3.

Perhaps replace 4.3.3 with this:

When serializing XML data during encryption (step 4.1#3.1),
special care SHOULD be taken: If the data will subsequently
be decrypted in the context of a parent XML document then
naïve serialization may produce unexpected decryption results:

Consider the following fragment of XML:

  <Document xmlns="http://example.org/">
    <ToBeEncrypted xmlns="" />
  </Document>

A simple serializer (e.g., c14n) would serialize the
ToBeEncrypted element in isolation as the octet stream
"<ToBeEncrypted></ToBeEncrypted>", resulting in the following
encrypted document:

  <Document xmlns="http://example.org/">
    <EncryptedData xmlns="...">
      <!-- <ToBeEncrypted></ToBeEncrypted> -->
    </EncryptedData>
  </Document>

In-place decryption of this document would produce
the following incorrect result:

  <Document xmlns="http://example.org/">
    <ToBeEncrypted />
  </Document>

This problem arises because normal XML serializations assume
that the serialized data will be parsed directly in a context
where there is no default namespace declaration. Consequently,
they do not redundantly declare the empty default namespace
with an xmlns="". If, however, the serialized data are parsed
in a context where a default namespace declaration is in scope
(e.g., the parsing context of 4.3.1), then it may affect the
interpretation of the serialized data.
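
The failure can be reproduced with a short sketch. This assumes
Python's xml.etree.ElementTree, whose tostring() is used here only as
a stand-in for a simple serializer (it is not c14n), and it models
in-place decryption as parsing the octets inside a dummy element that
carries the parent's default namespace:

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<Document xmlns="http://example.org/">'
    '<ToBeEncrypted xmlns="" />'
    '</Document>')

target = doc[0]
print(target.tag)        # ToBeEncrypted (no namespace)

# Serialize the element in isolation; the empty default namespace
# declaration is not written out.
octets = ET.tostring(target)
print(octets)            # b'<ToBeEncrypted />'

# Model in-place decryption: parse the octets in a context where the
# parent's default namespace is in scope.
wrapped = b'<dummy xmlns="http://example.org/">' + octets + b'</dummy>'
reparsed = ET.fromstring(wrapped)[0]
print(reparsed.tag)      # {http://example.org/}ToBeEncrypted
```

The element has silently moved into the http://example.org/ namespace,
which is the incorrect result shown above.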

To solve this problem, a standard canonicalization algorithm
MAY be augmented as follows for use as an XML encryption
serializer:

. A default namespace declaration with an empty value (i.e.,
  xmlns="") SHOULD be emitted where it would normally be
  suppressed by the canonicalization algorithm.

While the result may not be in proper canonical form,
this is harmless as the resulting octet stream will not
be used directly in a signature computation. Considering
the preceding example, the ToBeEncrypted element would be
serialized as follows:

  <ToBeEncrypted xmlns=""></ToBeEncrypted>

When processed in the context of the parent document, this
serialized fragment will parse correctly.

This augmentation can be retroactively applied to an existing
canonicalization implementation by canonicalizing each apex
node and its descendants from the node set, inserting xmlns=""
at the appropriate points, and concatenating the resulting
octet streams.
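
A rough sketch of that retrofit, assuming Python's
xml.etree.ElementTree in place of a real c14n implementation. The
function names serialize_apex and serialize_node_set are hypothetical.
ElementTree's serializer writes namespaced names with prefixes and
does not emit a default namespace declaration unless asked to, so in
this sketch every apex element counts as having no default namespace
and always receives xmlns="":

```python
import re
import xml.etree.ElementTree as ET

def serialize_apex(elem):
    """Serialize one apex element and its descendants, inserting
    xmlns="" into the start tag of the apex element."""
    # tostring() also writes the text that follows the element, so
    # hide the tail temporarily.
    tail, elem.tail = elem.tail, None
    try:
        text = ET.tostring(elem, encoding="unicode")
    finally:
        elem.tail = tail
    # Insert the declaration immediately after the element name.
    name = re.match(r"<[^\s/>]+", text)
    return text[:name.end()] + ' xmlns=""' + text[name.end():]

def serialize_node_set(apex_elements):
    """Concatenate the augmented serializations of all apex elements."""
    return "".join(serialize_apex(e) for e in apex_elements)

doc = ET.fromstring(
    '<Document xmlns="http://example.org/">'
    '<ToBeEncrypted xmlns="" />'
    '</Document>')

octets = serialize_node_set([doc[0]])
print(octets)            # <ToBeEncrypted xmlns="" />

# Parsing in the parent's context now preserves the empty namespace.
wrapped = '<dummy xmlns="http://example.org/">' + octets + '</dummy>'
print(ET.fromstring(wrapped)[0].tag)   # ToBeEncrypted
```

A production serializer would insert the declaration only where the
canonicalizer suppressed it; the unconditional insert is safe here
only because of the ElementTree behaviour noted above.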

Similar attention between......non-empty values.



Merlin

r/reagle@w3.org/2002.10.29/17:13:57
>
>The XENC PR's end this week, so getting clarity on this bit so we can stage 
>the REC would be a "good thing" (tm) <smile/>.
>
>----------  Forwarded Message  ----------
>
>Subject: Re: Re: serialization and xml wrapping
>Date: Fri, 18 Oct 2002 14:29:04 -0400
>From: "Ari Kermaier" <arik@phaos.com>
>To: <reagle@w3.org>
>
>The first issue below is the part of the spec that really needs
>clarification, but I was able to figure out what to do in my implementation to
>achieve the desired result. Your statement of the confusion is exactly what
>was bothering me -- I don't have a proposed change to address it just yet
>though. :-)
>
>Regarding the second issue, emitting xmlns="" where there's no prefix
>namespace URI: I went ahead with the assumption that this is just plain
>incorrect, and I should really emit xmlns="" where the element has no
>namespace prefix and has no default namespace node in the node-set.
>
>----- Original Message -----
>From: "Joseph Reagle" <reagle@w3.org>
>To: "Ari Kermaier" <arik@phaos.com>
>Sent: Friday, October 18, 2002 2:05 PM
>Subject: Fwd: Re: serialization and xml wrapping
>
>> Ooops, sorry. In the previous issue, I was speaking about *this* issue.
>>
>> ----------  Forwarded Message  ----------
>>
>> Subject: Re: serialization and xml wrapping
>> Date: Thu, 26 Sep 2002 17:55:42 -0400
>> From: Joseph Reagle <reagle@w3.org>
>> To: merlin <merlin@baltimore.ie>, "Ari Kermaier" <arik@phaos.com>
>> Cc: "W3C XML-ENC WG List" <xml-encryption@w3.org>
>>
>> On Thursday 26 September 2002 03:18 pm, merlin wrote:
>> > So, the example doesn't follow our implicit recommendation
>> > of using c14n, but instead relies on inheriting namespace
>> > information from the surrounding document.
>> >
>> > Does this make sense?
>>
>> It makes some sense, but I'm still not sure about the interplay between
>> these steps:
>> 1. We should make the text less confusing... 4.3.3 Starts off by saying,
>> "When serializing an XML fragment for subsequent wrapping and parsing,
>> serialization of the namespace axis should be changed as follows". This
>> makes it sound like it's part of the decryption process. If so, exactly
>> which step of 4.3.{1,2} is this relevant? (Text Wrapping is explicitly
>> called in 4.3.1 and 4.3.2)  There's probably no hook in 4.3.2 because
>> it's a nodeset sorta-based replacement mechanism. Whereas the one in the
>> Decryption transform is a much more specific octet based canonicalization
>> and replace based mechanism. Is that right? If so, how do we make that more
>> clear? Be more explicit that 4.3.2 is a node-set based mechanism, but 4.3.3
>> only applies to mechanisms like those in the Decryption Transform?
>> 2. What about Ari's comment that in the Decryption Transform, decryptXML(N,
>> E) only emits xmlns="" when there's no prefix *and* URI. (Or does this go
>> away with the proper understanding of the above? <smile/>)
>>
>> > >Further: [3] indicates, for Step 2 of  decryptXML(N, E), that: "A
>> > > namespace declaration xmlns="" MUST be emitted with every apex
>> > > element that has no namespace prefix and URI as described in
>> > > Serializing XML [XML-Encryption, section 4.3.3]". Firstly, we're
>> > > talking about the apex
>> > > elements in a node-set, which might include namespace nodes for the
>> > > default namespace inherited from the dummy element in prior
>> > > wrapping/parsing -- this means that an element without a namespace
>> > > prefix is not necessarily without a namespace, and emitting xmlns=""
>> > > would conflict with emission of the namespace node in the node-set.
>>
>> -------------------------------------------------------
>>
>> --
>> Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
>> W3C Policy Analyst                mailto:reagle@w3.org
>> IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature/
>> W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/
>
>-------------------------------------------------------
>
>-- 
>Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
>W3C Policy Analyst                mailto:reagle@w3.org
>IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature/
>W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/
>

------- End of Forwarded Message
Received on Thursday, 31 October 2002 12:30:08 UTC