Re: Trying to assess the depth of xml:id and c14n incompatibilities from Tim Berners-Lee on 2005-02-12 (www-tag@w3.org from February 2005)

From: Tim Berners-Lee <timbl@w3.org>
Date: Sat, 12 Feb 2005 13:19:43 -0500
To: veillard@redhat.com
Cc: www-tag@w3.org, public-xml-id@w3.org, public-xml-core-wg@w3.org
Message-Id: <AB00C2C0-7D22-11D9-9414-000A9580D8C0@w3.org>
On Feb 12, 2005, at 11:02, Daniel Veillard wrote:
[...]
>   So to check I understand correctly, the "Canonical XML" spec requires
> that if one wants to canonicalize a set of elements (which are siblings),
> and if those elements have a common ancestor carrying an xml:id and
> no closer ancestor (or self) carries an xml:id, then the xml:id attribute
> and its value must be copied onto those elements. For example:
>
>    <root xml:id="root">
>      <data>
>        <child1/>
>        <child2>
>          <sub/>
>        </child2>
>        <child3/>
>      </data>
>    </root>
>
>  then to make a canonical serialization of the set of children of
> data, they need to be serialized as
>
>    <child1 xml:id="root"/>
>    <child2 xml:id="root">
>      <sub/>
>    </child2>
>    <child3 xml:id="root"/>
>
> (ignoring the text nodes before child1 and after child3 for this example).
>
>   Now where is the problem exactly?
>   From an XML-1.0 + Namespace point of view the serialized fragment obtained
> that way is still perfectly okay (i.e. a well balanced chunk); the only
> problems which may arise are:
>    1/ layers implementing xml:id will raise an error, however this is
>       not a fatal error (see http://www.w3.org/TR/xml-id/#errors)
>       xml:id processors are just instructed to report the duplicate ID
>       error to the application using it
>    2/ XPath pointers to that fragment can be disrupted
>
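The inheritance step described in the quoted example can be sketched roughly as follows. This is only an illustration of the attribute copying, not a Canonical XML implementation: the helper names (`inherited_xml_attrs`, `children_with_inheritance`, `duplicate_ids`) are hypothetical, the input is the example document with whitespace stripped, and the node-set is assumed to be "all children of data".

```python
import copy
from collections import Counter
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_ID = "{%s}id" % XML_NS

# The example document from the message, without the whitespace text nodes.
DOC = ('<root xml:id="root"><data><child1/><child2><sub/></child2>'
       '<child3/></data></root>')


def inherited_xml_attrs(ancestors):
    """Collect xml:* attributes from ancestors (outermost first); nearer ones win."""
    found = {}
    for ancestor in ancestors:
        for name, value in ancestor.attrib.items():
            if name.startswith("{%s}" % XML_NS):
                found[name] = value
    return found


def children_with_inheritance(root, path):
    """Copy the children of the element reached via `path`, adding inherited xml:* attributes."""
    ancestors = [root]
    node = root
    for tag in path:
        node = node.find(tag)
        ancestors.append(node)
    inherited = inherited_xml_attrs(ancestors)
    result = []
    for child in node:
        clone = copy.deepcopy(child)
        for name, value in inherited.items():
            # An xml:* attribute already on the element itself is kept as is.
            clone.attrib.setdefault(name, value)
        result.append(clone)
    return result


def duplicate_ids(elements):
    """Return the xml:id values that occur more than once in the given subtrees."""
    counts = Counter(
        el.get(XML_ID)
        for top in elements
        for el in top.iter()
        if el.get(XML_ID) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


fragment = children_with_inheritance(ET.fromstring(DOC), ["data"])
```

Under these assumptions each of child1, child2 and child3 ends up carrying xml:id="root" while sub carries none, and `duplicate_ids(fragment)` reports the value "root", which is the duplicate ID condition an xml:id layer would be expected to report.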

If you go down this path, then the xml:id spec could, I suppose,
say that if there is more than one ID with the same name then the
outermost one is the relevant one, and that would even ensure that the
fragid reference still works.
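
A minimal sketch of what such an "outermost wins" rule might look like, assuming "outermost" means smallest depth in the tree. The function name `resolve_outermost` is hypothetical, and the tie-break between equally deep elements (first in document order) is an assumption of this sketch, not something stated above.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def resolve_outermost(root, id_value):
    """Return the shallowest element whose xml:id equals id_value, or None."""
    best = None
    best_depth = None
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        # Strict "<" keeps the first match in document order at a given depth
        # (assumed tie-break).
        if node.get(XML_ID) == id_value and (best_depth is None or depth < best_depth):
            best, best_depth = node, depth
        # Push children in reverse so they are popped in document order.
        for child in reversed(list(node)):
            stack.append((child, depth + 1))
    return best


# A document where a descendant repeats the ancestor's ID.
doc = ET.fromstring(
    '<root xml:id="root"><data><child2 xml:id="root"><sub/></child2></data></root>'
)
target = resolve_outermost(doc, "root")
```

Here `target` is the root element itself, so a reference to the ID "root" would resolve to the same element as before the extra attribute was copied onto child2.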


>   I think 1/ proves that the current option of making xml:id errors
> non fatal is the correct handling. For example if an application with
> xml:id support uses an existing digital signatures library, upon checking
> the output it is possible to detect the error and possibly to correct it
> as soon as possible.
>   With respect to 2/ there are multiple cases:
>     - I think the worst case is when the extra xml:id generated would
>       override an existing ID, for example if child2 already holds, before
>       the transformation, an ID attribute of value "root". It is however
>       clear that in that case the source document was in error w.r.t.
>       xml:id, i.e. a false positive when looking for an ID in the fragment
>       output can only be the result of an IDness problem in the initial
>       document.
>     - another case is when the output of the canonicalization process is
>       included as a fragment in another document (which I expect, since
>       well balanced chunks like the one generated above are not well formed
>       as is, due to the presence of multiple roots). Then at inclusion time
>       IDness consistency should be checked; xml:id errors like that are just
>       a special case of the ID errors which may result from such an
>       inclusion, and duplicate ID detection in the fragment is just a
>       special case of the needed detection for the full document.
>     - the last problem I see is that the inherited xml:id on the serialized
>       fragment simply generates extra IDs; note that this does not break
>       existing pointers inside the fragment or inside a regenerated
>       document built around the fragment. By the definition of XPointer
>       for bare names, this is a fallback to the XPath id() function which,
>       as explained in the XPath spec, will point to the first element in
>       document order holding an attribute of type ID with that value. The
>       fact that there are possibly multiple xml:id generated is no more
>       of a problem than having a single one; what xml:id actually provides
>       is that the application will be alerted to the mismatch.
>
>   So while I think the incompatibility between Canonical XML and xml:id
> needs to be investigated, in practice the effect of this incompatibility
> doesn't look to be worth blocking xml:id from going forward or
> mandating an update to the Canonicalization spec (though people should
> clearly prefer Exclusive XML Canonicalization, which doesn't exhibit the
> problem). It seems to me the drawback should be clearly documented in
> xml:id, and xml:id implementors should make sure that they report
> duplicate ID errors, especially in a context where canonicalization or
> digital signatures might be used.
>
>   Of course I may have missed more critical side-effects that those
> extra xml:id in canonicalized output may generate, and if this is the
> case reporting them will be a good idea :-)
>
>    yours,
>
> Daniel
> -- 
> Daniel Veillard      | Red Hat Desktop team http://redhat.com/
> veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
> http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
Received on Saturday, 12 February 2005 18:19:51 UTC