AW: Call for Implementation: Canonical XML Becomes a W3C Candidate Recommendation from Gregor Karlinger on 2000-11-07 (w3c-ietf-xmldsig@w3.org from October to December 2000)

From: Gregor Karlinger <gregor.karlinger@iaik.at>
Date: Tue, 7 Nov 2000 09:13:33 +0100
To: "John Boyer" <jboyer@PureEdge.com>, "Gregor Karlinger" <gregor.karlinger@iaik.at>, "Joseph M. Reagle Jr." <reagle@w3.org>, "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
Message-ID: <NDBBIMACDKCOPBLEJCCDAEDECNAA.gregor.karlinger@iaik.at>

Hi John,

> You are asking about the difference between what an XML processor must do
> versus what information it must export to the application.
> 
> I don't recall anything that permits an XML processor to report less
> information than it derives from the input.  In particular, I don't recall
> anything that binds this concept to validating vs. non-validating
> processors.
> 
> The closest we come to it is in the Conformance section of [XML], where
> there is a reiteration of the idea that non-validating processors may vary
> in their information output "depending on whether the processor reads
> parameter and external entities".  But this is an instance of the processor
> reporting less information because it has derived less from the possible
> input (because it didn't read the input).

Getting back to the initial problem, I should write down some points:

  * Canonical XML does not require a validating parser to create the node set
    for the XPath processing.

    "The input octet stream MUST contain a well-formed XML document, but the 
     input need not be validated.[...]" [1]


  * The XPath id() function relies on the ID attribute mechanism of XML. But
    three validity constraints apply to such attributes, which can be verified
    only by a validating parser:

    "Values of type ID must match the Name production. A name must not appear
     more than once in an XML document as a value of this type; i.e., ID 
     values must uniquely identify the elements which bear them." [2]

    "No element type may have more than one ID attribute specified." [3]

    "An ID attribute must have a declared default of #IMPLIED or #REQUIRED." [4]


  * Therefore I think the id() function can only be used in an XPath expression
    selecting a document subset for Canonical XML if the document object model
    has been built with the help of a validating parser. This does not seem to
    contradict what is said in Canonical XML, since the precondition for
    processing a document subset is that a node set is available. Canonical XML
    does not say anything about how to build such a node set.

    "Implementations of XML canonicalization that are based on XPath can provide
     this functionality with little additional overhead by accepting a node-set   
     as input rather than an octet stream." [5]


  * The only problem is the example in section 3.7, because it lacks a
    complete DTD and therefore cannot be processed by a validating parser.
    So it would be best to add the remaining parts of the DTD, and additionally
    provide some text describing this problem with using the id() function.
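
To illustrate the second point above: a non-validating parser accepts
duplicate or malformed ID values without complaint, so an application that
wants to rely on id() would have to check them itself. The following is only
a rough sketch of such a check, not part of Canonical XML or XPath. The
function name check_ids, the attribute name "Id", and the ASCII-only
approximation of the Name production are all assumptions made for
illustration; a real validating parser learns which attributes are of type ID
from the DTD.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified, ASCII-only approximation of the XML Name production (assumption).
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*\Z")


def check_ids(xml_text, id_attr="Id"):
    """Return a list of problems found with the values of attribute id_attr.

    id_attr is a hypothetical attribute name that the caller asserts is of
    type ID; a non-validating parser gives us no way to learn this.
    """
    root = ET.fromstring(xml_text)  # well-formedness check only, no validation
    values = [el.get(id_attr) for el in root.iter()
              if el.get(id_attr) is not None]
    problems = []
    for value, count in Counter(values).items():
        if not NAME_RE.match(value):
            problems.append(f"not a Name: {value!r}")
        if count > 1:
            problems.append(f"duplicate ID: {value!r}")
    return problems


if __name__ == "__main__":
    doc = '<doc><a Id="x1"/><b Id="x1"/><c Id="9bad"/></doc>'
    for problem in check_ids(doc):
        print(problem)
```

The sample document is well-formed, so the parser accepts it, yet it violates
both the uniqueness and the Name constraints on ID values. The sketch does not
cover the other two constraints ("one ID per element type" and "ID attribute
default"), since those concern the DTD declarations rather than the instance.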

Regards, Gregor
---------------------------------------------------------------
Gregor Karlinger
mailto:gregor.karlinger@iaik.at
http://www.iaik.at

Phone +43 316 873 5541
Institute for Applied Information Processing and Communications
Austria
---------------------------------------------------------------
 

---
[1] http://www.w3.org/TR/2000/WD-xml-c14n-20001011#DataModel, 5th paragraph
[2] http://www.w3.org/TR/REC-xml#id
[3] http://www.w3.org/TR/REC-xml#one-id-per-el
[4] http://www.w3.org/TR/REC-xml#id-default
[5] http://www.w3.org/TR/2000/WD-xml-c14n-20001011#DocSubsets, 1st paragraph

Received on Tuesday, 7 November 2000 03:12:19 UTC