- From: Dominique Hazaël-Massieux <dom@w3.org>
- Date: Mon, 16 Aug 2004 16:12:16 +0200
- To: david_marston@us.ibm.com, Lofton Henderson <lofton@rockynet.com>
- Cc: www-qa@w3.org
- Message-Id: <1092665536.1400.94.camel@stratustier>
Hi Dave,

On Fri 13/08/2004 at 04:52, david_marston@us.ibm.com wrote:
> The basic XML comparitor confirms that two XML documents are equal at
> the InfoSet level. Thus, it has to neutralize the order of attributes
> and namespace nodes.

Would XML Canonicalization fit this requirement?
http://www.w3.org/TR/2001/REC-xml-c14n-20010315
"""
Any XML document is part of a set of XML documents that are logically
equivalent within an application context, but which vary in physical
representation based on syntactic changes permitted by XML 1.0 [XML] and
Namespaces in XML [Names]. This specification describes a method for
generating a physical representation, the canonical form, of an XML
document that accounts for the permissible changes. Except for
limitations regarding a few unusual cases, if two documents have the
same canonical form, then the two documents are logically equivalent
within the given application context.
"""

Examples of implementations of XML Canonicalization are available at:
http://www.w3.org/Signature/2000/10/10-c14n-interop.html

> In some situations, it would help if it could overlook text nodes
> that are all white space.

XML Canonicalization doesn't do that, FWIW:
"""
Retain all whitespace between consecutive start tags, clean or dirty
Retain all whitespace between consecutive end tags, clean or dirty
Retain all whitespace between end tag/start tag pair, clean or dirty
Retain all whitespace in character content, clean or dirty
"""
http://www.w3.org/TR/2001/REC-xml-c14n-20010315#Example-WhitespaceInContent

> A second comparitor is needed to check the output of a product that
> implements the Serialization spec [1] because there are requirements
> to produce CDATA sections and other details below the InfoSet level.

Could you give a few examples of such requirements?

> XSLT also produces HTML, and a definitive HTML comparitor would be
> welcome. The two inputs would be considered equal if a browser is
> required to render them the same way.

Hmm... This looks like a dangerous criterion for comparison, since
rendering is only one of the ways HTML is used; I'm not sure what
definitive criterion should be used to compare two HTML documents,
although it may be worth looking at the SGML level, since HTML is an
SGML language.

> I hope there will be some way for the QA Activity to make this happen.
> It shouldn't be part of the workload of an individual "substantive"
> Working Group.

I don't think the current level of resources in the QA Activity would
make this possible as of today; but I'm fairly sure other working groups
have worked on similar tools that may be reworked or re-used; Lofton, I
kind of remember you or Kirill speaking about a test suite doing such a
comparison of output during a QA WG face-to-face meeting; does that ring
a bell? I looked at the minutes of the Tokyo F2F of the QA WG, but
didn't find any relevant detail.

Dom

> [1] http://www.w3.org/TR/xslt-xquery-serialization/

-- 
Dominique Hazaël-Massieux - http://www.w3.org/People/Dom/
W3C/ERCIM
mailto:dom@w3.org
Received on Monday, 16 August 2004 14:12:18 UTC
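
The canonicalization-based comparison suggested in the message could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tool from the thread: it uses Python's standard-library `xml.etree.ElementTree.canonicalize` (Python 3.8+), which implements Canonical XML 2.0 rather than the 1.0 Recommendation cited above, and the helper name `xml_equivalent` and its `ignore_whitespace` parameter are hypothetical. The `strip_text` option is not part of the canonical form described in the message; it is shown only as one possible way to overlook whitespace-only text nodes, and it also trims leading and trailing whitespace inside other text content.

```python
from xml.etree.ElementTree import canonicalize  # available since Python 3.8


def xml_equivalent(a: str, b: str, ignore_whitespace: bool = False) -> bool:
    """Hypothetical helper: compare two XML documents by canonical form.

    Canonicalization sorts attributes and normalizes namespace
    declarations, so differences in their order do not matter.
    With ignore_whitespace=True, whitespace-only text is dropped and
    other text is trimmed before comparison (an extension beyond the
    plain canonical form).
    """
    return (canonicalize(a, strip_text=ignore_whitespace)
            == canonicalize(b, strip_text=ignore_whitespace))


# Attribute order and empty-element syntax are neutralized.
same_attrs = xml_equivalent('<a y="2" x="1"/>', '<a x="1" y="2"></a>')

# Whitespace between tags is retained by default, so these differ...
indented = "<a>\n  <b/>\n</a>"
strict = xml_equivalent("<a><b/></a>", indented)

# ...unless whitespace-only text nodes are explicitly stripped.
relaxed = xml_equivalent("<a><b/></a>", indented, ignore_whitespace=True)

# CDATA sections are replaced by their character content.
cdata = xml_equivalent("<a><![CDATA[x]]></a>", "<a>x</a>")
```

Because CDATA sections disappear in the canonical form, a comparison like this cannot check serialization requirements below the InfoSet level, which is consistent with the point in the quoted message that a second comparator would be needed for that purpose.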