Re: XPath transform

From:  TAMURA Kent <kent@trl.ibm.co.jp>
Resent-Date:  Mon, 31 Jan 2000 01:42:19 -0500 (EST)
Resent-Message-Id:  <200001310642.BAA23069@www19.w3.org>
Date:  Mon, 31 Jan 2000 15:41:34 +0900
Message-Id:  <200001310641.PAA30482@ns.trl.ibm.com>
References:  <NDBBLAOMJKOFPMBCHJOIOECNCDAA.jboyer@uwi.com>
To:  <w3c-ietf-xmldsig@w3.org>
In-reply-to:  "John Boyer"'s message of "Wed, 26 Jan 2000 10:02:59 -0800" <NDBBLAOMJKOFPMBCHJOIOECNCDAA.jboyer@uwi.com>
User-Agent:  SEMI/1.13.5 (=?ISO-8859-4?Q?Meih=F2?=) FLIM/1.13.2 (Kasanui)
	     Emacs/20.4 (i386-*-nt4.0.1381) MULE/4.1 (AOI) Meadow/1.10 (TSUYU)

>> <John>
>> Ok, thanks.  Now, I may be reading this wrong, but you're saying it doesn't
>> change the input order.  Do you mean that the attributes are retained in
>> the order they originally appeared in the document?
>
>No.  The order by element.getAttributes().item(i) depends on
>
>o the order in which a parser sets attributes to a DOM node, and
>o whether a DOM implementation keeps attribute order.
>
>For example, the com.ibm.xml.parser.Parser/TXDocument of XML4J
>keeps the order in the original document.  But the
>com.ibm.xml.dom.DocumentImpl and the com.ibm.xml.parsers package
>always sort attributes which are gotten with
>element.getAttributes().item(i).
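
To illustrate the point about item(i): a minimal Java sketch, assuming
a JAXP/DOM parser is available, of code that does not rely on whatever
order the DOM implementation happens to return.  The class and method
names (AttrOrder, sortedAttrNames, parseRoot) are hypothetical, and
sorting by qualified name is a simplification chosen for the example;
it is not the ordering rule of any canonicalization draft.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

final class AttrOrder {
    // Collect attribute names and sort them, so the result is the same
    // whether or not the DOM implementation preserves document order.
    // Assumption: plain qualified-name ordering, for illustration only.
    public static List<String> sortedAttrNames(Element element) {
        NamedNodeMap attrs = element.getAttributes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            names.add(attrs.item(i).getNodeName());
        }
        Collections.sort(names);
        return names;
    }

    // Hypothetical helper: parse a string and return the root element.
    public static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot("<e zeta='1' alpha='2' mid='3'/>");
        System.out.println(sortedAttrNames(root)); // [alpha, mid, zeta]
    }
}
```

Any code that instead walks item(0), item(1), ... directly inherits
the parser- and DOM-specific order described above.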

It seems to me that we keep coming back to the fact that current
implementations of XPath are written only to support XSLT and
XPointer.  So, like everything else except Canonicalization, XPath
implementations feel free to change attribute ordering arbitrarily.

Seems to me we need to either (1) yet further increase our special
rules, which really amounts to defining a new thing, DSigXPath, that
happens to use the same path expressions as XPath, or (2) try to
minimize special rules and use the XML tools as the rest of the XML
world provides them.  That is, if we have XPath, we should add the
minimum rules so that non-string output, including a node set, is
converted to a reasonable byte string.  We should not require
pre-canonicalization but should explain where it might help.  We
should not require post-canonicalization but should explain where it
might help.

>K> a) Does this mean the character encoding for XPath output other
>K> than node-set should be the same as character encoding of the
>K> input XML document?  Why?
>> <John>
>> No, (so I think we agree).  The postprocess c14n will standardize on UTF-8.
>> What I was referring to is that the result of the XPath expression, which is
>> fed into the postprocess c14n, is only guaranteed to be UTF-8 if we c14n
>> preprocess.  If we do not c14n preprocess, then the result of the XPath
>> expression may be in UTF-16.  If it is, then it will need a byte order mark
>> before we pass it to the c14n postprocess.
>
>I think the preprocess c14n has no benefit in this point.  Even
>if the input document is encoded in UTF-8, XPath implementations
>normalize the result in an on-memory representation which
>represents strings in UTF-16BE or UTF-16LE in Java.
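
The point about the in-memory representation can be seen in a short
Java sketch: the same String yields different byte sequences
depending on the charset chosen when it is serialized, regardless of
how the input document was encoded.  The class and method names
(EncodingDemo, toUtf8, toUtf16WithBom) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class EncodingDemo {
    // Serialize as UTF-8: no byte order mark is written.
    public static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Serialize with Java's "UTF-16" charset, which encodes big-endian
    // and prepends a byte order mark (0xFE 0xFF).
    public static byte[] toUtf16WithBom(String s) {
        return s.getBytes(StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println(toUtf8(text).length);         // 2
        System.out.println(toUtf16WithBom(text).length); // 4
    }
}
```

So the output encoding is decided at the point where the result is
turned into bytes, not by the encoding of the document that was read.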

If I recall correctly, XSLT has explicit provision for specifying the
output charset.  XPath does not.  As I've said before, if we have an
XPath Transform, a lot of XML people are going to yell at us that XSLT
was designed to produce a document.  XPath was designed to be a
component of XSLT and XPointer.

>K> b) XML processors do attribute value normalization.  It is done
>K> while parsing before constructing SAX events or a DOM tree.
>> <John>
>> Yes, attribute normalization is done, but the exact rules can vary for some
>> attributes based on whether the XML processor is a validating or
>> non-validating processor.  The c14n preprocess guarantees that the actions
>> associated with a validating processor are performed in all cases.
>
>I feel the guarantee in the c14n is strange.  The best way to
>guarantee attribute value normalization is to use a validating
>XML processor.  In a situation that an XML processor does not
>validate, probably a canonicalizer can not do attribute value
>normalization.

Some aspects of attribute normalization can be done without a DTD,
and probably all parsers already do them.  The remaining aspects,
which are DTD dependent, could be done for the elements we define by
a postprocessor working over the DOM tree or something similar,
assuming this postprocessor was Signature knowledgeable.
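
As a rough sketch of what such a postprocessor might apply to an
attribute it knows to be of a tokenized (non-CDATA) type: XML 1.0
attribute-value normalization for those types additionally discards
leading and trailing space (#x20) characters and collapses runs of
spaces to one.  The Java below assumes the parser has already done
the DTD-independent part; the names (AttrNormalize,
normalizeTokenized) are hypothetical, and deciding which attributes
qualify is left to the Signature-aware caller.

```java
final class AttrNormalize {
    // DTD-dependent step of attribute-value normalization for
    // non-CDATA attribute types. Only #x20 is affected; tabs or
    // newlines that survived parsing (e.g. from character references)
    // are left alone.
    public static String normalizeTokenized(String value) {
        String collapsed = value.replaceAll(" +", " "); // runs -> one space
        return collapsed.replaceAll("^ | $", "");       // trim the ends
    }

    public static void main(String[] args) {
        System.out.println("[" + normalizeTokenized("  a   b ") + "]"); // [a b]
    }
}
```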

>-- 
>TAMURA Kent @ Tokyo Research Laboratory, IBM

Donald
===================================================================
 Donald E. Eastlake 3rd                    dee3@torque.pothole.com
 65 Shindegan Hill Rd, RR#1            lde008@noah.dma.isg.mot.com
 Carmel, NY 10512 USA     +1 914-276-2668(h)    +1 508-261-5434(w)

Received on Monday, 31 January 2000 08:50:17 UTC