Re: XPath transform

> <John>
> Ok, thanks.  Now, I may be reading this wrong, but you're saying it doesn't
> change the input order.  Do you mean that the attributes are retained in
> the order they originally appeared in the document?

No.  The order returned by element.getAttributes().item(i) depends on

o the order in which a parser sets attributes on a DOM node, and
o whether a DOM implementation keeps attribute order.

For example, the com.ibm.xml.parser.Parser/TXDocument of XML4J
keeps the order of the original document.  But
com.ibm.xml.dom.DocumentImpl and the com.ibm.xml.parsers package
always sort the attributes returned by
element.getAttributes().item(i).
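
A minimal sketch for checking this on a given parser.  It uses the
standard javax.xml.parsers (JAXP) API rather than the XML4J classes
named above; the class name AttrOrderDemo and the sample document are
only illustrative.  The order it prints depends on the DOM
implementation found on the classpath, so no particular order should
be relied on.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XML4J.
final class AttrOrderDemo {

    // Returns the attribute names of the root element in the order
    // that getAttributes().item(i) yields them.
    static List<String> attributeNames(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            NamedNodeMap attrs = root.getAttributes();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                names.add(attrs.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Document order here is zeta, alpha, mid; the printed order is
        // whatever the DOM implementation chooses.
        System.out.println(attributeNames("<e zeta='1' alpha='2' mid='3'/>"));
    }
}
```

Code that needs a stable order (a canonicalizer, for instance) has to
sort the attributes itself instead of trusting item(i).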


K> a) Does this mean the character encoding for XPath output other
K> than node-set should be the same as character encoding of the
K> input XML document?  Why?
> <John>
> No, (so I think we agree).  The postprocess c14n will standardize on UTF-8.
> What I was referring to is that the result of the XPath expression, which is
> fed into the postprocess c14n, is only guaranteed to be UTF-8 if we c14n
> preprocess.  If we do not c14n preprocess, then the result of the XPath
> expression may be in UTF-16.  If it is, then it will need a byte order mark
> before we pass it to the c14n postprocess.

I think the c14n preprocess has no benefit on this point.  Even
if the input document is encoded in UTF-8, XPath implementations
hold the result in an in-memory representation, and in Java that
representation stores strings as UTF-16 (UTF-16BE or UTF-16LE).
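
A small sketch of the point, using only java.nio.charset from the
standard library (the class and method names are illustrative).  Once
the bytes are decoded into a Java String, the input encoding is gone;
the output encoding is chosen only when the string is serialized
again.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class EncodingDemo {

    // Decodes the input with its own encoding, then serializes the
    // in-memory String as UTF-8, as the postprocess c14n would require.
    static byte[] toUtf8(byte[] input, Charset inputEncoding) {
        String inMemory = new String(input, inputEncoding);
        return inMemory.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";
        byte[] fromUtf8 = toUtf8(text.getBytes(StandardCharsets.UTF_8),
                                 StandardCharsets.UTF_8);
        byte[] fromUtf16 = toUtf8(text.getBytes(StandardCharsets.UTF_16),
                                  StandardCharsets.UTF_16);
        // Compares the two UTF-8 serializations byte for byte.
        System.out.println(Arrays.equals(fromUtf8, fromUtf16));
    }
}
```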


K> b) XML processors do attribute value normalization.  It is done
K> while parsing before constructing SAX events or a DOM tree.
> <John>
> Yes, attribute normalization is done, but the exact rules can vary for some
> attributes based on whether the XML processor is a validating or
> non-validating processor.  The c14n preprocess guarantees that the actions
> associated with a validating processor are performed in all cases.

I find the guarantee in the c14n strange.  The best way to
guarantee attribute value normalization is to use a validating
XML processor.  When an XML processor does not validate, a
canonicalizer probably cannot do attribute value normalization
either.
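
A rough sketch of why the declarations matter, again with JAXP and
illustrative names.  The same attribute literal is parsed twice, once
with no DTD and once with an internal subset that declares the
attribute as NMTOKENS.  XML 1.0 lets a processor trim and collapse
spaces only for attributes it knows are not CDATA, so the result
depends on which declarations the processor has read.  The internal
subset is used here because even a non-validating processor reads it;
declarations in an external subset may not be read at all.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
final class AttrNormalizationDemo {

    static final String NO_DTD = "<e a='  x   y  '/>";
    static final String WITH_DTD =
            "<!DOCTYPE e [<!ELEMENT e EMPTY>"
            + "<!ATTLIST e a NMTOKENS #IMPLIED>]>"
            + "<e a='  x   y  '/>";

    // Returns the value of attribute "a" on the root element as the
    // (non-validating) parser reports it.
    static String attrValue(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getAttribute("a");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Brackets make leading and trailing spaces visible.
        System.out.println("[" + attrValue(NO_DTD) + "]");
        System.out.println("[" + attrValue(WITH_DTD) + "]");
    }
}
```

A canonicalizer that sees only the parsed values cannot tell which of
the two cases produced them, so it cannot redo the normalization
afterwards.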

-- 
TAMURA Kent @ Tokyo Research Laboratory, IBM

Received on Monday, 31 January 2000 01:42:14 UTC