Re: Xpath transform changes and questions from TAMURA Kent on 2000-03-22 (w3c-ietf-xmldsig@w3.org from January to March 2000)

From: TAMURA Kent <kent@trl.ibm.co.jp>
Date: Wed, 22 Mar 2000 15:11:56 +0900
To: "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
CC: "TAMURA Kent" <kent@trl.ibm.co.jp>, "Jonathan Marsh" <jmarsh@microsoft.com>, "Martin J. Duerst" <duerst@w3.org>, "Christopher R. Maden" <crism@exemplary.net>, "James Clark" <jjc@jclark.com>
Message-Id: <200003220611.PAA27814@ns.trl.ibm.com>

John,

In message "Xpath transform changes and questions"
    on 00/03/17, "John Boyer" <jboyer@PureEdge.com> writes:
> i) Serialization of the root node requires that we output the byte order
> mark and xmldecl read by parse() on input.  If parse() is not under our
> control, we cannot specify that it retains this information.  This would
> seem to suggest that root node serialization should result in the empty
> string, which in turn suggests that serialize should output in UTF-8
> regardless of the input encoding.  That would be OK with me.

I prefer serializing in UTF-8 regardless of the input encoding.
Getting the BOM and the XML declaration out of existing XML
processors is not impossible, but it is painful.

If serialization had to produce a non-UTF encoding, it would have
to check whether each character can be represented in that
encoding.  This check is hard to implement in Java.
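
To show what that per-character check involves, here is a minimal
sketch.  It assumes a Java runtime that provides
java.nio.charset.CharsetEncoder; the class name EncodabilityCheck and
the method firstUnencodable are hypothetical helpers, not part of any
XML processor.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper for illustration only; assumes java.nio.charset
// is available in the runtime.
final class EncodabilityCheck {

    // Returns the index of the first character (or surrogate pair) that
    // the named charset cannot encode, or -1 if everything is encodable.
    static int firstUnencodable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        int i = 0;
        while (i < text.length()) {
            // Step by code point so a surrogate pair is tested as a unit.
            int codePoint = text.codePointAt(i);
            int length = Character.charCount(codePoint);
            if (!encoder.canEncode(text.subSequence(i, i + length))) {
                return i;
            }
            i += length;
        }
        return -1;
    }
}
```

A serializer targeting, say, ISO-8859-1 would have to run a check of
this kind on every text node and attribute value, and then decide
what to do when it fails.  With UTF-8 output the check never fails
for well-formed character data, which is one more reason to prefer it.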

You also have to specify the output encoding for string-type
XPath results.  A string in XPath is a sequence of characters,
not a sequence of octets.
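
The following sketch illustrates why the encoding must be fixed: the
same character sequence yields different octets under different
encodings.  The class XPathStringOctets and its methods are
hypothetical names for illustration, and the choice of UTF-8 here is
the preference stated above, not something the draft already says.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class XPathStringOctets {

    // Octets for a string-type XPath result, assuming UTF-8 is the
    // encoding the specification settles on.
    static byte[] toUtf8(String xpathStringResult) {
        return xpathStringResult.getBytes(StandardCharsets.UTF_8);
    }

    // Same characters, different encoding: shown only for contrast.
    static byte[] toLatin1(String xpathStringResult) {
        return xpathStringResult.getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

For the one-character string U+00E9, toUtf8 gives the two octets
0xC3 0xA9 while toLatin1 gives the single octet 0xE9, so a digest
computed over the result differs unless the encoding is specified.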

-- 
TAMURA Kent @ Tokyo Research Laboratory, IBM

Received on Wednesday, 22 March 2000 01:12:37 UTC