[Bug 5054] Unicode character in K2-StringLT-1 from bugzilla@wiggum.w3.org on 2007-09-18 (public-qt-comments@w3.org from September 2007)

From: <bugzilla@wiggum.w3.org>
Date: Tue, 18 Sep 2007 12:57:54 +0000
To: public-qt-comments@w3.org
CC:
Message-Id: <E1IXceI-0000j0-8d@wiggum.w3.org>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5054

------- Comment #1 from mike@saxonica.com  2007-09-18 12:57 -------
I think the query was translated into XQueryX incorrectly. Looking at the file
at the octet level, the first operand is the octet sequence ee a9 a0 and the
second is f0 91 85 b0 (hexadecimal). These are the UTF-8 representations of the
characters with codepoints (decimal) 60000 and 70000 respectively. Codepoint
70000 is represented in UTF-16 as a surrogate pair, and it looks as if your
translation has taken the first 16-bit code unit of that pair (the high
surrogate) as representing the entire character.
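
The octet-level reading above can be checked with a short Python sketch. It
is only an illustration: the helper name describe is hypothetical, and the
script does not reproduce whatever tool performed the XQueryX translation.

```python
def describe(octets: bytes) -> tuple[int, list[int]]:
    """Return (codepoint, UTF-16 code units) for one UTF-8 encoded character.

    'describe' is a hypothetical helper written for this illustration only.
    """
    text = octets.decode("utf-8")
    if len(text) != 1:
        raise ValueError("expected exactly one character")
    # Re-encode as big-endian UTF-16 and split into 16-bit code units.
    utf16 = text.encode("utf-16-be")
    units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
    return ord(text), units


# The two operands as octet sequences, taken from the comment above.
first = describe(bytes.fromhex("eea9a0"))
second = describe(bytes.fromhex("f09185b0"))

print(first)   # (60000, [60000])
print(second)  # (70000, [55300, 56688])
```

Codepoint 70000 comes out as the two code units 0xD804 and 0xDD70. Keeping
only the first one gives 0xD804 (decimal 55300), which is not a character on
its own and is numerically below 60000. Whether the translation actually did
this remains the conjecture stated above.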

Received on Tuesday, 18 September 2007 12:57:59 UTC