[Bug 5054] Unicode character in K2-StringLT-1

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5054

           Summary: Unicode character in K2-StringLT-1
           Product: XML Query Test Suite
           Version: unspecified
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XML Query Test Suite
        AssignedTo: frans.englich@telia.com
        ReportedBy: andrew.eisenberg@us.ibm.com
         QAContact: public-qt-comments@w3.org


Test case K2-StringLT-1 compares two string literals, each consisting of a
single large code point.

I generate the following XQueryX for this test case:

<?xml version="1.0"?>
<xqx:module xmlns:xqx="http://www.w3.org/2005/XQueryX"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://www.w3.org/2005/XQueryX
                                http://www.w3.org/2005/XQueryX/xqueryx.xsd">
  <xqx:mainModule>
    <xqx:queryBody>
      <xqx:ltOp>
        <xqx:firstOperand>
          <xqx:stringConstantExpr>
            <xqx:value>&#60000;</xqx:value>
          </xqx:stringConstantExpr>
        </xqx:firstOperand>
        <xqx:secondOperand>
          <xqx:stringConstantExpr>
            <xqx:value>&#55300;</xqx:value>
          </xqx:stringConstantExpr>
        </xqx:secondOperand>
      </xqx:ltOp>
    </xqx:queryBody>
  </xqx:mainModule>
</xqx:module>


When I attempt to validate this XQueryX, I see this error:

   Character reference "&#55300" is an invalid XML character.


I'm weak on the details of Unicode, but I believe that character &#55300; is
&#xD804;, which lies in the high-surrogate range. I see the following in
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt:

D800;<Non Private Use High Surrogate, First>;Cs;0;L;;;;;N;;;;;
DB7F;<Non Private Use High Surrogate, Last>;Cs;0;L;;;;;N;;;;;
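
The decimal-to-hex conversion and the surrogate check can be sketched in a few
lines of Python. This is only an illustration, not part of the test suite: the
names XML10_CHAR_RANGES, is_xml10_char and is_surrogate are made up for this
example, and the ranges are transcribed from the Char production in the XML 1.0
Recommendation (section 2.2), which excludes the surrogate block D800-DFFF.

```python
# Inclusive code point ranges allowed by the XML 1.0 Char production.
# The surrogate block U+D800..U+DFFF is deliberately absent.
XML10_CHAR_RANGES = [
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]


def is_xml10_char(code_point: int) -> bool:
    """Return True if the code point matches the XML 1.0 Char production."""
    return any(low <= code_point <= high for low, high in XML10_CHAR_RANGES)


def is_surrogate(code_point: int) -> bool:
    """Return True if the code point lies in the surrogate block."""
    return 0xD800 <= code_point <= 0xDFFF


if __name__ == "__main__":
    # The two values from the test case, plus the replacement that validated.
    for cp in (60000, 55300, 0xD700):
        print(
            f"&#{cp}; = U+{cp:04X}  "
            f"xml10_char={is_xml10_char(cp)}  surrogate={is_surrogate(cp)}"
        )
```

Under these assumptions, 60000 (U+EA60) and U+D700 are allowed characters,
while 55300 (U+D804) is a surrogate and is rejected.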

Perhaps you could change &#xD804; to some other character. I've experimented a
bit, and &#xD700; validates just fine.
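
For anyone who wants to repeat the experiment without the XQueryX schema, a
rough check is possible with the parser in Python's standard library. This is a
sketch that tests well-formedness only, not schema validity, and the helper
name reference_is_well_formed is invented for this example. It relies on the
parser rejecting character references that are not legal XML characters.

```python
import xml.etree.ElementTree as ET


def reference_is_well_formed(char_ref: str) -> bool:
    """Wrap a character reference in a tiny document and try to parse it.

    Returns False when the parser rejects the document, which is what
    happens for a reference to a surrogate code point.
    """
    document = f"<value>{char_ref}</value>"
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # The two operands from the test case and the proposed replacement.
    for ref in ("&#60000;", "&#55300;", "&#xD700;"):
        print(ref, reference_is_well_formed(ref))
```

A full validation against xqueryx.xsd would need a schema-aware processor,
which this sketch does not attempt.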

Received on Monday, 17 September 2007 14:14:23 UTC