
[Bug 29962] New: [XP31] Legal XML Unicode character

From: <bugzilla@jessica.w3.org>
Date: Sat, 29 Oct 2016 03:51:02 +0000
To: public-qt-comments@w3.org
Message-ID: <bug-29962-523@http.www.w3.org/Bugs/Public/>

            Bug ID: 29962
           Summary: [XP31] Legal XML Unicode character
           Product: XPath / XQuery / XSLT
           Version: Candidate Recommendation
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XPath 3.1
          Assignee: jonathan.robie@gmail.com
          Reporter: abel.braaksma@xs4all.nl
        QA Contact: public-qt-comments@w3.org
  Target Milestone: ---

In Section A.1.2 (Extra-grammatical Constraints), in the subsection on
xml-version [1], we have the closing sentence:

"XPath expressions allow any legal XML Unicode character, subject only to
constraints imposed by the host language."

But we don't define "XML Unicode character" (the term occurs only once in
XP31), and that term does not exist in the XML specification.

I would assume it is the Char production. But it could also be all Unicode
characters except NULL and surrogate pairs (or something like that).

Note that the comment on the Char production in XML itself says "any unicode
character except ...", but this comment is incomplete (the production excludes
more than the comment lists) and is therefore ambiguous [2].
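
To make the gap between the comment and the production concrete, here is a
minimal Python sketch of the XML 1.0 Char production as I read it from [2],
next to a reading that applies only the exclusions the comment names (the
surrogate blocks, FFFE and FFFF). Both function names are my own and do not
come from any specification; the second function is a hypothetical
interpretation, not something the XML specification defines.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def passes_comment_only(cp: int) -> bool:
    """Hypothetical reading: apply only the exclusions named in the comment."""
    is_surrogate = 0xD800 <= cp <= 0xDFFF
    return 0 <= cp <= 0x10FFFF and not is_surrogate and cp not in (0xFFFE, 0xFFFF)


if __name__ == "__main__":
    # Print both verdicts for a few sample code points.
    for cp in (0x0, 0x1, 0x9, 0x41, 0xD800, 0xFFFE, 0x1F600):
        print(f"U+{cp:04X}: production={is_xml10_char(cp)} "
              f"comment={passes_comment_only(cp)}")
```

For a control character such as U+0001 the two readings disagree: the
comment-only reading accepts it, while the production rejects it.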

If XPath is used in a host language like XSLT it is naturally restricted by XML
itself, but if it is used outside such context, the limitation should be

My suggestion would be to say:

"XPath expressions allow any legal Unicode character except 0000, FFFE, FFFF
and surrogate blocks, subject only to constraints imposed by the host
language."

This would define the character range of XPath expressions to be wider than the
XML 1.0 character range, but many of these excluded characters can appear
entity-escaped in XML 1.1. And escaping is out of scope for XPath itself
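
As a rough illustration of how much wider the suggested range would be, the
following Python sketch lists the code points it admits that the XML 1.0 Char
production does not. The function names are my own, and is_suggested_char is
only my reading of the suggested wording (every code point up to 10FFFF except
0000, FFFE, FFFF and the surrogate blocks D800-DFFF).

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_suggested_char(cp: int) -> bool:
    """Assumed reading of the suggested wording, not normative text."""
    is_surrogate = 0xD800 <= cp <= 0xDFFF
    return (
        0 < cp <= 0x10FFFF          # excludes 0000
        and not is_surrogate        # excludes the surrogate blocks
        and cp not in (0xFFFE, 0xFFFF)
    )


def extra_code_points() -> list[int]:
    """Code points allowed by the suggestion but not by XML 1.0 Char."""
    return [
        cp for cp in range(0x110000)
        if is_suggested_char(cp) and not is_xml10_char(cp)
    ]


if __name__ == "__main__":
    extra = extra_code_points()
    print(len(extra), " ".join(f"U+{cp:04X}" for cp in extra))
```

Under these assumptions the difference consists only of the C0 control
characters other than tab, line feed and carriage return, and the suggested
range appears to coincide with the Char production of XML 1.1.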

[2] https://www.w3.org/TR/REC-xml/#charsets

Received on Saturday, 29 October 2016 03:51:13 UTC
