- From: <bugzilla@jessica.w3.org>
- Date: Wed, 03 Apr 2013 20:08:38 +0000
- To: public-qt-comments@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=21574

Bug ID: 21574
Summary: [XP3.0] C0 control characters in XPath expressions
Classification: Unclassified
Product: XPath / XQuery / XSLT
Version: Candidate Recommendation
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P2
Component: XPath 3.0
Assignee: jonathan.robie@gmail.com
Reporter: mike@saxonica.com
QA Contact: public-qt-comments@w3.org

J2.1 item 2 states

"Adopted the XML restriction that control characters #x1 to #x1F and 0x7F to 0x9F cannot appear in unescaped form in an XQuery. Resolves Bug 14921."

This statement is inappropriate in the XPath specification, and raises the question of what the correct statement should be. Bug 14921, which led to this XQuery change, had nothing to say about XPath.

Section 2 Basics says simply:

"The basic building block of XPath 3.0 is the expression, which is a string of [Unicode] characters".

A1.2, constraint xml-version, says:

"XML 1.0 and XML 1.1 differ in their handling of C0 control characters (specifically #x1 through #x1F, excluding #x9, #xA, and #xD) and C1 control characters (#x7F through #x9F). In XML 1.0, these C0 characters are prohibited, and the C1 characters are permitted. In XML 1.1, both sets of control characters are permitted, but only if written as character references. It is RECOMMENDED that implementations should follow the XML 1.1 rules in this respect; however, for backwards compatibility with XPath 2.0, implementations MAY allow C1 control characters to be used directly."

This recommendation doesn't make sense for XPath, because character references don't exist in XPath and therefore C0 and C1 characters cannot be written as character references. In practice XPath is often embedded in XML, and characters written as character references will therefore be ordinary Unicode characters by the time the XPath parser sees them.
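
As a rough illustration of that last point, here is a minimal Python sketch. It assumes a hypothetical host document in which an XPath expression sits in a `select` attribute of a `rule` element (both names are made up for this example) and contains the C1 character U+0085 written as a character reference. The standard-library XML 1.0 parser resolves the reference, so the string an XPath parser would receive holds the raw character. The two range checks simply restate the C0 and C1 ranges quoted above; they are not taken from any XPath implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical host document (element and attribute names are assumptions):
# an XPath expression embedded in an XML 1.0 attribute, with the C1 control
# character U+0085 written as a character reference.
HOST_DOCUMENT = '<rule select="contains(., \'&#x85;\')"/>'


def is_c0_control(ch: str) -> bool:
    """C0 range as quoted in the report: #x1-#x1F, excluding #x9, #xA, #xD."""
    return 0x1 <= ord(ch) <= 0x1F and ch not in "\t\n\r"


def is_c1_control(ch: str) -> bool:
    """C1 range as quoted in the report: #x7F-#x9F."""
    return 0x7F <= ord(ch) <= 0x9F


def embedded_xpath(document: str) -> str:
    """Return the expression text as the XML parser would hand it onward."""
    return ET.fromstring(document).attrib["select"]


xpath = embedded_xpath(HOST_DOCUMENT)

# The character reference is gone; only the raw control character remains.
print(repr(xpath))
print([hex(ord(c)) for c in xpath if is_c0_control(c) or is_c1_control(c)])
```

The XPath processor in this sketch never sees `&#x85;`, so a rule phrased in terms of "written as character references" has nothing to apply to at that level.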

Therefore I propose that XPath expressions should allow any Unicode character legal in XML, subject only to constraints imposed by the host language.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
Received on Wednesday, 3 April 2013 20:08:43 UTC