[Bug 2611] xqueryx: trivial embedding (esp CDATA sections)

http://www.w3.org/Bugs/Public/show_bug.cgi?id=2611

           Summary: xqueryx: trivial embedding (esp CDATA sections)
           Product: XPath / XQuery / XSLT
           Version: Candidate Recommendation
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XQueryX
        AssignedTo: jim.melton@acm.org
        ReportedBy: davidc@nag.co.uk
         QAContact: public-qt-comments@w3.org


Section 5 says:

  If the XQuery contains characters that are prohibited in XML text
  (specifically < and &), except when they occur within a CDATA section
  within the XQuery, they must be "escaped" as either character entity
  references (&lt; and &amp;, respectively) or numeric character references

I think that the phrase "except when they occur within a CDATA section within
the XQuery" should be deleted, and that every "<", including those within CDATA
sections (and including the "<" in "<![CDATA[" itself), should be escaped.
In addition, there is a third way of escaping besides entity references and
character references, namely to use XML CDATA sections, and in fact this
possibility is demonstrated in the last example.
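
The third possibility can be sketched as follows. This is only a minimal
illustration in Python, not anything taken from the spec: the helper name
wrap_in_cdata and the placeholder element "q" are made up for the example.
The one subtlety is that a literal "]]>" in the query text (which any XQuery
containing a CDATA section will have) must be split across two XML CDATA
sections.

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(xquery_text: str) -> str:
    """Embed plain text in a run of XML CDATA sections (hypothetical helper)."""
    # A literal "]]>" would end the section early, so close the section
    # after "]]" and reopen a new one before ">".
    return "<![CDATA[" + xquery_text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


query = "<x><![CDATA[<]]></x>"
embedded = wrap_in_cdata(query)

# An XML parser hands back the original query text unchanged.
# The element name "q" is a placeholder used for this check only.
recovered = ET.fromstring("<q>" + embedded + "</q>").text
assert recovered == query
```

No parsing of the XQuery is involved; the text is only searched for "]]>".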

Section 5 goes on to say:

  CDATA sections within an XQuery expression are embedded in the same form in
  which they appear in any XML document.

I am not at all sure what this is intended to mean. Perhaps it is intended to
mean that XQuery CDATA sections are encoded as XML CDATA sections. If so, I
think that is completely wrong, and it means that this is a not-so-trivial
embedding. The trivial embedding should take the XQuery text as plain text and
embed it into XML using the standard plain-text-to-XML constructs, without
having to parse the XQuery expression. (The plain text XML serialiser has to
scan for <, > and & but does not have to parse the expression.)
The XQuery <x><![CDATA[<]]></x> should be encoded as
<xqx:xquery>&lt;x&gt;&lt;![CDATA[&lt;]]&gt;&lt;/x&gt;</xqx:xquery>
not
<xqx:xquery>&lt;x&gt;<![CDATA[<]]>&lt;/x&gt;</xqx:xquery>
as this latter embedding is an embedding of the XQuery
<x>&lt;</x>
which has the same run-time behaviour as the first expression, but is a
different expression with a different parse tree.
It's important not to lose the fact that the CDATA section was in the XQuery:
although this example has the same behaviour if the section is replaced, in
other cases the behaviour may differ, due to white space stripping (which is
suppressed by CDATA sections).
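
A minimal sketch of the trivial embedding I have in mind, in Python. The
function name trivially_embed is made up for illustration, and the xqx
namespace declaration is omitted for brevity. The query is treated purely as
a string: the three characters are escaped and nothing is parsed, so an XQuery
CDATA section is escaped like any other text.

```python
def trivially_embed(xquery_text: str) -> str:
    """Embed XQuery source as plain text in xqx:xquery (illustrative only)."""
    escaped = (
        xquery_text.replace("&", "&amp;")  # must come first, before new "&"s appear
        .replace("<", "&lt;")
        .replace(">", "&gt;")  # not required by XML here, but recommended
    )
    # The xmlns:xqx declaration is left out of this sketch.
    return "<xqx:xquery>" + escaped + "</xqx:xquery>"


print(trivially_embed("<x><![CDATA[<]]></x>"))
# <xqx:xquery>&lt;x&gt;&lt;![CDATA[&lt;]]&gt;&lt;/x&gt;</xqx:xquery>
```

The output is the first of the two encodings above, so the CDATA section of the
original XQuery survives a round trip through an XML parser.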

Finally, the section says:

  it is recommended that > always be "escaped" (for example, as &gt; or &#3E;).

There is a missing "x" in the hex character reference at the end of that
sentence: &#3E; should be &#x3E;.

Received on Tuesday, 20 December 2005 01:17:22 UTC