[Bug 1307] New: [XQuery] Line Endings

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1307

           Summary: [XQuery] Line Endings
           Product: XPath / XQuery / XSLT
           Version: Last Call drafts
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XQuery
        AssignedTo: chamberl@almaden.ibm.com
        ReportedBy: mike@saxonica.com
         QAContact: public-qt-comments@w3.org


There are two places in XQuery where line endings are normalized: (a) in the
content of direct element constructors, and (b) in string literals.

Within attribute content, the spec refers to the rules for attribute value
normalization in section 3.3.3 of the XML specification. These start with the
rule "All line breaks must have
been normalized on input to #xA as described in 2.11 End-of-Line Handling, so
the rest of this algorithm operates on text normalized in this way." This can be
read as implying that XQuery also normalizes line endings in attribute content
(which means that CRLF in attribute content is turned into a single space, not
into two spaces).

(Aside: The XML rules for attribute value normalization depend on the type of
the attribute. I think it would be helpful if we specify that the algorithm is
executed on the basis that the attribute type is CDATA. If it isn't, then schema
validation will take care of it.)
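
To make the "single space, not two spaces" point concrete, here is a rough
Python sketch of the CDATA-type case. It is a simplification, not the full XML
3.3.3 algorithm: it ignores character and entity references, and the function
names are hypothetical, chosen only for this illustration.

```python
import re

def normalize_eol(text: str) -> str:
    # XML 1.0 section 2.11: CRLF and any lone CR become a single LF.
    return re.sub("\r\n|\r", "\n", text)

def normalize_attribute_whitespace(text: str) -> str:
    # CDATA-type attribute: each of #x20, #xD, #xA, #x9 becomes one #x20.
    # (References are deliberately not handled in this sketch.)
    return re.sub("[ \r\n\t]", " ", text)

raw = "a\r\nb"

# Line endings normalized first: CRLF -> LF -> one space.
with_eol = normalize_attribute_whitespace(normalize_eol(raw))

# Line endings not normalized: CR and LF each become a space.
without_eol = normalize_attribute_whitespace(raw)
```

Under these assumptions, with_eol contains one space between "a" and "b",
while without_eol contains two.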

There are contexts in XQuery where line endings are *not* normalized. For
example, they are not normalized in a CDATA section, a direct comment
constructor, or a direct processing instruction constructor. Since the syntax of
each of these constructs is explicitly designed to mimic XML, it seems odd that
the handling of line endings here should differ from XML.

Further, XQuery doesn't normalize line endings appearing in the middle of an
ordinary expression (for example, between "declare" and "function"). This
doesn't matter for CR and LF, which are ordinary whitespace between tokens; it
matters only when XML 1.1-style line endings are used. Currently, if an XQuery
processor decides to follow the XML 1.1 profile, then it allows and normalizes a
NEL character appearing in element content or in a string literal; allows and
doesn't normalize it in a CDATA section, comment, or PI; and disallows it in the
middle of an expression.

I can't see any good reason why XQuery doesn't do the same as XML, and specify
that all line endings in the query are normalized according to the XML 1.0 or
1.1 rules irrespective of the syntactic context, before any lexical or syntactic
analysis of the query starts.
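
As a minimal sketch of what such a pre-pass might look like (an illustration
only, not text proposed for the spec), the following Python function
normalizes every line ending in the query source before tokenization. The
function name and the xml_version parameter are hypothetical; the character
sequences are those listed in section 2.11 of XML 1.0 and XML 1.1.

```python
import re

# XML 1.0 section 2.11: CRLF and any lone CR are translated to LF.
_XML10_EOL = re.compile("\r\n|\r")

# XML 1.1 section 2.11 additionally translates CR NEL, NEL (U+0085)
# and LINE SEPARATOR (U+2028). Two-character sequences come first so
# that they are replaced by a single LF.
_XML11_EOL = re.compile("\r\n|\r\u0085|\u0085|\u2028|\r")

def normalize_line_endings(query: str, xml_version: str = "1.0") -> str:
    """Normalize line endings in the whole query text, regardless of
    syntactic context, before any lexical analysis takes place."""
    pattern = _XML11_EOL if xml_version == "1.1" else _XML10_EOL
    return pattern.sub("\n", query)

# A NEL between two keywords becomes an ordinary newline under 1.1 rules.
normalized = normalize_line_endings("declare\u0085function", "1.1")
```

Because the pass runs before the lexer, a CDATA section, comment, PI, string
literal, and the gap between "declare" and "function" would all see the same
normalized text.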

Michael Kay

Received on Thursday, 5 May 2005 09:37:10 UTC