[XQuery] I18N last call comments from Martin Duerst on 2004-02-16 (public-qt-comments@w3.org from February 2004)

From: Martin Duerst <duerst@w3.org>
Date: Sun, 15 Feb 2004 19:16:33 -0500
To: public-qt-comments@w3.org
Message-Id: <4.2.0.58.J.20040215172016.05748990@localhost>
Dear XML Query WG,

Below please find the I18N WG's comments on your last call document
"XQuery 1.0: An XML Query Language"
(http://www.w3.org/TR/2003/WD-xquery-20031112/).

Please note the following:
- These comments have not yet been approved by the I18N WG. Please
   treat them as personal comments (unless and) until they are approved
   by the I18N WG.
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- The comments are numbered in square brackets [nn].

[1] As shown in the examples in 2.6.6 and 2.6.7, XQuery code can
     exist in documents/files of its own. It is therefore crucial
     that the XQuery version declaration (4.1) takes an additional
     parameter, 'encoding', in the same way as an XML declaration
     has an 'encoding' pseudo-attribute.
     For its definition, most details can be taken from XML. However,
     it would be great if the whitespace in the version declaration
     were limited to exactly one space each, because the encoding
     parameter has to be looked at before a full parser is available.
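
     A minimal sketch, in Python, of the kind of pre-parser sniffing
     meant here. The declaration syntax (xquery version "..." encoding
     "...";) and the function name sniff_encoding are assumptions made
     for illustration only, and the approach only works for encodings
     that are ASCII-compatible in the declaration itself.

```python
import re

# Hypothetical declaration syntax: xquery version "1.0" encoding "utf-8";
# Exactly one space is allowed between tokens, so a fixed byte pattern suffices.
_DECL = re.compile(
    rb'^xquery version "[^"]*" encoding "([A-Za-z][A-Za-z0-9._-]*)";'
)


def sniff_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the declared encoding name, or `default` if none is declared."""
    match = _DECL.match(data)
    if match is None:
        return default
    return match.group(1).decode("ascii")


# The query body contains byte 0xE9, which is "é" in ISO-8859-1.
query = b'xquery version "1.0" encoding "iso-8859-1";\n<p>caf\xe9</p>'
encoding = sniff_encoding(query)
text = query.decode(encoding)
```

     With variable whitespace, the byte pattern would have to grow into
     a small tokenizer, which is what the one-space restriction avoids.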

[2] In connection with [1], XQuery also needs a MIME type, most probably
     application/xquery.

[3] 3.1.1 Literals, entity/character references: After careful examination,
     this works out. But it would be good to have a section explaining
     how character escaping works in XQuery overall, including differences
     and similarities to XML and XPath.

[4] The special conventions for escaping quotes (production [17]),
     apostrophes ([25]), and curly braces (should probably also be
     a production of its own) may not be necessary. Character
     references should be used instead; for convenience, named
     character references for { and } could be defined.

[5] 3.7.1 and other places: how can a base URI be preserved? How
     can it be set in the output? Both have to be possible, otherwise,
     XML Base is not really useful. Also, there should be a way to
     take an IRI and make it absolute, using the relevant XML Base.
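
     To illustrate the last point, here is a rough Python sketch of
     making a reference absolute against the xml:base values in scope.
     The helper name resolve_all and the sample document are invented
     for illustration; this is not a proposal for XQuery syntax.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def resolve_all(elem, attr, base=""):
    """Yield (element, absolute reference) for each element carrying `attr`."""
    # An xml:base value may itself be relative to the inherited base.
    base = urljoin(base, elem.get(XML_BASE, ""))
    if attr in elem.attrib:
        yield elem, urljoin(base, elem.get(attr))
    for child in elem:
        yield from resolve_all(child, attr, base)


doc = ET.fromstring(
    '<doc xml:base="http://example.org/docs/">'
    '<part xml:base="ch1/"><link href="fig.png"/></part>'
    '<link href="../top.html"/>'
    '</doc>'
)
resolved = [ref for _, ref in resolve_all(doc, "href")]
```

     The base has to be carried down the tree explicitly, which is
     exactly the information that is lost if a query cannot preserve it.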

[6] How can xml:lang be extracted from data and preserved with a query?
     How can this be done without littering all elements with unnecessary
     xml:lang attributes? Other inherited attributes will have the same
     problem; some better support for inherited attributes seems necessary.

[7] 3.7.1.1 Attributes: processing is described differently from what
     happens in the case of XML. Whitespace normalization is done before
     resolution of character references in XML; character references
     can be used to protect various whitespace characters from attribute
     normalization. This should be aligned with XML. (Also, it should
     be mentioned how this normalization is (or is not) dependent on
     the type of the attribute.)
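
     The XML behaviour referred to here can be observed with any
     conforming parser. A small Python sketch (the element and
     attribute names are made up): a literal line feed in an attribute
     value is normalized to a space, while one written as a character
     reference survives.

```python
import xml.etree.ElementTree as ET

# The first attribute contains a literal line feed in the source text;
# the second writes the same character as the reference &#10;.
elem = ET.fromstring('<e literal="a\nb" escaped="a&#10;b"/>')

literal = elem.get("literal")  # normalized by the XML parser
escaped = elem.get("escaped")  # protected by the character reference
```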

[8] 3.7.1.3 Content (and other places): serializing atomic values
     by inserting spaces may not be appropriate for Chinese, Japanese,
     Thai, etc., i.e. languages that don't use spaces between words.
     This has to be checked very carefully.
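
     A deliberately simplified Python model of the concern. The
     function join_atomic_values is hypothetical and only mimics
     "put a space between adjacent atomic values"; it is not the
     actual rule text from 3.7.1.3.

```python
def join_atomic_values(values):
    """Concatenate atomic values with a single space between each pair."""
    return " ".join(str(v) for v in values)


# Harmless for space-delimited scripts ...
english = join_atomic_values(["deep", "sea"])
# ... but it introduces a space that does not belong in Japanese text.
japanese = join_atomic_values(["東京", "都"])
```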

[9] There should be more non-US examples. For example, it is very
     difficult for somebody not from the US to understand why there
     are no Deep Sea Fishermen in Nebraska.

[10] 3.7.2: Not requiring CDATA constructs to be serialized as CDATA
      sections is a good idea, because it helps dispel the idea
      that CDATA sections are semantically significant.

[11] 3.7.3.1, example using 'lang' attribute: Please replace this
      attribute with xml:lang, and its values with 'de' and 'it'.

[12] 3.7.3.4: Why is there a need for a 'text' node constructor?
      What's the difference between this and a string? (There should
      be none, or as few as possible.)

[13] For collations, namespaces, schemas, and so on, a "StringLiteral"
      rule is used, and the definitions say 'URI'. This has to be changed
      to IRI, and preferably a separate non-terminal should be used
      to make this clear. There should also be a clear indication
      how XML Base affects these.

[14] 3.8.3, last example: Instead of 'collation "eng-us"', something
      that looks more like a URI should be used.

[15] 3.12.2 Typeswitch: There should be an example that shows how to
      deal with strings, complex types without any actual markup contained,
      and complex types with markup (e.g. <ruby> or similar).

[16] Section 4: As shown in the examples in 2.6.6 and 2.6.7, XQuery can
      directly produce XML output. For such cases, it is very important
      to make sure that the relevant parameters for serialization
      (in particular encoding, but also normalization) can be defined
      in an XQuery prolog. There should also be clear requirements on
      minimal support for encodings (e.g. UTF-8 and UTF-16) to guarantee
      interoperability.

[17] Note at the end of 4.6: re. DTD treatment, this should very clearly
      say what happens (or doesn't happen) with entities, or point to the
      place where this is defined (data model).

[18] It would be very good if it were possible to declare default
      collations for part of an XQuery.

[19] There should be a way to character-normalize nodes (not only strings).
      This could easily be achieved by overloading fn:normalize-unicode.
      This will help in cases where otherwise fn:normalize-unicode would
      have to be used all over the place.
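
      As a rough illustration of node-level normalization, sketched in
      Python rather than XQuery: the helper normalize_tree is
      hypothetical and normalizes text and attribute values (not
      element or attribute names) for a whole subtree in one call.

```python
import unicodedata
import xml.etree.ElementTree as ET


def normalize_tree(elem, form="NFC"):
    """Normalize text, tails and attribute values of a subtree in place."""
    for node in elem.iter():
        if node.text:
            node.text = unicodedata.normalize(form, node.text)
        if node.tail:
            node.tail = unicodedata.normalize(form, node.tail)
        for name, value in list(node.attrib.items()):
            node.set(name, unicodedata.normalize(form, value))
    return elem


# Input uses base letters followed by combining marks (not NFC).
doc = ET.fromstring(
    '<p title="cafe\u0301">re\u0301sume\u0301 <b>A\u030a</b></p>'
)
normalize_tree(doc)
```

      One call on the root replaces a separate string-level call for
      every text node and attribute value.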

[20] The XML version for output seems to be fixed to 1.0. There needs
      to be a way to output XML 1.1.


Regards,   Martin.
Received on Sunday, 15 February 2004 19:16:47 UTC