RE: [XQuery] IBM-XQ-015: validate mode: skip preserve

I think the issue I raised previously [Serialization (sometimes)
needs to include type information] is strongly related to this one. While
preserving original type information during element construction is clearly
very important in some cases, the logical follow-up is to preserve type
information during serialization as well. In many cases, this can be done by
defining the appropriate schema for the result tree, but in the cases I
pointed out, it can't be done that way.

Best regards,

Antoine Mensch
  -----Original Message-----
  From: public-qt-comments-request@w3.org
[mailto:public-qt-comments-request@w3.org] On Behalf Of Michael Kay
  Sent: Thursday, 12 February 2004 01:58
  To: public-qt-comments@w3.org
  Subject: RE: [XQuery] IBM-XQ-015: validate mode: skip preserve


  For information, XSLT has these four modes (it calls them strict, lax,
strip, and preserve) for much the same reasons as those outlined.

  Michael Kay
    -----Original Message-----
    From: public-qt-comments-request@w3.org
[mailto:public-qt-comments-request@w3.org] On Behalf Of Don Chamberlin
    Sent: 11 February 2004 23:52
    To: public-qt-comments@w3.org
    Subject: [XQuery] IBM-XQ-015: validate mode: skip preserve



    (IBM-XQ-015) XQuery currently defines three validation modes: strict,
lax, and skip, based on the three validation modes of XML Schema. In skip
mode, no validation is applied to a newly-constructed element. Instead, the
new element node (and each of its descendant elements) is given the
annotation xdt:untyped, and its attributes (and the attributes of its
descendants) are given the annotation xdt:untypedAtomic. If the content of
the new element is copied from existing nodes, the types of these existing
nodes are lost.

    An XQuery implementation that does not support Schema Import will
probably run in skip-validation mode, since validation is meaningful only if
a schema is present. Nevertheless, such an implementation may wish to
preserve the type annotations on nodes in input documents, since these type
annotations may affect the processing of a query (for example, 17 > 5 is
true for the xs:decimal type but not for the xdt:untypedAtomic type).
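
    A rough analogy for the comparison above, sketched in Python rather than
XQuery. It assumes that two untyped values are compared as strings in
codepoint order, which Python's string comparison stands in for here; the
function names are hypothetical and chosen only for illustration.

```python
from decimal import Decimal


def typed_greater(a: str, b: str) -> bool:
    """Compare two lexical values as decimals (stand-in for xs:decimal)."""
    return Decimal(a) > Decimal(b)


def untyped_greater(a: str, b: str) -> bool:
    """Compare two lexical values as plain strings (stand-in for untyped data)."""
    return a > b


print(typed_greater("17", "5"))    # True: 17 is numerically greater than 5
print(untyped_greater("17", "5"))  # False: "1" sorts before "5"
```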

    The loss of type information during skip validation causes a serious
problem for applications that need to "wrap" an element in a higher-level
"envelope" element. The wrapping is done by referencing the "content"
element inside a constructor for the "envelope" element, causing the content
element to be copied and validated. It is quite possible that the "content"
element may not be defined in the in-scope element declarations. This may
happen if the current application is a generic message-routing application
that does not find it practical to import the schemas for all possible
contents. It will also happen in systems that do not implement the Schema
Import feature. In these cases, skip-validation causes the loss of the type
information on the "content" element.

    Here are some examples of this problem (assuming skip validation in each
case):

    (a) Copy a set of "customer" elements into newly-constructed
"rated-customer" elements, pairing each customer with a rating. Now order
all the rated-customers by balance-due. Balance-due was originally decimal,
but now its type has been lost and 5 is sorted as greater than 17.

    (b) Write an application to extract data from an XML document and wrap
it in <row> and <col> tags for interfacing to a relational database. By
wrapping the data in <row> and <col> tags, its original types are destroyed
and all the data appears to be text. Again, data that was originally decimal
will be sorted incorrectly.

    (c) If a query-rewrite pushes a predicate inside a constructor, the
effect of the predicate is changed because the expression inside the
constructor is typed but outside the constructor it is not. This limits the
ability of the system to do query optimization and to merge queries with
view definitions.
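
    The sorting failure in examples (a) and (b) can be imitated outside
XQuery. The following Python sketch uses made-up balance-due values and
assumes that values whose type has been lost are ordered as strings; it is
an analogy, not a model of any particular XQuery processor.

```python
from decimal import Decimal

# Hypothetical balance-due values, held as lexical strings.
balances = ["17", "5", "100"]

# Type lost: values are ordered as strings, character by character.
as_untyped = sorted(balances)

# Type kept: values are ordered numerically.
as_decimal = sorted(balances, key=Decimal)

print(as_untyped)  # ['100', '17', '5']
print(as_decimal)  # ['5', '17', '100']
```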

    The solution to these problems is to introduce a new validation mode
called "skip preserve", or simply "preserve". In this mode, no validation is
attempted, and the type annotation of the subject element remains unchanged
rather than being set to xdt:untyped. Adding this validation mode would not
affect the definitions of the existing three modes.

    The following changes would be made to the XQuery specification by this
proposal:

    (a) In Section 2.1.1, Static Context: In the definition of Validation
Mode, add "preserve" or "skip preserve" to the list of modes.

    (b) In the grammar production for ValidationMode, add a keyword for
the new option.

    (c) In Section 3.7.1.3, Direct Element Constructors--Content: Rule (1d)
should be changed as follows: "If the validation mode is "preserve", copied
element and attribute nodes retain their original type annotations;
otherwise, copied element nodes are given the type annotation xdt:untyped,
and copied attribute nodes are given the type annotation xdt:untypedAtomic."

    (d) In Section 3.7.1.5, Direct Element Constructors--Type of a
Constructed Element: Add the following initial sentence:
    "A direct element constructor assigns the initial type annotation
xdt:untyped to the newly constructed element node. It then validates the new
node, using the schema validation process defined in XML Schema."
    Also in Section 3.7.1.5, change the first bullet as follows: "If
validation mode = skip or preserve, no validation is attempted. The
constructed element retains its type annotation of xdt:untyped, and its
attributes and descendants retain the type annotations assigned to them
during construction."

    (e) In Section 3.14, Validate Expressions: Add the following bullet to
the three bullets that define strict, lax, and skip validation:
    "preserve indicates that no validation is to be attempted, but that
element nodes and attribute nodes are to retain their original type
annotations."

    (f) In Section 4.6, Validation Declaration: Add "preserve" to the list
of validation modes.
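
    To make the proposed copy rule in (c) and the typing rule in (d)
concrete, here is a toy model in Python. The Elem and Attr classes and the
copy_content and wrap functions are hypothetical names invented for this
sketch; only the skip and preserve modes are modelled, since strict and lax
would require an actual schema validator.

```python
from dataclasses import dataclass, field
from typing import List

UNTYPED = "xdt:untyped"
UNTYPED_ATOMIC = "xdt:untypedAtomic"


@dataclass
class Attr:
    name: str
    value: str
    type_annotation: str = UNTYPED_ATOMIC


@dataclass
class Elem:
    name: str
    type_annotation: str = UNTYPED
    attrs: List[Attr] = field(default_factory=list)
    children: List["Elem"] = field(default_factory=list)


def copy_content(node: Elem, mode: str) -> Elem:
    """Copy an element into a constructor, following proposed rule (1d)."""
    if mode not in ("skip", "preserve"):
        raise ValueError("only skip and preserve are modelled in this sketch")
    keep = mode == "preserve"
    return Elem(
        name=node.name,
        type_annotation=node.type_annotation if keep else UNTYPED,
        attrs=[
            Attr(a.name, a.value, a.type_annotation if keep else UNTYPED_ATOMIC)
            for a in node.attrs
        ],
        children=[copy_content(c, mode) for c in node.children],
    )


def wrap(envelope_name: str, content: Elem, mode: str) -> Elem:
    """Build an envelope element; with no validation it stays xdt:untyped."""
    return Elem(envelope_name, UNTYPED, children=[copy_content(content, mode)])


balance = Elem("balance-due", "xs:decimal",
               attrs=[Attr("currency", "USD", "xs:string")])

skipped = wrap("rated-customer", balance, "skip")
preserved = wrap("rated-customer", balance, "preserve")

print(skipped.children[0].type_annotation)    # xdt:untyped
print(preserved.children[0].type_annotation)  # xs:decimal
```

    In both modes the envelope itself is annotated xdt:untyped; the modes
differ only in what happens to the annotations on the copied content.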

    Note that these changes will align XQuery with XSLT 2.0, which has
already introduced the concept validation="preserve" as documented in
http://www.w3.org/TR/xslt20/#validating-constructed-nodes. The XSLT 2.0
definition of validation="preserve" is consistent with the definition above,
and these definitions should be kept consistent.

    --Don Chamberlin

Received on Thursday, 12 February 2004 03:14:22 UTC