Re: [XQuery] IBM-XQ-015: validate mode: skip preserve

I agree that this is useful and important.

Jonathan

Michael Kay wrote:

> For information, XSLT has these four modes (it calls them strict, lax, 
> strip, and preserve) for much the reasons outlined.
>  
> Michael Kay
>
>     -----Original Message-----
>     *From:* public-qt-comments-request@w3.org
>     [mailto:public-qt-comments-request@w3.org] *On Behalf Of *Don
>     Chamberlin
>     *Sent:* 11 February 2004 23:52
>     *To:* public-qt-comments@w3.org
>     *Subject:* [XQuery] IBM-XQ-015: validate mode: skip preserve
>
>
>     (IBM-XQ-015) XQuery currently defines three validation modes:
>     strict, lax, and skip, based on the three validation modes of XML
>     Schema. In skip mode, no validation is applied to a
>     newly-constructed element. Instead, the new element node (and each
>     of its descendant elements) is given the annotation xdt:untyped,
>     and its attributes (and the attributes of its descendants) are
>     given the annotation xdt:untypedAtomic. If the content of the new
>     element is copied from existing nodes, the types of these existing
>     nodes are lost.
>
>     An XQuery implementation that does not support Schema Import will
>     probably run in skip-validation mode, since validation is
>     meaningful only if a schema is present. Nevertheless, such an
>     implementation may wish to preserve the type annotations on nodes
>     in input documents, since these type annotations may affect the
>     processing of a query (for example, 17 > 5 is true for the
>     xs:decimal type but not for the xdt:untypedAtomic type).
>
>     The loss of type information during skip validation causes a
>     serious problem for applications that need to "wrap" an element in
>     a higher-level "envelope" element. The wrapping is done by
>     referencing the "content" element inside a constructor for the
>     "envelope" element, causing the content element to be copied and
>     validated. It is quite possible that the "content" element may not
>     be defined in the in-scope element declarations. This may happen
>     if the current application is a generic message-routing
>     application that does not find it practical to import the schemas
>     for all possible contents. It will also happen in systems that do
>     not implement the Schema Import feature. In these cases,
>     skip-validation causes the loss of the type information on the
>     "content" element.
>
>     Here are some examples of this problem (assuming skip validation
>     in each case):
>
>     (a) Copy a set of "customer" elements into newly-constructed
>     "rated-customer" elements, pairing each customer with a rating.
>     Now order all the rated-customers by balance-due. Balance-due was
>     originally decimal, but now its type has been lost and 5 is sorted
>     as greater than 17.
>
>     (b) Write an application to extract data from an XML document and
>     wrap it in <row> and <col> tags for interfacing to a relational
>     database. By wrapping the data in <row> and <col> tags, its
>     original types are destroyed and all the data appears to be text.
>     Again, data that was originally decimal will be sorted incorrectly.
>
>     (c) If a query-rewrite pushes a predicate inside a constructor,
>     the effect of the predicate is changed because the expression
>     inside the constructor is typed but outside the constructor it is
>     not. This limits the ability of the system to do query
>     optimization and to merge queries with view definitions.
>
>     The solution to these problems is to introduce a new validation
>     mode called "skip preserve", or simply "preserve". In this mode,
>     no validation is attempted, and the type annotation of the subject
>     element remains unchanged rather than being set to xdt:untyped.
>     Adding this validation mode would not affect the definitions of
>     the existing three modes.
>
>     The following changes would be made to the XQuery specification by
>     this proposal:
>
>     (a) In Section 2.1.1, Static Context: In the definition of Validation
>     Mode, add "preserve" or "skip preserve" to the list of modes.
>
>     (b) In the grammar production for ValidationMode, add a
>     keyword for the new option.
>
>     (c) In Section 3.7.1.3, Direct Element Constructors--Content: Rule
>     (1d) should be changed as follows: "If the validation mode is
>     "preserve", copied element and attribute nodes retain their
>     original type annotations; otherwise, copied element nodes are
>     given the type annotation xdt:untyped, and copied attribute nodes
>     are given the type annotation xdt:untypedAtomic."
>
>     (d) In Section 3.7.1.5, Direct Element Constructors--Type of a
>     Constructed Element: Add the following initial sentence:
>     "A direct element constructor assigns the initial type annotation
>     xdt:untyped to the newly constructed element node. It then
>     validates the new node, using the schema validation process
>     defined in XML Schema."
>     Also in Section 3.7.1.5, change the first bullet as follows: "If
>     validation mode = skip or preserve, no validation is attempted.
>     The constructed element retains its type annotation of
>     xdt:untyped, and its attributes and descendants retain the type
>     annotations assigned to them during construction."
>
>     (e) In Section 3.14, Validate Expressions: Add the following
>     bullet to the three bullets that define strict, lax, and skip
>     validation:
>     "preserve indicates that no validation is to be attempted, but
>     that element nodes and attribute nodes are to retain their
>     original type annotations."
>
>     (f) In Section 4.6, Validation Declaration: Add "preserve" to the
>     list of validation modes.
>
>     Note that these changes will align XQuery with XSLT 2.0, which has
>     already introduced the concept validation="preserve" as documented
>     in http://www.w3.org/TR/xslt20/#validating-constructed-nodes. The
>     XSLT 2.0 definition of validation="preserve" is consistent with
>     the definition above, and these definitions should be kept
>     consistent.
>
>     --Don Chamberlin
>
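
The sorting problem in the quoted examples (17 > 5 holding for xs:decimal but not for xdt:untypedAtomic, and 5 sorting after 17 in example (a)) can be illustrated outside XQuery. The following is only a rough Python analogy, not XQuery semantics: plain strings stand in for values whose type annotation has been lost, decimal.Decimal stands in for xs:decimal, and the balance values are made up for the illustration.

```python
from decimal import Decimal

# Hypothetical balance-due values (invented for this sketch).
raw_balances = ["17", "5", "100"]

# Analogy for lost type annotations: values are compared as strings,
# character by character, so "5" sorts after "17".
untyped_sorted = sorted(raw_balances)

# Analogy for preserved xs:decimal annotations: values are compared
# numerically.
typed_sorted = sorted(Decimal(b) for b in raw_balances)

# The single comparison from the proposal, in both forms.
string_compare = "17" > "5"
decimal_compare = Decimal("17") > Decimal("5")

print(untyped_sorted)
print([str(d) for d in typed_sorted])
print(string_compare, decimal_compare)
```

In this sketch the string ordering places 5 last, while the decimal ordering places it first, which is the difference a "preserve" mode is meant to avoid when elements are wrapped in a new constructor.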

Received on Thursday, 12 February 2004 12:27:33 UTC