RE: [XQuery] IBM-XQ-015: validate mode: skip preserve

For information, XSLT has these four modes (it calls them strict, lax,
strip, and preserve) for much the same reasons as those outlined.
 
Michael Kay

-----Original Message-----
From: public-qt-comments-request@w3.org
[mailto:public-qt-comments-request@w3.org] On Behalf Of Don Chamberlin
Sent: 11 February 2004 23:52
To: public-qt-comments@w3.org
Subject: [XQuery] IBM-XQ-015: validate mode: skip preserve



(IBM-XQ-015) XQuery currently defines three validation modes: strict,
lax, and skip, based on the three validation modes of XML Schema. In
skip mode, no validation is applied to a newly-constructed element.
Instead, the new element node (and each of its descendant elements) is
given the annotation xdt:untyped, and its attributes (and the attributes
of its descendants) are given the annotation xdt:untypedAtomic. If the
content of the new element is copied from existing nodes, the types of
these existing nodes are lost. 

An XQuery implementation that does not support Schema Import will
probably run in skip-validation mode, since validation is meaningful
only if a schema is present. Nevertheless, such an implementation may
wish to preserve the type annotations on nodes in input documents, since
these type annotations may affect the processing of a query (for
example, 17 > 5 is true for the xs:decimal type but not for the
xdt:untypedAtomic type). 
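
The 17 > 5 example can be reproduced outside XQuery. The following is a
minimal Python sketch, not XQuery itself: it assumes that a typed value
behaves like a decimal number and that an untyped value is compared as a
plain character string, which is the case the example above describes.

```python
from decimal import Decimal

# Typed values: compared numerically, as xs:decimal values would be.
typed_result = Decimal("17") > Decimal("5")

# Untyped values: modeled here as plain strings, compared character by character.
untyped_result = "17" > "5"

print(typed_result)    # True
print(untyped_result)  # False, because "1" sorts before "5"
```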

The loss of type information during skip validation causes a serious
problem for applications that need to "wrap" an element in a
higher-level "envelope" element. The wrapping is done by referencing the
"content" element inside a constructor for the "envelope" element,
causing the content element to be copied and validated. It is quite
possible that the "content" element may not be defined in the in-scope
element declarations. This may happen if the current application is a
generic message-routing application that does not find it practical to
import the schemas for all possible contents. It will also happen in
systems that do not implement the Schema Import feature. In these cases,
skip-validation causes the loss of the type information on the "content"
element. 

Here are some examples of this problem (assuming skip validation in each
case): 

(a) Copy a set of "customer" elements into newly-constructed
"rated-customer" elements, pairing each customer with a rating. Now
order all the rated-customers by balance-due. Balance-due was originally
decimal, but now its type has been lost and 5 is sorted as greater than
17. 

(b) Write an application to extract data from an XML document and wrap
it in <row> and <col> tags for interfacing to a relational database. By
wrapping the data in <row> and <col> tags, its original types are
destroyed and all the data appears to be text. Again, data that was
originally decimal will be sorted incorrectly. 

(c) If a query-rewrite pushes a predicate inside a constructor, the
effect of the predicate is changed because the expression inside the
constructor is typed but outside the constructor it is not. This limits
the ability of the system to do query optimization and to merge queries
with view definitions. 
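
The sorting problem in examples (a) and (b) can be illustrated with a
short Python sketch. The balance values are invented for illustration,
and string ordering stands in for the ordering of untyped data; this is
an analogy, not a description of any particular XQuery implementation.

```python
from decimal import Decimal

# Hypothetical balance-due values as they might appear in an input document.
balances = ["17", "5", "120.50"]

# With type annotations preserved, the values order numerically.
typed_order = sorted(balances, key=Decimal)

# With annotations lost, the values order as text.
untyped_order = sorted(balances)

print(typed_order)    # ['5', '17', '120.50']
print(untyped_order)  # ['120.50', '17', '5']
```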

The solution to these problems is to introduce a new validation mode
called "skip preserve", or simply "preserve". In this mode, no
validation is attempted, and the type annotation of the subject element
remains unchanged rather than being set to xdt:untyped. Adding this
validation mode would not affect the definitions of the existing three
modes. 
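
The difference between "skip" and "preserve" can be sketched with a toy
node model in Python. Everything here is an assumption made for
illustration: the Element class, the copy_element and wrap helpers, and
the type name "customerType" are hypothetical, attributes are left out
for brevity, and the envelope is given xdt:untyped in both modes, as the
proposal specifies for a constructed element.

```python
from dataclasses import dataclass, field
from typing import List

UNTYPED = "xdt:untyped"


@dataclass
class Element:
    """Toy element node carrying only a name, a type annotation, and children."""
    name: str
    type_annotation: str = UNTYPED
    children: List["Element"] = field(default_factory=list)


def copy_element(node: Element, mode: str) -> Element:
    """Deep-copy a node the way a constructor would under the given mode."""
    if mode not in ("skip", "preserve"):
        raise ValueError("this sketch models only 'skip' and 'preserve'")
    # "preserve" keeps the original annotation; "skip" resets it.
    annotation = node.type_annotation if mode == "preserve" else UNTYPED
    return Element(node.name, annotation,
                   [copy_element(child, mode) for child in node.children])


def wrap(content: Element, mode: str) -> Element:
    """Build an "envelope" element around a copy of the content element."""
    # The newly constructed envelope is xdt:untyped in both modes.
    return Element("envelope", UNTYPED, [copy_element(content, mode)])


balance = Element("balance-due", "xs:decimal")
customer = Element("customer", "customerType", [balance])  # hypothetical type

skipped = wrap(customer, "skip")
preserved = wrap(customer, "preserve")

print(skipped.children[0].children[0].type_annotation)    # xdt:untyped
print(preserved.children[0].children[0].type_annotation)  # xs:decimal
```

In the "skip" copy the balance-due element has lost its xs:decimal
annotation, which is what leads to the sorting problem in the examples
above; in the "preserve" copy it survives the wrapping.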

The following changes would be made to the XQuery specification by this
proposal: 

(a) In Section 2.1.1, Static Context: In the definition of Validation
Mode, add "preserve" or "skip preserve" to the list of modes. 

(b) In the grammar production for ValidationMode, add a keyword for
the new option. 

(c) In Section 3.7.1.3, Direct Element Constructors--Content: Rule (1d)
should be changed as follows: "If the validation mode is "preserve",
copied element and attribute nodes retain their original type
annotations; otherwise, copied element nodes are given the type
annotation xdt:untyped, and copied attribute nodes are given the type
annotation xdt:untypedAtomic." 

(d) In Section 3.7.1.5, Direct Element Constructors--Type of a
Constructed Element: Add the following initial sentence: 
"A direct element constructor assigns the initial type annotation
xdt:untyped to the newly constructed element node. It then validates the
new node, using the schema validation process defined in XML Schema." 
Also in Section 3.7.1.5, change the first bullet as follows: "If
validation mode = skip or preserve, no validation is attempted. The
constructed element retains its type annotation of xdt:untyped, and its
attributes and descendants retain the type annotations assigned to them
during construction." 

(e) In Section 3.14, Validate Expressions: Add the following bullet to
the three bullets that define strict, lax, and skip validation: 
"preserve indicates that no validation is to be attempted, but that
element nodes and attribute nodes are to retain their original type
annotations." 

(f) In Section 4.6, Validation Declaration: Add "preserve" to the list
of validation modes. 

Note that these changes will align XQuery with XSLT 2.0, which has
already introduced the concept validation="preserve" as documented in
http://www.w3.org/TR/xslt20/#validating-constructed-nodes. The XSLT 2.0
definition of validation="preserve" is consistent with the definition
above, and these definitions should be kept consistent. 

--Don Chamberlin

Received on Wednesday, 11 February 2004 19:58:05 UTC