[XQuery] IBM-XQ-015: validate mode: skip preserve

(IBM-XQ-015) XQuery currently defines three validation modes: strict, lax, 
and skip, based on the three validation modes of XML Schema. In skip mode, 
no validation is applied to a newly-constructed element. Instead, the new 
element node (and each of its descendant elements) is given the annotation 
xdt:untyped, and its attributes (and the attributes of its descendants) 
are given the annotation xdt:untypedAtomic. If the content of the new 
element is copied from existing nodes, the types of these existing nodes 
are lost.

An XQuery implementation that does not support Schema Import will probably 
run in skip-validation mode, since validation is meaningful only if a 
schema is present. Nevertheless, such an implementation may wish to 
preserve the type annotations on nodes in input documents, since these 
type annotations may affect the processing of a query (for example, 17 > 5 
is true when both values are xs:decimal but false when both are 
xdt:untypedAtomic, since untyped values are compared as strings). 
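
The difference can be sketched outside XQuery. The following Python 
fragment is only an illustration: Python's built-in string comparison 
stands in for the string comparison applied to untyped values, and 
Decimal stands in for xs:decimal. The variable names are invented for 
this example.

```python
from decimal import Decimal

# The two values as they appear in the document text.
a, b = "17", "5"

# Untyped case: the operands are compared as strings, character by
# character, so "17" sorts before "5" because "1" < "5".
untyped_result = a > b

# Typed case: the operands are compared as decimal numbers.
typed_result = Decimal(a) > Decimal(b)

print("untyped 17 > 5:", untyped_result)
print("decimal 17 > 5:", typed_result)
```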

The loss of type information during skip validation causes a serious 
problem for applications that need to "wrap" an element in a higher-level 
"envelope" element. The wrapping is done by referencing the "content" 
element inside a constructor for the "envelope" element, causing the 
content element to be copied and validated. The "content" element may well 
not be defined in the in-scope element declarations. 
This may happen if the current application is a generic message-routing 
application that does not find it practical to import the schemas for all 
possible contents. It will also happen in systems that do not implement 
the Schema Import feature. In these cases, skip-validation causes the loss 
of the type information on the "content" element. 

Here are some examples of this problem (assuming skip validation in each 
case):

(a) Copy a set of "customer" elements into newly-constructed 
"rated-customer" elements, pairing each customer with a rating. Now order 
all the rated-customers by balance-due. Balance-due was originally 
decimal, but now its type has been lost and 5 is sorted as greater than 
17.

(b) Write an application to extract data from an XML document and wrap it 
in <row> and <col> tags for interfacing to a relational database. By 
wrapping the data in <row> and <col> tags, its original types are 
destroyed and all the data appears to be text. Again, data that was 
originally decimal will be sorted incorrectly.

(c) If a query rewrite pushes a predicate inside a constructor, the effect 
of the predicate changes, because the data is typed inside the constructor 
but untyped once it has been copied into the constructed element. This 
limits the ability of the system to do query optimization and to merge 
queries with view definitions.
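
Example (a) can be modeled in a few lines of Python. This is a rough 
sketch rather than XQuery: the Element class, typed_value, and copy_skip 
are hypothetical names made up for the illustration, and the copy simply 
resets the annotation the way skip mode is described above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Element:
    name: str
    text: str
    type_annotation: str = "xdt:untyped"


def typed_value(node):
    """Return a Decimal if the node is annotated xs:decimal, else its string."""
    if node.type_annotation == "xs:decimal":
        return Decimal(node.text)
    return node.text


def copy_skip(node):
    """Copy a node as skip mode does: the annotation falls back to xdt:untyped."""
    return Element(node.name, node.text)


# balance-due values from the input document, annotated as decimal.
balances = [Element("balance-due", t, "xs:decimal") for t in ("17", "5", "100")]

# Ordering the original nodes uses numeric comparison.
original_order = [n.text for n in sorted(balances, key=typed_value)]

# Ordering the copies uses string comparison, because the type is gone.
copies = [copy_skip(n) for n in balances]
copied_order = [n.text for n in sorted(copies, key=typed_value)]

print(original_order)
print(copied_order)
```

In this model the original nodes sort as 5, 17, 100, while the copies 
sort as 100, 17, 5.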

The solution to these problems is to introduce a new validation mode 
called "skip preserve", or simply "preserve". In this mode, no validation 
is attempted, and the type annotation of the subject element remains 
unchanged rather than being set to xdt:untyped. Adding this validation 
mode would not affect the definitions of the existing three modes. 

The following changes would be made to the XQuery specification by this 
proposal:

(a) In Section 2.1.1, Static Context: In the definition of Validation Mode, 
add "preserve" or "skip preserve" to the list of modes.

(b) In the grammar production for ValidationMode, add a keyword for 
the new option. 

(c) In Section 3.7.1.3, Direct Element Constructors--Content: Rule (1d) 
should be changed as follows: "If the validation mode is "preserve", 
copied element and attribute nodes retain their original type annotations; 
otherwise, copied element nodes are given the type annotation xdt:untyped, 
and copied attribute nodes are given the type annotation 
xdt:untypedAtomic."

(d) In Section 3.7.1.5, Direct Element Constructors--Type of a Constructed 
Element: Add the following initial sentence:
"A direct element constructor assigns the initial type annotation 
xdt:untyped to the newly constructed element node. It then validates the 
new node, using the schema validation process defined in XML Schema."
Also in Section 3.7.1.5, change the first bullet as follows: "If 
validation mode = skip or preserve, no validation is attempted. The 
constructed element retains its type annotation of xdt:untyped, and its 
attributes and descendants retain the type annotations assigned to them 
during construction."

(e) In Section 3.14, Validate Expressions: Add the following bullet to the 
three bullets that define strict, lax, and skip validation: 
"preserve indicates that no validation is to be attempted, but that 
element nodes and attribute nodes are to retain their original type 
annotations."

(f) In Section 4.6, Validation Declaration: Add "preserve" to the list of 
validation modes.
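
The effect of changes (c) and (d) on an element constructor can be 
sketched as a small Python model. It is an approximation under stated 
assumptions, not spec text: Node, copy_node, and construct_element are 
hypothetical names, attributes are stored as ordinary children for 
brevity, and only the skip and preserve modes are modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str              # "element" or "attribute"
    name: str
    type_annotation: str
    children: list = field(default_factory=list)


def copy_node(node, mode):
    """Copy a content node following the proposed wording of rule (1d)."""
    if mode == "preserve":
        annotation = node.type_annotation      # keep the original annotation
    elif node.kind == "element":
        annotation = "xdt:untyped"
    else:
        annotation = "xdt:untypedAtomic"
    return Node(node.kind, node.name, annotation,
                [copy_node(child, mode) for child in node.children])


def construct_element(name, content, mode):
    """Build a new element; it starts as xdt:untyped and is not validated."""
    if mode not in ("skip", "preserve"):
        raise NotImplementedError("strict and lax validation are not modeled")
    return Node("element", name, "xdt:untyped",
                [copy_node(node, mode) for node in content])


# A typed "content" element with one typed attribute.
content = Node("element", "balance-due", "xs:decimal",
               [Node("attribute", "currency", "xs:string")])

wrapped_skip = construct_element("envelope", [content], "skip")
wrapped_preserve = construct_element("envelope", [content], "preserve")

print(wrapped_skip.children[0].type_annotation)
print(wrapped_preserve.children[0].type_annotation)
```

In both modes the envelope itself stays xdt:untyped; the modes differ 
only in what happens to the copied content.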

Note that these changes will align XQuery with XSLT 2.0, which has already 
introduced validation="preserve", as documented in 
http://www.w3.org/TR/xslt20/#validating-constructed-nodes. The XSLT 2.0 
definition of validation="preserve" is consistent with the definition 
above, and these definitions should be kept consistent.

--Don Chamberlin

Received on Wednesday, 11 February 2004 18:52:20 UTC