- From: <bugzilla@wiggum.w3.org>
- Date: Wed, 01 Feb 2006 16:14:53 +0000
- To: public-qt-comments@w3.org
- Cc:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=2790
Summary: Instance of with union type results in surprising results
Product: XPath / XQuery / XSLT
Version: Candidate Recommendation
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: normal
Priority: P2
Component: Data Model
AssignedTo: Norman.Walsh@Sun.COM
ReportedBy: mrys@microsoft.com
QAContact: public-qt-comments@w3.org
Let's assume that we have
Schema:
declare element E of type U
define type U restricts xs:anySimpleType { T1 | T2 }
define type T1 restricts xs:int
define type T2 restricts xs:string
Instances that are validated according to the schema
<E>42</E>
<E xsi:type="T2">42</E>
The question is what the results of the following queries are:
Q1: for $e in /E
return $e instance of element(E, U)
Q2: for $e in /E
return $e instance of element(E, T1)
Q3: for $e in /E
return $e instance of element(E, T2)
Q4: for $e in /E
return data($e) instance of T1
Q5: for $e in /E
return data($e) instance of T2
Q6: for $e in /E
return data($e) instance of U
Let's look at the validation and data model generation where I still think we
have a need for further clarification.
XSD and PSVI generation: This is not fully clear yet. We all agree that the
types T1 and T2 are not subtypes of U in the XQuery type system but that they
are member types of the union type U.
This is what I found in the XML Schema document about this type of validation
(and to be honest, I cannot clearly understand how this applies to the given
example):
Schema Information Set Contribution: Element Validated by Type
If an element information item is ·valid· with respect to a ·type definition·
as per Element Locally Valid (Type) (§3.3.4), in the ·post-schema-validation
infoset· the item has a property:
PSVI Contributions for element information items
[schema normalized value]
The appropriate case among the following:
1. If clause 3.2 of Element Locally Valid (Element) (§3.3.4) and
Element Default Value (§3.3.5) above have not applied and either
the ·type definition· is a simple type definition or its {content
type} is a simple type definition, then the ·normalized value· of
the item as ·validated·.
2. otherwise ·absent·.
Furthermore, the item has one of the following alternative sets of properties:
Either
PSVI Contributions for element information items
[type definition]
An ·item isomorphic· to the ·type definition· component itself.
[member type definition]
If and only if that type definition is a simple type definition with {variety}
union, or a complex type definition whose {content type} is a simple type
definition with {variety} union, then an ·item isomorphic· to that member of
the union's {member type definitions} which actually ·validated· the element
item's ·normalized value·.
Some of my schema experts think that this means that if xsi:type is given,
only the type given in xsi:type is preserved as the element's type, since
validation will pick the type given in xsi:type directly and not look at the
union type at all. Let's call that interpretation A.
On the other hand, this seems to lose type information and contradicts what
we expect from the data model document, which says:
3.3.1.1 Element and Attribute Node Type Names
The precise definition of the schema type of an element or attribute
information item depends on the properties of the PSVI. In the PSVI, [Schema
Part 1] only guarantees the existence of either the [type definition]
property, or the [type definition namespace], [type definition name] and [type
definition anonymous] properties. If the type definition refers to a union
type, there are further properties defined, that refer to the type definition
which actually validated the item's normalized value. These properties are not
used to determine the schema type of the node but they may be used to
determine the typed value of the node, as described in 3.3.1.2 Typed Value
Determination.
This explanation seems to be clear, but according to interpretation A of the
schema document, you would not have the node's declared type if an xsi:type
value is present. But let's assume that interpretation A is wrong and that we
can map the PSVI into the following data model instance (let's call this
interpretation B):
element E of type U{42 of type T1}
element E of type U{"42" of type T2}
Note that according to interpretation A we would get:
element E of type U{42 of type T1}
element E of type T2{"42" of type T2}
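The difference between the two mappings can be sketched in a few lines of
Python. This is only an illustrative model, not any processor's actual
behavior: the Psvi class, its field names, and the functions node_type and
describe are hypothetical, and the only thing the two interpretations are
assumed to disagree on is whether xsi:type replaces the declared type as the
node's type annotation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Psvi:
    """Hypothetical, simplified stand-in for the PSVI of one E element."""
    declared_type: str        # type named in the element declaration (U)
    xsi_type: Optional[str]   # value of xsi:type in the instance, if any
    value_type: str           # type that actually validated the value
    normalized_value: str     # the [schema normalized value]


def node_type(psvi: Psvi, interpretation: str) -> str:
    """Type annotation of the element node under interpretation 'A' or 'B'."""
    if interpretation == "A" and psvi.xsi_type is not None:
        # Interpretation A: xsi:type replaces the declared type.
        return psvi.xsi_type
    # Interpretation B (and A without xsi:type): keep the declared type.
    return psvi.declared_type


def describe(psvi: Psvi, interpretation: str) -> str:
    """Render the node in the notation used in this report."""
    value = psvi.normalized_value
    # Quote the value only when it is a string (T2), as in the report.
    shown = value if psvi.value_type == "T1" else f'"{value}"'
    annotation = node_type(psvi, interpretation)
    return f"element E of type {annotation}{{{shown} of type {psvi.value_type}}}"


# <E>42</E> and <E xsi:type="T2">42</E>
plain = Psvi("U", None, "T1", "42")
with_xsi_type = Psvi("U", "T2", "T2", "42")

for interpretation in ("A", "B"):
    for psvi in (plain, with_xsi_type):
        print(interpretation, describe(psvi, interpretation))
```

The two interpretations only diverge for the second instance, which is the
one carrying xsi:type.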
Now let's look at what the answers should be for Q1 to Q6 given
interpretations A and B:
Q1 - A: true false
Q1 - B: true true
Q2 - A: false false
Q2 - B: false false
Q3 - A: false true
Q3 - B: false false
Q4 - A: true false
Q4 - B: true false
Q5 - A: false true
Q5 - B: false true
Q6: Parse error since U is not an atomic type.
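The answers above can be reproduced with a small Python model. It is a sketch
under stated assumptions rather than an XQuery implementation: "instance of"
is modeled purely as walking the derivation-by-restriction chain, union
membership is deliberately not treated as subtyping, and the names BASE,
INSTANCES, derives_from and answers are hypothetical. Q6 is left out because
it is rejected before evaluation.

```python
# Derivation by restriction: type -> base type. Union membership is
# deliberately absent: T1 and T2 are members of U, not subtypes of it.
BASE = {"T1": "xs:int", "T2": "xs:string", "U": "xs:anySimpleType"}

# (node type annotation, type of the typed value) for
# <E>42</E> and <E xsi:type="T2">42</E> under each interpretation.
INSTANCES = {
    "A": [("U", "T1"), ("T2", "T2")],
    "B": [("U", "T1"), ("U", "T2")],
}


def derives_from(actual, expected):
    """True if `actual` is `expected` or restricts it, directly or not."""
    while actual is not None:
        if actual == expected:
            return True
        actual = BASE.get(actual)  # None once the chain leaves the model
    return False


def answers(interpretation):
    """Results of Q1 to Q5 for both instances under one interpretation."""
    nodes = [node for node, _ in INSTANCES[interpretation]]
    values = [value for _, value in INSTANCES[interpretation]]
    return {
        "Q1": [derives_from(n, "U") for n in nodes],    # element(E, U)
        "Q2": [derives_from(n, "T1") for n in nodes],   # element(E, T1)
        "Q3": [derives_from(n, "T2") for n in nodes],   # element(E, T2)
        "Q4": [derives_from(v, "T1") for v in values],  # data($e) ... T1
        "Q5": [derives_from(v, "T2") for v in values],  # data($e) ... T2
    }


for interpretation in ("A", "B"):
    for query, result in answers(interpretation).items():
        print(f"{query} - {interpretation}:", *result)
```

In this model the two interpretations differ only on Q1 and Q3, the queries
that test the element's type annotation rather than its typed value.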
From a type consistency point of view, interpretation B is, in my personal
opinion, the only one that makes sense. However, according to our reading,
interpretation A seems to be what the schema processor implies.
The question is: is interpretation A correct (and the schema's semantics
therefore inconsistent), or is interpretation B correct (and the schema spec
therefore in need of a fix or clarification)?
According to (member-only)
http://lists.w3.org/Archives/Member/w3c-xml-query-wg/2005Dec/0025.html,
we need to fix the PSVI to Data model mapping with:
<cite>
So data model construction could/should be fixed to always use the
declared type for the node's type. The only time this will be
different from the [type definition] is when xsi:type has been used.
</cite>
Received on Wednesday, 1 February 2006 16:14:58 UTC