Changes for bugs 2768 and 2790

3.3.1.2 Typed Value Determination

This section describes how the typed value of an Element or Attribute Node is computed from an element or attribute PSVI information item, where the information item has either a simple type or a complex type with simple content. For other kinds of Element Nodes, see 6.2.4 Construction from a PSVI; for other kinds of Attribute Nodes, see 6.3.4 Construction from a PSVI.

The typed value of Attribute Nodes and some Element Nodes is a sequence of atomic values. The types of the items in the typed value of a node may not be the same as the type of the node itself. This section describes how the typed value of a node is derived from the properties of an information item in a PSVI.

The types of the items in the typed value of a node are determined ~~↓by a recursive process called typed value determination. This↓~~↑as follows. The↑ process begins with T, the schema type of the node itself, as represented in the PSVI. ~~↓The type T has a variety, which is either atomic, union, or list. The typed value determination process is defined as follows:↓~~ ↑For each primitive or ordinary simple type T, the W3C XML Schema specification defines a function M mapping the lexical representation of a value onto the value itself.↑

↑Note: For atomic and list types, the mapping is the “lexical mapping” defined for T in [Schema Part 2]; for union types, the mapping is the lexical mapping defined in [Schema Part 2] modified by the rules in [Schema Part 1] which make it into a function by specifying which value to select when more than one is mapped to by the lexical mapping.↑
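The Note above says that the union mapping only becomes a function once a rule picks one value among several candidates. The following is a minimal sketch of that idea, not the normative rule: it assumes the selection is "first member type, in declared order, that accepts the lexical form", and the helpers `map_integer`, `map_boolean` and `map_union` are hypothetical toy stand-ins rather than real lexical mappings.

```python
import re


def map_integer(lexical):
    # Toy stand-in for an integer type's lexical mapping; None means "not valid".
    return int(lexical) if re.fullmatch(r"[+-]?[0-9]+", lexical) else None


def map_boolean(lexical):
    # Toy stand-in for a boolean type's lexical mapping; None means "not valid".
    return {"true": True, "1": True, "false": False, "0": False}.get(lexical)


def map_union(lexical, members):
    # Assumption: the first member (in declared order) that accepts the
    # lexical form supplies both the value and the type of the result.
    for type_name, mapping in members:
        value = mapping(lexical)
        if value is not None:
            return type_name, value
    raise ValueError(f"{lexical!r} is not valid against any member type")


int_first = [("xs:integer", map_integer), ("xs:boolean", map_boolean)]
bool_first = [("xs:boolean", map_boolean), ("xs:integer", map_integer)]

print(map_union("1", int_first))   # the integer member wins
print(map_union("1", bool_first))  # the boolean member wins
```

The lexical form "1" is acceptable to both toy members, so without a selection rule the union mapping would yield two values; with the rule, the result depends only on the order of the members.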

↑The typed value is determined as follows:↑

If the nilled property of the node in question is true, then the typed value is the empty sequence.
If T is xs:anySimpleType, the typed value is the [schema normalized value] as an instance of xdt:untypedAtomic.
↑Otherwise, the typed value is the result of applying M to the string value.↑
↓If the {variety} of T is atomic, the typed value is an instance of T derived from the [schema normalized value] in a way consistent with XML Schema validation.↓
↓If the {variety} of T is union, then the type of the typed value is the determined by the type definition that actually validated the content of the node, as follows:
- If [member type definition] exists: If the {name} property exists, the {target namespace} and {name} properties of the [member type definition]; otherwise, the appropriate anonymous type name.
- If [member type definition anonymous] exists: If it is false, the [member type definition namespace] and [member type definition name] properties; otherwise, the appropriate anonymous type name.
The resulting type is substituted for T, and the typed value determination process is invoked recursively.↓
↓If the {variety} of T is list, the [schema normalized value] of the node is considered to be a space-separated list of lexical forms, each of which has its own type. For each of these lexical forms, the type of the corresponding item is found in {item type definition}. This type is then substituted for T, and the typed value determination process is invoked recursively for each member of the list.↓

The typed value determination process is guaranteed to result in a sequence of atomic values, each having a well-defined atomic type. This sequence of atomic values, in turn, determines the typed-value property of the node in the data model.
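The three rules of the revised process can be sketched as follows. This is an illustration under stated assumptions, not an implementation of the data model: `UntypedAtomic`, `typed_value` and the `mapping` argument (standing in for the function M of the type T) are hypothetical names, and a list-valued result of M is simply flattened into the sequence.

```python
from dataclasses import dataclass


@dataclass
class UntypedAtomic:
    # Hypothetical wrapper standing in for an xdt:untypedAtomic value.
    value: str


def typed_value(nilled, type_name, normalized_value, mapping=None):
    """Return the typed value as a list of atomic values.

    mapping plays the role of M for the node's type T (an assumption of
    this sketch); it is not consulted for nilled nodes or xs:anySimpleType.
    """
    if nilled:
        return []  # rule 1: nilled nodes have the empty sequence
    if type_name == "xs:anySimpleType":
        return [UntypedAtomic(normalized_value)]  # rule 2
    result = mapping(normalized_value)  # rule 3: apply M
    # A list type maps to several atomic values; other types map to one.
    return list(result) if isinstance(result, (list, tuple)) else [result]


def int_list(text):
    # Toy mapping for a list-of-integers type.
    return [int(token) for token in text.split()]


print(typed_value(True, "xs:integer", "42", int))
print(typed_value(False, "xs:anySimpleType", "42"))
print(typed_value(False, "my:intList", "1 2 3", int_list))
```

In every branch the result is a flat sequence of atomic values, which is the guarantee stated in the paragraph above.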

Possible fix for Bug 2790

Bug 2790 notes that the [type definition] property of the PSVI corresponds to the declared type of an element only most of the time. If the xsi:type attribute is used, the [type definition] property will have as its value the type definition named in the xsi:type attribute. This can lead to unexpected results when nodes in the data model are tested using instance of. In particular, for an element E declared with a union type U having members T1 and T2, if one instance of E has xsi:type="T2", then the results of the following expressions can be unexpected:

for $e in /E 
return $e instance of element(E,U)

for $e in /E 
return $e instance of element(E,T2)

This proposal attempts to make the results agree better with expectation by using not the [type definition] property of the element instance to identify the type of the element node, but instead the [type definition] given on the element declaration, when the latter is a union type and the former is one of its members.

The declared type is not used otherwise, since when the type given in xsi:type is actually derived from the declared type, using the declared type would lose potentially useful information.

The XML Query and XSL Working Groups may wish to file a comment against XML Schema asking that the declared type of an element or attribute be given a convenient name in the PSVI.

In passing, the proposal also changes words which suggest (following wording in XML Schema 1.0 which is now generally acknowledged to be misleading) that certain properties may be present or absent in the PSVI. In principle, all properties are necessarily present in the PSVI; they may or may not be accessible through a particular API.

The Working Groups should probably file a bug report against XML Schema 1.0 and 1.1, requesting that the misleading wording be fixed in 1.1 and in an erratum to 1.0.

3.3.1.1 Element and Attribute Node Type Names

The precise definition of the schema type of an element or attribute information item depends on the properties of the PSVI. In the PSVI, [Schema Part 1] ~~↓only guarantees the existence of either the↓~~ ↑defines a↑ [type definition] property, ~~↓or↓~~ ↑as well as↑ the [type definition namespace], [type definition name] and [type definition anonymous] properties↑, which are effectively short-cut terms for properties of the type definition↑. ↓If the type definition refers to a union type, there are further properties defined, that refer to the type definition which actually validated the item's normalized value. These properties are not used to determine the schema type of the node but they may be used to determine the typed value of the node, as described in 3.3.1.2 Typed Value Determination.↓ ↑Further, the [element declaration] and [attribute declaration] properties are defined for elements and attributes, respectively. These declarations in turn will identify the [type definition] declared for the element or attribute. To distinguish the [type definition] given in the PSVI for the element or attribute instance from the [type definition] associated with the declaration, the former is referred to below as the actual type and the latter as the declared type of the element or attribute instance in question.↑

The type depends on the ↑declared type, the actual type, and the ↑[validity] and [validation attempted] properties in the PSVI. If:

The [validity] and [validation attempted] properties exist and have the values "valid" and "full", respectively, the schema type of an element or attribute information item is represented by an expanded-QName whose namespace and local name correspond to the first applicable items in the following list:
- ↓If the [type definition] property exists:
  - If the {name} property is not absent, the {target namespace} and {name} properties of the [type definition] property;
  - Otherwise, the namespace and local name of the appropriate anonymous type name.↓
- ↓If [type definition anonymous] exists:
  - If it is false: the [type definition namespace] and the [type definition name] properties;
  - Otherwise, the namespace and local name of the appropriate anonymous type name.↓
- ↑If the declared type is a union and the actual type is neither the declared type itself nor a type derived from the declared type, but is one of the member types of the union or is derived from one of its member types:
  - If the {name} property of the declared type is present: the {target namespace} and {name} properties of the declared type.
  - If the {name} property of the declared type is absent: the namespace and local name of the anonymous type name supplied for the declared type.
  ↑
- ↑Otherwise:
  - If [type definition anonymous] is false: the {target namespace} and {name} properties of the actual type.
  - If [type definition anonymous] is true: the namespace and local name of the anonymous type name supplied for the actual type.
  ↑
The [validity] property exists and is "invalid", or the [validation attempted] property exists and is "partial", the schema type of an element is xs:anyType and the type of an attribute is xs:anySimpleType.
The [validity] property exists and is "notKnown", and the [validation attempted] property exists and is "none", the schema type of an element is xdt:untyped and the type of an attribute is xdt:untypedAtomic.
The [validity] or [validation attempted] properties do not exist, the schema type of an element is xdt:untyped and the type of an attribute is xdt:untypedAtomic.

The prefix associated with the type names is implementation-dependent.
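The selection rules above, including the proposed treatment of union-typed declarations, can be sketched as follows. This is only an illustration: `TypeDef`, `same_or_derived`, `expanded_name` and `node_type_name` are hypothetical names, `XDT_NS` and `ANON` are placeholders rather than real namespace or anonymous type names, and missing [validity] or [validation attempted] properties are modelled as None.

```python
from dataclasses import dataclass, field
from typing import List, Optional

XS_NS = "http://www.w3.org/2001/XMLSchema"
XDT_NS = "urn:example:xdt"  # placeholder for the xdt namespace
ANON = ("urn:example:anon", "anonymous-type")  # placeholder anonymous type name


@dataclass(eq=False)
class TypeDef:
    name: Optional[str]  # None models an absent {name}
    namespace: Optional[str] = None
    base: Optional["TypeDef"] = None  # type this one is derived from
    members: List["TypeDef"] = field(default_factory=list)  # non-empty for unions


def same_or_derived(t, ancestor):
    # True if t is ancestor itself or reaches it through its base chain.
    while t is not None:
        if t is ancestor:
            return True
        t = t.base
    return False


def expanded_name(t):
    return (t.namespace, t.name) if t.name is not None else ANON


def node_type_name(kind, validity, attempted, declared, actual):
    if validity == "valid" and attempted == "full":
        via_member = any(same_or_derived(actual, m) for m in declared.members)
        if via_member and not same_or_derived(actual, declared):
            return expanded_name(declared)  # report the declared union type
        return expanded_name(actual)
    if validity == "invalid" or attempted == "partial":
        return (XS_NS, "anyType" if kind == "element" else "anySimpleType")
    # "notKnown"/"none", or the properties do not exist (None).
    return (XDT_NS, "untyped" if kind == "element" else "untypedAtomic")


NS = "urn:example:schema"
T1, T2 = TypeDef("T1", NS), TypeDef("T2", NS)
U = TypeDef("U", NS, members=[T1, T2])
print(node_type_name("element", "valid", "full", U, T2))
```

For the element E of Bug 2790, declared with union type U and carrying xsi:type="T2", the sketch reports U rather than T2, while a type derived from U itself would still be reported under its own name.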
