Bug 2768 notes that the current description of typed value determination relies on named properties in the PSVI, and that the named properties used are not actually guaranteed to be defined in the PSVI for all recursive cases. That is, the PSVI does not define terms for all of the properties which are known in the course of validation and which are required for mapping a PSVI into a data-model instance.
To avoid this problem, the proposal below appeals not to specific named properties in the PSVI but to the fundamental notion of the relation mapping lexical representations into simple values, and to the rules given in Schema Part 1 for making that relation into a function in context.
The XSL and XML Query Working Groups may wish to file a comment against XML Schema pointing to the absence of PSVI information items for the individual values in a list of values, and the absence of defined PSVI property names for the information needed for the PSVI-to-XDM mapping.
This section describes how the typed value of an Element or Attribute Node is computed from an element or attribute PSVI information item, where the information item has either a simple type or a complex type with simple content. For other kinds of Element Nodes, see 6.2.4 Construction from a PSVI; for other kinds of Attribute Nodes, see 6.3.4 Construction from a PSVI.
The typed value of Attribute Nodes and some Element Nodes is a sequence of atomic values. The types of the items in the typed value of a node may not be the same as the type of the node itself. This section describes how the typed value of a node is derived from the properties of an information item in a PSVI.
The types of the items in the typed value of a node are determined ↓by a recursive process called typed value determination. This↓ ↑as follows. The↑ process begins with T, the schema type of the node itself, as represented in the PSVI. ↓The type T has a variety, which is either atomic, union, or list. The typed value determination process is defined as follows:↓ ↑For each primitive or ordinary simple type T, the W3C XML Schema specification defines a function M mapping the lexical representation of a value onto the value itself.↑
↑Note: For atomic and list types, the mapping is the “lexical mapping” defined for T in [Schema Part 2]; for union types, the mapping is the lexical mapping defined in [Schema Part 2], modified by the rules in [Schema Part 1] that make it into a function by specifying which value to select when the lexical mapping maps a representation to more than one value.↑
↑The typed value is determined as follows:↑
If the nilled property of the node in question is true, then the typed value is the empty sequence.
If T is xs:anySimpleType, the typed value is the [schema normalized value] as an instance of xdt:untypedAtomic.
↑Otherwise, the typed value is the result of applying M to the string value.↑
↓If the {variety} of T is atomic, the typed value is an instance of T derived from the [schema normalized value] in a way consistent with XML Schema validation.↓
↓If the {variety} of T is union, then the type of the typed value is determined by the type definition that actually validated the content of the node, as follows:
If [member type definition] exists: If the {name} property exists, the {target namespace} and {name} properties of the [member type definition]; otherwise, the appropriate anonymous type name.
If [member type definition anonymous] exists: If it is false, the [member type definition namespace] and [member type definition name] properties; otherwise, the appropriate anonymous type name.
The resulting type is substituted for T, and the typed value determination process is invoked recursively.↓
↓If the {variety} of T is list, the [schema normalized value] of the node is considered to be a space-separated list of lexical forms, each of which has its own type. For each of these lexical forms, the type of the corresponding item is found in {item type definition}. This type is then substituted for T, and the typed value determination process is invoked recursively for each member of the list.↓
The typed value determination process is guaranteed to result in a sequence of atomic values, each having a well-defined atomic type. This sequence of atomic values, in turn, determines the typed-value property of the node in the data model.
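The three rules of the revised process can be illustrated with a small model. The following is only a sketch under stated assumptions, not the normative algorithm: the `SimpleType` class and the helpers `atomic`, `list_of`, `union_of`, and `typed_value` are hypothetical names introduced here, atomic values are modelled as (type name, value) pairs, and the union mapping is made into a function by taking the first member type that accepts the lexical form, which is an assumed reading of the Schema Part 1 rules.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

# An atomic value is modelled as a (type name, value) pair (an assumption
# of this sketch, not a data-model definition).
AtomicValue = Tuple[str, object]


@dataclass
class SimpleType:
    name: str
    # M: maps a lexical representation onto a sequence of atomic values.
    lexical_mapping: Callable[[str], List[AtomicValue]]


def atomic(name: str, parse: Callable[[str], object]) -> SimpleType:
    """Atomic type: M yields exactly one atomic value."""
    return SimpleType(name, lambda s: [(name, parse(s))])


def list_of(name: str, item: SimpleType) -> SimpleType:
    """List type: M maps each whitespace-separated token with the item type's M."""
    def mapping(s: str) -> List[AtomicValue]:
        values: List[AtomicValue] = []
        for token in s.split():
            values.extend(item.lexical_mapping(token))
        return values
    return SimpleType(name, mapping)


def union_of(name: str, members: Sequence[SimpleType]) -> SimpleType:
    """Union type: assumed rule, the first member that accepts the form wins."""
    def mapping(s: str) -> List[AtomicValue]:
        for member in members:
            try:
                return member.lexical_mapping(s)
            except ValueError:
                continue
        raise ValueError(f"{s!r} is not valid for any member of {name}")
    return SimpleType(name, mapping)


def typed_value(t: SimpleType, string_value: str, nilled: bool = False,
                schema_normalized_value: Optional[str] = None) -> List[AtomicValue]:
    # Rule 1: a nilled node has the empty sequence as its typed value.
    if nilled:
        return []
    # Rule 2: xs:anySimpleType yields the normalized value as xdt:untypedAtomic.
    if t.name == "xs:anySimpleType":
        normalized = string_value if schema_normalized_value is None else schema_normalized_value
        return [("xdt:untypedAtomic", normalized)]
    # Rule 3: otherwise apply M to the string value.
    return t.lexical_mapping(string_value)


# Hypothetical example types.
XS_INTEGER = atomic("xs:integer", int)
XS_STRING = atomic("xs:string", str)
ANY_SIMPLE = SimpleType("xs:anySimpleType", lambda s: [("xdt:untypedAtomic", s)])
INT_LIST = list_of("intList", XS_INTEGER)
INT_OR_STRING = union_of("intOrString", [XS_INTEGER, XS_STRING])
```

In this model no named PSVI property for the member type or the item type is consulted; the result depends only on T, the nilled flag, and the string value, which is the point of the proposal.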
Bug 2790 notes that the [type definition] property of the PSVI corresponds to the declared type of an element only most of the time. If the xsi:type attribute is used, the [type definition] property will have as its value the type definition named in the xsi:type attribute. This can lead to unexpected results when nodes in the data model are tested using instance of. In particular, for an element E declared with a union type U having members T1 and T2, if one instance of E has xsi:type="T2", then the results can be unexpected for:
for $e in /E return $e instance of element(E,U)
for $e in /E return $e instance of element(E,T2)
This proposal attempts to make the results agree better with expectation by using not the [type definition] property of the element instance to identify the type of the element node, but instead the [type definition] given on the element declaration, when the latter is a union type and the former is one of its members.
The declared type is not used otherwise, since when the type given in xsi:type is actually derived from the declared type, using the declared type would lose potentially useful information.
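The choice between the declared type and the type named by xsi:type can be sketched as follows. This is an illustrative model only: `TypeDef` and `node_type` are hypothetical names, and type definitions are reduced to a name, a variety, and a tuple of union members.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TypeDef:
    # Hypothetical, simplified stand-in for a schema type definition.
    name: str
    variety: str = "atomic"
    members: Tuple["TypeDef", ...] = ()


def node_type(declared: TypeDef, actual: TypeDef) -> TypeDef:
    """Pick the type used for the element node.

    The declared type wins only when it is a union and the actual type
    (the [type definition] of the instance) is one of its members.
    In every other case the actual type is kept, so that a type derived
    from the declared type is not lost.
    """
    if declared.variety == "union" and actual in declared.members:
        return declared
    return actual


# Hypothetical types mirroring the example: U is a union of T1 and T2.
T1 = TypeDef("T1")
T2 = TypeDef("T2")
U = TypeDef("U", variety="union", members=(T1, T2))
```

Under this sketch, an instance of E carrying xsi:type="T2" is still typed as U, so a test against element(E,U) would succeed.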
The XML Query and XSL Working Groups may wish to file a comment against XML Schema asking that the declared type of an element or attribute be given a convenient name in the PSVI.
In passing, the proposal also changes words which suggest (following wording in XML Schema 1.0 which is now generally acknowledged to be misleading) that certain properties may be present or absent in the PSVI. In principle, all properties are necessarily present in the PSVI; they may or may not be accessible through a particular API.
The Working Groups should probably file a bug report against XML Schema 1.0 and 1.1, requesting that the misleading wording be fixed in 1.1 and in an erratum to 1.0.
The precise definition of the schema type of an element or attribute information item depends on the properties of the PSVI. In the PSVI, [Schema Part 1] ↓only guarantees the existence of either the↓ ↑defines a↑ [type definition] property, ↓or↓ ↑as well as↑ the [type definition namespace], [type definition name] and [type definition anonymous] properties↑, which are effectively short-cut terms for properties of the type definition↑. ↓If the type definition refers to a union type, there are further properties defined, that refer to the type definition which actually validated the item's normalized value. These properties are not used to determine the schema type of the node but they may be used to determine the typed value of the node, as described in 3.3.1.2 Typed Value Determination.↓ ↑Further, the [element declaration] and [attribute declaration] properties are defined for elements and attributes, respectively. These declarations in turn will identify the [type definition] declared for the element or attribute. To distinguish the [type definition] given in the PSVI for the element or attribute instance from the [type definition] associated with the declaration, the former is referred to below as the actual type and the latter as the declared type of the element or attribute instance in question.↑
The type depends on the ↑declared type, the actual type, and the ↑[validity] and [validation attempted] properties in the PSVI. If:
The [validity] and [validation attempted] properties exist and have the values "valid" and "full", respectively, the schema type of an element or attribute information item is represented by an expanded-QName whose namespace and local name correspond to the first applicable items in the following list:
↓If the [type definition] property exists:
If the {name} property is not absent, the {target namespace} and {name} properties of the [type definition] property;
Otherwise, the namespace and local name of the appropriate anonymous type name.↓
↓If [type definition anonymous] exists:
If it is false: the [type definition namespace] and the [type definition name] properties;
Otherwise, the namespace and local name of the appropriate anonymous type name.↓
The [validity] property exists and is "invalid", or the [validation attempted] property exists and is "partial", the schema type of an element is xs:anyType and the type of an attribute is xs:anySimpleType.
The [validity] property exists and is "notKnown", and the [validation attempted] property exists and is "none", the schema type of an element is xdt:untyped and the type of an attribute is xdt:untypedAtomic.
The [validity] or [validation attempted] properties do not exist, the schema type of an element is xdt:untyped and the type of an attribute is xdt:untypedAtomic.
The prefix associated with the type names is implementation-dependent.
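The case analysis on [validity] and [validation attempted] can be summarized in a short sketch. The function name `schema_type` and its parameters are hypothetical; a missing property is modelled as `None`, and `type_name` stands for the type name already selected for the valid case (including the anonymous type name, if any). Combinations that the list above does not mention are rejected, since the text does not say how to handle them.

```python
from typing import Optional


def schema_type(kind: str, validity: Optional[str],
                validation_attempted: Optional[str],
                type_name: Optional[str] = None) -> str:
    """Return the schema type name for an element or attribute item.

    kind is "element" or "attribute"; None models a property that
    does not exist.
    """
    if kind not in ("element", "attribute"):
        raise ValueError("kind must be 'element' or 'attribute'")
    is_element = kind == "element"
    untyped = "xdt:untyped" if is_element else "xdt:untypedAtomic"

    # Either property missing: the node is untyped.
    if validity is None or validation_attempted is None:
        return untyped
    # Fully validated and valid: use the type name selected for the item.
    if validity == "valid" and validation_attempted == "full":
        if type_name is None:
            raise ValueError("a type name is required for a valid item")
        return type_name
    # Invalid or only partially validated: fall back to the ur-types.
    if validity == "invalid" or validation_attempted == "partial":
        return "xs:anyType" if is_element else "xs:anySimpleType"
    # Not validated at all: the node is untyped.
    if validity == "notKnown" and validation_attempted == "none":
        return untyped
    # Not covered by the cases listed in the text (assumption: reject).
    raise ValueError("combination not covered by the listed cases")
```

For example, an attribute whose [validity] is "invalid" would be typed xs:anySimpleType regardless of the type named in its declaration.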