ACTION-1896 - Send an e-mail to Michael Kay about providing type information to the nodes in instances

All,

I just convinced myself that this action item was in fact pretty
important for XPath 2 support in XForms.

As a reminder, the idea is that XPath expressions that touch nodes
with associated type information use that information in the
computation. Here is a simple example which uses the `sum()` function
on decimals:

    https://gist.github.com/ebruchez/6234199

The benefit is that this avoids a lot of casts and mistakes such as
using the wrong number type. (I am on a crusade recently to get users
to use decimals, not doubles. We need to make this easier.)
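
To make the decimals-versus-doubles point concrete, here is a small
sketch in Python, used only as an analogy: Python's `float` is an IEEE
754 double like `xs:double`, and `decimal.Decimal` stands in for
`xs:decimal`. It is not an XPath engine.

```python
from decimal import Decimal

# Python floats are IEEE 754 doubles, comparable to xs:double.
double_total = 0.1 + 0.2

# Decimal stands in for xs:decimal here (an analogy, not XPath itself).
decimal_total = Decimal("0.1") + Decimal("0.2")

print(double_total == 0.3)              # False: binary rounding error
print(decimal_total == Decimal("0.3"))  # True: exact decimal arithmetic
```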

The drawback is that users do need to keep types in mind when they
write expressions, and that this requires getting acquainted with a
couple of features of XPath which they might not have known
previously. But I think that is the price of writing correct
calculations in XPath.

We do support this in our implementation so we have some experience.

So I did contact Mike Kay and the thread is attached with his
permission. In essence:

- There is nothing in XPath or XDM which prevents us from providing
type information to XPath expressions in XForms when available.

- Mike would expect atomization to automatically yield the typed value
of the node, as we suggested.

- This is not dependent on whether type information is known in
advance statically. Types can be provided at runtime, when the
expression runs.

- XForms has a mutable data model, but as long as the instance is
immutable for the duration of the evaluation of one XPath expression,
we are fine.

This encourages me to think that we should specify how type
information is made available to XPath expressions starting with XPath
2. I am willing to propose spec text for this.

One thing that is new from Mike is the following idea:

"where the field is calculated by an XPath expression, I think I would
expect the result of the XPath expression to be coerced to the
specified type using the XPath function conversion rules. (Or perhaps
an even stronger conversion, namely casting, would be appropriate.)"

This is an interesting idea for calculate. I am wondering if this
should also be considered when values are set by the `xf:setvalue`
action or a control.

Comments welcome,

-Erik

Forwarded conversation
Subject: Exposing type information to XPath expressions in XForms 2.0
------------------------

From: Erik Bruchez <ebruchez@orbeon.com>
Date: Wed, Aug 14, 2013 at 10:39 AM
To: Michael Kay <mike@saxonica.com>


Hi Mike,

I hope you are doing well.

I have been tasked by the XForms WG to ask you for feedback on this. I
hope it won't be too hard to answer.

In XForms 2, we are adding support for XPath 2. One area which is not
covered yet is whether and/or how the XForms `type` property should
influence the evaluation of XPath expressions.

To give you an example:

<xf:model>
    <xf:instance>
        <data>
            <foo>3.14</foo>
            <bar/>
            <baz/>
        </data>
    </xf:instance>
    <xf:bind ref="foo" type="xs:decimal"/>
    <xf:bind ref="bar" type="xs:date"/>
    <xf:bind ref="baz" calculate="../foo * 2"/>
</xf:model>

<xf:input ref="foo"/>
<xf:input ref="bar"/>
<xf:output value="baz"/>

In XForms 1.1 and, so far, 2.0, the type information is used for:

- data validation (which marks controls as invalid and influences xf:submission)
- controlling the appearance of UI controls (e.g. a date picker for the
xs:date value above)

But so far, it isn't used to influence the evaluation of XPath
expressions at all.

Take the `calculate` expression above. It accesses the value of the
`foo` element. It certainly would be great to be able to use the
*typed value* of it, as it avoids annoying casts. And we have noticed
after years of user feedback that casts are less than ideal and users
tend to write more incorrect expressions than they should.

So in our implementation we have implemented support for type
information. It works as follows:

- XPath expressions are compiled without knowledge of the type information.

- At runtime, when an expression accesses the typed value of a node,
if the actual data does not satisfy the type, an internal type
exception is raised. Otherwise, the typed value is returned and
expression evaluation continues.

- An expression can always access the string value of a node with the
string() function.

- An expression that fails because of a type exception is handled
depending on context. For example, a `calculate` sets a blank result
value, and a `constraint` expression is considered to be not
satisfied.
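
The behavior listed above can be modeled roughly as follows. This is a
minimal Python sketch, not the actual implementation: `TypeException`,
`typed_value`, `run_calculate` and `run_constraint` are hypothetical
names, only `xs:decimal` is handled, and returning the string value
for untyped nodes is a simplification.

```python
import re
from decimal import Decimal

# Lexical form of xs:decimal: optional sign, digits, optional fraction.
XS_DECIMAL = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")


class TypeException(Exception):
    """Hypothetical stand-in for the internal type exception."""


def typed_value(text, declared_type):
    """Return the typed value of a node's text, or raise TypeException."""
    if declared_type is None:
        return text  # simplification: untyped nodes yield their string value
    if declared_type == "xs:decimal":
        collapsed = text.strip()
        if XS_DECIMAL.fullmatch(collapsed) is None:
            raise TypeException(f"{text!r} is not a valid xs:decimal")
        return Decimal(collapsed)
    raise NotImplementedError(declared_type)


def run_calculate(compute):
    """A calculate that hits a type exception yields a blank value."""
    try:
        return str(compute())
    except TypeException:
        return ""


def run_constraint(check):
    """A constraint that hits a type exception is not satisfied."""
    try:
        return bool(check())
    except TypeException:
        return False


# Models calculate="../foo * 2" with foo bound to xs:decimal.
print(run_calculate(lambda: typed_value("3.14", "xs:decimal") * 2))  # 6.28
print(run_calculate(lambda: typed_value("n/a", "xs:decimal") * 2))   # blank
print(run_constraint(lambda: typed_value("", "xs:decimal") > 0))     # False
```

The type check happens only when the value is accessed, which mirrors
the point that expressions are compiled without knowledge of the type
information.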

I have also written a blog post which covers some of the above:

  http://blog.orbeon.com/2013/01/better-formulas-with-xpath-type.html

This approach has worked very well in our implementation using the
Saxon XPath implementation, which motivates the desire for
standardization.

The two questions we have for you are:

1. Do you see anything in this approach which might not be compatible
with the XPath specification? For example:

    - The fact that, statically, an expression may not have type
information available for the data it accesses, yet at runtime, typed
values are provided?

    - Since the data model in XForms is mutable, and binds are
re-applied, a given expression might run in situations where different
data and different types are provided.

2. Do you think that, at a high-level, the approach described is
reasonable, at least as far as XPath is concerned?

Any comment you might have on this will be welcome.

Thanks and regards,

-Erik

----------
From: Michael Kay <mike@saxonica.com>
Date: Wed, Aug 14, 2013 at 1:29 PM
To: Erik Bruchez <ebruchez@orbeon.com>


I'm rather surprised to read that you can say type="xs:decimal" on a
field and then have something in there that isn't a decimal. I would
have expected this (a) to impose a constraint that any value in the
field is a decimal, and (b) to ensure that the typed value of the
element in question is xs:decimal, meaning that atomization
automatically yields a decimal. But I guess that might be too crude a
way to deal with error handling?

And where the field is calculated by an XPath expression, I think I
would expect the result of the XPath expression to be coerced to the
specified type using the XPath function conversion rules. (Or perhaps
an even stronger conversion, namely casting, would be appropriate.)

>
> The two questions we have for you are:
>
> 1. Do you see anything in this approach which might not be compatible
> with the XPath specification? For example:
>
>   - The fact that, statically, an expression may not have type
> information available for the data it accesses, yet at runtime, typed
> values are provided?

I don't think there's any suggestion in XPath or XDM that a node can
only have a typed value if there is static type information available.
>
>   - Since the data model in XForms is mutable, and binds are
> re-applied, a given expression might run in situations where different
> data and different types are provided.

I'm not sure what the execution model is here, but I think the only
constraint in XPath is that the instance is immutable for the duration
of the evaluation of one XPath expression.
>
> 2. Do you think that, at a high-level, the approach described is
> reasonable, at least as far as XPath is concerned?
>
It makes sense to me.

Mike


----------
From: Erik Bruchez <ebruchez@orbeon.com>
Date: Wed, Aug 14, 2013 at 2:06 PM
To: Michael Kay <mike@saxonica.com>


Hi Mike,

Thanks for the quick reply.

> I'm rather surprised to read that you can say type="xs:decimal" on a field and then have something in there that isn't a decimal. I would have expected this (a) to impose a constraint that any value in the field is a decimal, and

In XForms an XML document (AKA XForms instance) can have validation
constraints (in the general sense, including types, XPath constraints,
and requiredness), which at a given moment in time might not be
satisfied.

In particular the `type` attribute is a way to express the desired
type of the value. There is nothing that prevents initial data,
instance replacements, controls, or the setvalue, insert and delete
actions, from manipulating the instance in ways that cause constraints
to fail.

You can see a typical XForms session as working on an XML document
with constraints which gradually get satisfied as the user enters
data, until the document validates and can be submitted (that's only a
minimalist XForms scenario of course).

> (b) to ensure that the typed value of the element in question is xs:decimal, meaning that atomization automatically yields a decimal.

This is exactly what we are suggesting for XPath 2/XForms 2. So that's good.

To illustrate how to handle nodes that might not have the right type,
I show two scenarios in this example:

https://gist.github.com/ebruchez/6234199

The first one:

    sum((../a, ../b, ../c))

blows up if any node is not an xs:decimal. Blowing up means that the
calculation fails and the resulting value is blanked.

And:

    sum((../a, ../b, ../c)[string() castable as xs:decimal], 0.0)

always succeeds, by testing the actual type of the data.
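
For readers who want to see the difference between the two expressions
outside of XPath, here is a rough Python model. The names
`castable_as_decimal`, `strict_sum` and `lenient_sum` are made up for
this sketch, `Decimal` stands in for `xs:decimal`, and the inputs are
the string values of `a`, `b` and `c`.

```python
import re
from decimal import Decimal

# Lexical form of xs:decimal: optional sign, digits, optional fraction.
XS_DECIMAL = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")


def castable_as_decimal(text):
    """Rough equivalent of `string() castable as xs:decimal`."""
    return XS_DECIMAL.fullmatch(text.strip()) is not None


def strict_sum(values):
    """Models sum((../a, ../b, ../c)): one bad value blanks the result."""
    if not all(castable_as_decimal(v) for v in values):
        return ""  # the calculation fails and the value is blanked
    return str(sum((Decimal(v.strip()) for v in values), Decimal("0")))


def lenient_sum(values):
    """Models the filtered sum: bad values are skipped, 0.0 is the default."""
    good = [Decimal(v.strip()) for v in values if castable_as_decimal(v)]
    return str(sum(good, Decimal("0.0")))


print(strict_sum(["1.5", "2.25", "3"]))   # 6.75
print(strict_sum(["1.5", "2.25", "x"]))   # blank
print(lenient_sum(["1.5", "2.25", "x"]))  # 3.75
print(lenient_sum(["x", "", "y"]))        # 0.0
```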

> And where the field is calculated by an XPath expression, I think I would expect the result of the XPath expression to be coerced to the specified type using the XPath function conversion rules. (Or perhaps an even stronger conversion, namely casting, would be appropriate.)

Good point, that would also make sense. We'll have to think about it.

> I don't think there's any suggestion in XPath or XDM that a node can only have a typed value if there is static type information available.

Good to hear.

>>   - Since the data model in XForms is mutable, and binds are
>> re-applied, a given expression might run in situations where different
>> data and different types are provided.
>
> I'm not sure what the execution model is here, but I think the only constraint in XPath is that the instance is immutable for the duration of the evaluation of one XPath expression.

Which XForms does ensure, and I think requires for extension functions as well.

>> 2. Do you think that, at a high-level, the approach described is
>> reasonable, at least as far as XPath is concerned?
>>
> It makes sense to me.

Excellent, thanks for your feedback!

Best,

-Erik

Received on Thursday, 15 August 2013 22:21:16 UTC