> Personally I want that query to fail with an error, because
> my data is not well formed. Of course, then I want to go and
> fix the bug in my W3C XML Schema instance that allowed the
> incorrect data. Or, I go and fix the bug in my query that
> failed to allow for the actual permitted variation in my data.
>
> In other words, my query has to match the data, and if it
> does not, I don't want the query processer to bend over
> backwards to prevent me from discovering my error. It's *my*
> error in writing the query incorrectly, or in loading bad data.

I would argue that there is no error here. In the digital version of
the UK 1881 census, age in years is represented as an integer, e.g. 3, and
age in months for infants is represented as a string in the form "3m". In
XML Schema, we would model it as a union type. If you want to find the
people who are three years old, it seems perfectly reasonable to search for
"age=3", and if you want to search for those who are three months, it seems
reasonable to search for "age='3m'". Values that conform to a different
branch of the union type than the one you are searching for are not in
error; they are simply of no interest to you.
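To make the intended behaviour concrete, here is a minimal sketch in Python (not XQuery), assuming the age field holds either an integer number of years or a string such as "3m". The sample values and the helper name `matches` are hypothetical, invented only for this illustration. A value from the other branch of the union simply fails to match; nothing raises an error.

```python
from typing import List, Union

# Union type: years as an int, or months as a string such as "3m".
Age = Union[int, str]

# Hypothetical sample values, not taken from the census data.
ages: List[Age] = [3, "3m", 41, "11m", 3]


def matches(value: Age, wanted: Age) -> bool:
    """Equality across a union: values on a different branch
    are just non-matches, never an error."""
    return type(value) is type(wanted) and value == wanted


three_years = [a for a in ages if matches(a, 3)]
three_months = [a for a in ages if matches(a, "3m")]
```

Searching for 3 selects only the integer values, and searching for "3m" selects only the month strings.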

Of course, this doesn't solve all the problems in handling a union type; in
an ideal world I would like the query "age<3" to include all the infants
whose age is given in months. I can do that with a function. But I shouldn't
have to use such a function for every query on a union type.
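The kind of function I have in mind might look like the following sketch, again in Python rather than XQuery. It assumes the two representations described above (an integer for years, "Nm" for months); the names `age_in_months` and `younger_than_years` and the sample values are hypothetical.

```python
from typing import List, Union

# Union type: years as an int, or months as a string such as "3m".
Age = Union[int, str]


def age_in_months(age: Age) -> int:
    """Normalise either branch of the union to a number of months."""
    if isinstance(age, int):
        return age * 12
    if isinstance(age, str) and age.endswith("m") and age[:-1].isdigit():
        return int(age[:-1])
    raise ValueError(f"not a valid age: {age!r}")


def younger_than_years(age: Age, years: int) -> bool:
    """The 'age < N' comparison, with infants given in months included."""
    return age_in_months(age) < years * 12


# Hypothetical sample values, not taken from the census data.
ages: List[Age] = [3, "3m", 2, "11m", 41]
under_three = [a for a in ages if younger_than_years(a, 3)]
```

Here the infants recorded in months are included alongside the two-year-old, which is what a plain "age<3" would not give you.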

And yes, the database designers could have modeled this field as a
yearMonthDuration. But they didn't (and I don't see why they should).

Michael Kay