RE: XQuery

Paul already forwarded the citations from the SQL-2003 FDIS spec. I am
just going to add that SQL implementations actually make use of that
non-deterministic behaviour to choose better performing plans.
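
To make that concrete, here is a small Python sketch (not SQL, and not taken from the spec) of the same conjunctive filter evaluated in two predicate orders. One order hits a dynamic error, the other does not. The rows and the names plan_a, plan_b and run are made up for the illustration.

```python
# Hypothetical rows: "qty" is zero only in the row that "kind" filters out.
rows = [
    {"kind": "a", "total": 10, "qty": 2},
    {"kind": "b", "total": 5, "qty": 0},
]

def is_kind_a(row):
    return row["kind"] == "a"

def unit_price_above_3(row):
    # Raises ZeroDivisionError when qty is 0 (the "dynamic error").
    return row["total"] / row["qty"] > 3

def plan_a(rows):
    # Selective predicate first: the bad row never reaches the division.
    return [r for r in rows if is_kind_a(r) and unit_price_above_3(r)]

def plan_b(rows):
    # Same conjunction, other order: the division runs on every row.
    return [r for r in rows if unit_price_above_3(r) and is_kind_a(r)]

def run(plan, rows):
    # Report either the result rows or the fact that an error was raised.
    try:
        return plan(rows)
    except ZeroDivisionError:
        return "error"

print(run(plan_a, rows))
print(run(plan_b, rows))
```

Both plans express the same filter, so an optimizer that is free to pick either one must also be free to raise the error or not.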

However, I agree that fewer errors and more empty sequences help in
reducing the need for non-deterministic behaviour while removing the
ability to tell the user at runtime that he may have made a mistake (one
reason that some implementations will probably often raise static errors
where the chance of runtime mistakes is too big or the performance
impact would be too large).

However, given that all predicates in XQuery are existentially
quantified, I do not think you will really gain much from three-valued
logic per se (although you gain a little if you use the value comparisons
with () as the third value). I fully agree, for example, that the general
comparison operations (=, < etc.) should not have been existentially
quantified, but we inherited that from XPath 1.0, as we did the semantics
of [] (after which we modeled the semantics of where).
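
A rough Python sketch of that point, with sequences modeled as lists and () as the empty list. The function names general_eq, value_eq and ebv are mine, and ebv only covers the cases used here, not the full effective boolean value rules.

```python
def general_eq(xs, ys):
    """Existential comparison, like `=`: true if any pair of items matches."""
    return any(x == y for x in xs for y in ys)

def value_eq(xs, ys):
    """Like `eq`: at most one item per side, () if either side is ()."""
    if len(xs) > 1 or len(ys) > 1:
        raise TypeError("value comparison needs at most one item per operand")
    if not xs or not ys:
        return []  # the empty sequence plays the role of the third value
    return [xs[0] == ys[0]]

def ebv(seq):
    """Simplified effective boolean value, as a predicate or where applies it."""
    if not seq:
        return False
    return bool(seq[0])

print(general_eq([], [3]))      # existential test over no items
print(value_eq([], [3]))        # third value survives outside a predicate
print(ebv(value_eq([], [3])))   # and is lost again inside one
```

The only place the empty result of value_eq stays visible is outside a predicate, which is the small use case mentioned in the quoted discussion below.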

Adding yet another value to represent the third truth-value, since () is
caught in the XPath 1.0 backwards-compatibility semantics, is also not a
good idea, since it adds lots of complexity to the language without
giving anything that is worth the trouble. This is kind of like NaN: a
value that I would not have added (since it basically is NULL in SQL and
() in XQuery for all intents and purposes), but that I now have forced
upon me by IEEE rules whose authors weren't aware that the data modeling
people already had ways to deal with the concept of "not a value".

Best regards
Michael

> -----Original Message-----
> From: public-qt-comments-request@w3.org [mailto:public-qt-comments-
> request@w3.org] On Behalf Of Guido Moerkotte
> Sent: Friday, October 17, 2003 3:20 AM
> To: Liam Quin; public-qt-comments@w3.org
> Cc: Michael Brundage
> Subject: Re: XQuery
> 
> 
> 
> Michael Kay writes:
> > I think all database query languages have to allow the implementation to
> > choose efficient access paths, and different access paths might or might not
> > hit errors. Therefore, I don't think there's any practical alternative to
> > saying that dynamic errors will depend to some extent on the implementation.
> > Note that database query languages have much more stringent requirements
> > here than conventional programming languages, because optimizations (eg. of
> > joins) give performance improvements of a factor of 1000000, not just a
> > factor of 2.
> 
> Michael Rys writes:
> > a) SQL is also indeterministic (non-deterministic is probably the
> > correct English word, no?) as are other database query languages for the
>   /* yes non-deterministic is better but indeterministic is easier to type */
> > reasons Michael explains below.
> 
> The point is that you need many different query execution plans that are
> equivalent to the original query. Theoretically, equivalence means that they
> give the same result for all databases. (In practice this equivalence is
> defined by the spec.)
> The huge search space (the set of all plans equivalent to the query) of
> relational systems is great for query optimization.
> What are the reasons for the huge search space?
> 1) Joins: on sets and bags they commute and are associative
>    /* note that order preserving joins do not commute */
> 2) reorderability: one can reorder selections, joins, maps freely and
>    some other operators under certain conditions.
> This has *NOTHING* to do with indeterminism. In fact, there are only very
> few cases where SQL is indeterministic and these cases do *NOT* contribute
> to the search space.
> Of course, as Michael Brundage points out, if runtime errors are present,
> indeterminism is necessary to retain the search space.
> I couldn't agree more with what he writes.
> My whole point is that indeterminism should be kept at a minimum and I
> think the use of NULL values and three-valued logic can help here.
> 
> Michael Kay writes:
> > The adverse effects of indeterminism can be reduced if you reduce the number
> > of run-time errors, though this has to be balanced against the desirability
> > of reporting genuine programmer errors rather than giving wrong answers. In
> > some cases, like this one, I think we are reporting errors that would be
> > better treated as not being errors; and it would be perfectly feasible to
> > define the rules so that comparing an untyped value against a typed value
> > testing castability to the required type before attempting the conversion.
> 
> I agree with Michael Rys that this is not a nice solution.
> (see also my previous email.)
> 
> Michael Rys writes:
> > b) Actually the reason why we removed three-valued logic was that
> > all predicates in XQuery and XPath are existentially quantified which
> > always maps the third truth-value to false. The only expression that
> > thus would preserve the third-value was an expression of the form $a eq
> > 3 (if $a is ()) that is not part of a predicate. We considered this to
> > be a very small use case...
> 
> > In addition, in XPath/XQuery, = and predicates in general are mapped to
> > an existential quantification, so even if you have three valued logic,
> > existential quantification normally maps the third-value to false. This
> > design is in the language for two reasons:
> >
> > 1. XPath 1.0 did it.
> > 2. Existential quantification on predicates is the more natural
> > semantics if you operate on semi-structured data.
> 
> Well, seems everybody has problems with existential quantification and
> three-valued logic. Before I come back to this point let me say the
> following:
> 
> I very much like Michael Kay's example/argument on (explicit/implicit)
> union types.
> I think this is also a strong case against runtime exceptions.
> 
> Now existential quantification:
> The problem is not with the existential quantification but with the
> application of existential quantification where it is not needed.
> Let us denote by
>   S_=_S comparison of two sequences
>     =_S  testing of membership of a single item in a sequence
>     =_2 comparison of single items with 2-valued result
>     =_3 comparison of single items with 3-valued result
> Let us denote by x,y single items and by X and Y sequences.
> 
> Note that XQuery identifies
>   single items with singleton sequences containing that single item.
> Hence, (3) is the same as 3. (something I personally dislike)
> 
> From this we could conclude that there are no single items and
> everything is a sequence.
> Then, the XQuery comparison = always applies existential semantics S_=_S
> where X S_=_S Y in XQuery is defined as:
>    X S_=_S Y ::= exists x in X, y in Y x =_2 y
> As pointed out by Michael:
> Even if we define
>    X S_=_S Y ::= exists x in X, y in Y x =_3 y
> this does not help much.
> 
> But how about:
>   X S_=_S Y ::=
>      x =_S Y  if (X = (x) or X = x) and Y is neither a single item nor a singleton sequence
>      y =_S X  if (Y = (y) or Y = y) and X is neither a single item nor a singleton sequence
>      x =_3 y  if (X = (x) or X = x) and (Y = (y) or Y = y)
>      exists x in X, y in Y x =_3 y else
> with
>   x =_S Y ::=
>     UNKNOWN  if x is NULL
>     x =_3 y if (Y = (y) or Y = y)
>     exists y in Y x =_3 y else
> 
> Best,
>  Guido
> 

Received on Friday, 17 October 2003 21:44:47 UTC