RE: XQuery

> In fact, there are only very few cases where SQL is indeterministic
and 
> these cases do *NOT* contribute to the search space.

I have to disagree with this point.  As proof I cite various locations
from the most recent FDIS for ISO 9075 that discuss determinism of the
SQL language:

Subclause 4.16 "Determinism" of ISO 9075 states:

===
4.16 Determinism

In general, an operation is deterministic if that operation assuredly
computes identical results when repeated with identical input values.
For an SQL-invoked routine, the values in the argument list are regarded
as the
input; otherwise, the SQL-data and the set of privileges by which they
are accessed is regarded as the input.  Differences in the ordering of
rows, as permitted by General Rules that specify
implementation-dependent
behavior, are not regarded as significant to the question of
determinism.

NOTE 28 - Transaction isolation levels have a significant impact on
determinism, particularly isolation levels other than SERIALIZABLE.
However, this International Standard does not address that impact,
particularly because of the difficulty in clearly specifying
that impact without appearing to mandate implementation techniques (such
as row or page locking) and because different SQL-implementations
almost certainly resolve the issue in significantly different ways.

Recognizing that an operation is deterministic is a difficult task, it
is in general not mandated by this International Standard. SQL-invoked
routines are regarded as deterministic if the routine is declared to be
DETERMINISTIC; that is, the SQL-implementation trusts the definer of the
SQL-invoked routine to correctly declare that the routine is
deterministic. For other operations, this International Standard does
not label an operation as deterministic; instead it identifies certain
operations as "possibly non-deterministic". Specific definitions can be
found in other subclauses relative to <value expression>, <table
reference>, <table primary>, <query specification>, <query expression>,
and <SQL procedure statement>.

Certain <boolean value expression>s are identified as "retrospectively
deterministic". A retrospectively deterministic <boolean value
expression> has the property that if it is True at one point time, then
it is True for all later points in time if re-evaluated for the
identical SQL-data by an arbitrary user with the identical set of
privileges. The precise definition is found in Subclause 6.34, "<boolean
value expression>".
===

Subclause 4.27 "SQL-invoked routines" provides the information on SQL
functions:

===
An SQL-invoked routine is either deterministic or possibly
non-deterministic. An SQL-invoked function that is deterministic always
returns the identical return value for a given list of SQL argument
values. An SQL-invoked procedure that is deterministic always returns
the identical values in its output and inout SQL parameters for a given
list of SQL argument values. An SQL-invoked routine is possibly
non-deterministic if it might
produce nonidentical results when invoked with the identical list of SQL
argument values.
===
 
Subclause 6.25 <value expression> provides the following description of
those <value expression>s that are possibly non-deterministic:

===
7) A <value expression> or <nonparenthesized value expression primary>
is possibly non-deterministic if it generally contains any of the
following:

a) A <datetime value function>.

b) A <next value expression>.

c) A <cast specification> that either is, or recursively implies through
the execution of the General Rules of Subclause 6.12, "<cast
specification>", one of the following:

  i) A <cast specification> whose result type is datetime with time zone
and whose <cast operand> has declared type that is not datetime with
time zone.

  ii) A <cast specification> whose result type is an array type and
whose <cast operand> has a declared type that is a multiset type.

d) An <array value constructor by query>.

e) A <datetime factor> that simply contains a <datetime primary> whose
declared type is datetime without time zone and that simply contains an
explicit <time zone>.

f) An <interval value expression> that computes the difference of a
<datetime value expression> and a <datetime term>, such that the
declared type of one operand is datetime with time zone and the other
operand is datetime without time zone.

g) A <comparison predicate>, <overlaps predicate>, or <distinct
predicate> simply containing <row value predicand>s RVP1 and RVP2 such
that the declared types of RVP1 and RVP2 have corresponding constituents
such that one constituent is datetime with time zone and the other is
datetime without time zone.

NOTE 110 - This includes <between predicate> because of a syntactic
transformation to <comparison predicate>.

h) A <quantified comparison predicate> or a <match predicate> simply
containing a <row value predicand> RVP and a <table subquery> TS such
that the declared types of RVP and TS have corresponding constituents
such that one constituent is datetime with time zone and the other is
datetime without time zone.

NOTE 111 - This includes <in predicate> because of a syntactic
transformation to <quantified comparison predicate>.

i) A <member predicate> simply containing a <row value predicand> RVP
and a <multiset value expression> MVP such that the declared type of the
only field F of RVP and the element type of MVP have corresponding
constituents such that one constituent is datetime with time zone and
the other is datetime without time zone.

j) A <submultiset predicate> simply containing a <row value predicand>
RVP and a <multiset value expression> MVP such that the declared type of
the only field F of RVP and the declared type of MVP have corresponding
constituents such that one constituent is datetime with time zone and
the other is datetime without time zone.

k) A <multiset value expression> that specifies or implies MULTISET
UNION, MULTISET EXCEPT, or MULTISET INTERSECT such that the element
types of the operands have corresponding constituents such that one
constituent is datetime with time zone and the other is datetime without
time zone.

l) A <value specification> that is CURRENT_USER, CURRENT_ROLE,
SESSION_USER, SYSTEM_ USER, or CURRENT_PATH.

m) A <routine invocation> whose subject routine is an SQL-invoked
routine that is possibly non-deterministic.

n) An <aggregate function> that specifies MIN or MAX and that simply
contains a <value expression> whose declared type is based on a
character string type, user-defined type, or datetime with time zone
type.

o) An <aggregate function> that specifies INTERSECTION and that simply
contains a <value expression> whose declared element type is based on a
character string type, a user-defined type, or a datetime type with time
zone.

p) A <multiset value expression> that specifies MULTISET UNION DISTINCT,
MULTISET EXCEPT, or MULTISET INTERSECT and whose result type's declared
element type is based on character string type, a user-defined type, or
a datetime type with time zone.

q) A <multiset set function> whose declared element type is based on a
character string type, a userdefined type, or a datetime type with time
zone.

r) A <window function> that specifies ROW_NUMBER or whose associated
<window specification> specifies ROWS.

s) A <query specification> or <query expression> that is possibly
non-deterministic.
===

I will not bother to quote the rules for <table reference>, <table
primary>, <query specification>, <query expression>, and <SQL procedure
statement> since they are similar to those above.  

I think a simple review of the SQL standard indicates there are MANY
places where ISO 9075 SQL expressions are possibly non-deterministic. 
 
/paulc

Paul Cotton, Microsoft Canada 
17 Eleanor Drive, Nepean, Ontario K2E 6A3 
Tel: (613) 225-5445 Fax: (425) 936-7329 
mailto:pcotton@microsoft.com

  

Received on Friday, 17 October 2003 10:04:48 UTC