Re: [expath] MongoDB Module: Working Draft

Hi Hans-Jürgen,

Finally some feedback:

> Could you give an example where a non-empty value of the function call might
> disturb the client? If he is not interested, he either does not assign the
> value to a variable, or, if the call is within a let or for clause, simply
> does not use the bound variable. I just have not yet understood the issue.

One general problem with all non-deterministic and side-effecting
expressions in XQuery is that their behavior is not formalized in the
XQuery specification (and I doubt that anyone will tackle this in
the foreseeable future).

Let's take an example. In the following query...

  let $a := local:a()
  let $b := local:b()
  return ()

...a query processor may decide to first evaluate local:b() and then
local:a() without violating the rules of the language. It may also
skip the evaluation of all expressions, because the result will be
an empty sequence anyway.

In the MongoDB spec, we added a simple sentence saying that "A
query processor must ensure that non-deterministic functions are not
relocated or rewritten in the query, and that its results are not
cached at runtime." This rule is by no means complete, but it is
supposed to indicate that the order in which a user has written down
function calls in the original query should be preserved. Currently,
it does not say whether an expression must be evaluated at all, i.e.
that it must not be optimized away. That could be added, but in the
end I believe that a general EXPath document may be a better place
for it.

Back to the original problem statement: We observed that a client may
not be interested in any result output if a query is only updating. If
a user inserts some data and runs a MongoDB query, the expression
could look as follows:

  let $id := mongo:connect(...)
  return (mongo:insert(...), mongo:find(...))

We could "swallow" the result of an updating function by e.g. adding a
false() predicate to the insert expression...

  mongo:insert(...)[false()]

...but we then need to be sure that it is not optimized away by the
query processor. The same applies if we bind the result of the
updating function to a dummy variable:

  let $id := mongo:connect(...)
  let $_ := mongo:insert(...)
  return mongo:find(...)

I need to add, though, that it could even apply in my first example (a
query processor could check if the result of a function call will be
an empty sequence, and skip the call), but it may not be as obvious.

In a nutshell: The general challenge here is much broader. My
practical approach would be to define functions in such a way that
only non-updating functions return results. This is the way it has
been done in the other EXPath modules so far.

Feedback is welcome as usual,
Christian

Received on Sunday, 15 March 2015 12:04:48 UTC