[Bug 13747] New: [XPath 3.0] Determinism of expressions returning constructed nodes from bugzilla@jessica.w3.org on 2011-08-10 (public-qt-comments@w3.org from August 2011)

From: <bugzilla@jessica.w3.org>
Date: Wed, 10 Aug 2011 10:20:10 +0000
To: public-qt-comments@w3.org
Message-ID: <bug-13747-523@http.www.w3.org/Bugs/Public/>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=13747

           Summary: [XPath 3.0] Determinism of expressions returning
                    constructed nodes
           Product: XPath / XQuery / XSLT
           Version: Working drafts
          Platform: PC
        OS/Version: Windows NT
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XPath 3.0
        AssignedTo: jonathan.robie@gmail.com
        ReportedBy: mike@saxonica.com
         QAContact: public-qt-comments@w3.org


This bug is prompted by bug #13494, which raises issues against XSLT 2.0, but
there are some more specific issues in the 3.0 specifications (XSLT and XQuery
more than XPath) that need to be addressed.

I think XQuery 1.0 tries to ensure that expressions whose value is a newly
constructed node construct a new node each time they are evaluated. This
prevents optimizations such as loop-lifting and
common-subexpression-refactoring, which replace multiple evaluations of such an
expression with a single evaluation. Such a restriction on the optimizer is
acceptable so long as it is possible to analyze statically where the
restriction applies; it becomes a serious problem once we have more dynamic
constructs (such as calls on function items) where the static analysis is not
possible. As bug #13494 demonstrates, the introduction of generate-id() to
XQuery also introduces a new set of challenges for the optimizer.
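
To make the optimizer's problem concrete, here is a minimal sketch in Python rather than XQuery. It assumes that Python object identity is a fair stand-in for XDM node identity; the names `Node`, `construct_a`, `per_iteration`, `hoisted` and `distinct_nodes` are hypothetical and exist only for this illustration. It shows that moving a constructor out of a loop changes how many distinct nodes the result contains, which is why the rewrite is observable.

```python
class Node:
    """Stand-in for a constructed element node; identity is Python object identity."""

    def __init__(self, name):
        self.name = name


def construct_a():
    # Models evaluating a constructor such as <a/>: a fresh node per evaluation.
    return Node("a")


def per_iteration(n):
    # Unoptimized form: the constructor is evaluated once per iteration.
    return [construct_a() for _ in range(n)]


def hoisted(n):
    # After loop-lifting: the constructor is evaluated once, outside the loop.
    node = construct_a()
    return [node for _ in range(n)]


def distinct_nodes(seq):
    # Count nodes by identity, not by value. All nodes in seq are alive at
    # this point, so their id() values are guaranteed to be distinct.
    return len({id(x) for x in seq})


unoptimized_count = distinct_nodes(per_iteration(3))  # three distinct nodes
hoisted_count = distinct_nodes(hoisted(3))            # one node, repeated
```

Any query that compares nodes by identity, or counts them after deduplication, can tell the two forms apart, so under the strict rule the optimizer may only hoist when it can prove that no such observation is made.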

It's not clear what our rules are here. We've introduced the possibility of
type annotations to say that function items are non-deterministic, but this is
all implementation-defined territory. I suspect that the rules that require a
strict interpretation of constructed node identity have actually disappeared
with the formal semantics, and we are left with folklore as to what the
expected behaviour is, rather than a clear statement in the specs.

I'd like to suggest the possibility of abandoning the rule (that multiple
evaluations must return distinct nodes) entirely. That is, make it
implementation-dependent whether two evaluations of the same expression with
the same operands and the same context should return the same constructed node
or different constructed nodes. I find it hard to imagine a real application
that would be adversely affected by this change, other than the kind of
application discussed in bug #13494 that is deliberately taking advantage of
the rule in order to construct non-deterministic functions.
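
The difference between the current rule and the proposed one can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of any real processor: the `uid` field is a hypothetical stand-in for what generate-id() would expose, and `evaluate`, `tokens` and the `reuse` flag are invented names. With `reuse=False` every evaluation yields a new node (the strict rule); with `reuse=True` the implementation exercises the proposed freedom to return the same node again.

```python
import itertools

_ids = itertools.count(1)


class Node:
    """Stand-in for a constructed node with an observable identity token."""

    def __init__(self, name):
        self.name = name
        self.uid = next(_ids)  # hypothetical analogue of generate-id()


def evaluate(cache, name, reuse):
    # reuse=False: always construct a new node (strict rule).
    # reuse=True: return the earlier node for the same expression, if any
    # (one behaviour the proposed implementation-dependent rule would allow).
    if reuse and name in cache:
        return cache[name]
    node = Node(name)
    cache[name] = node
    return node


def tokens(reuse):
    # Evaluate the "same expression" twice and report what a caller can observe.
    cache = {}
    first = evaluate(cache, "a", reuse)
    second = evaluate(cache, "a", reuse)
    return first.uid, second.uid, first is second


strict = tokens(reuse=False)   # two different uids, identity test is False
relaxed = tokens(reuse=True)   # same uid twice, identity test is True
```

Code that only inspects the content of the constructed node behaves identically in both cases. Only code that reads the identity token, or tests identity directly, sees a difference, which is the kind of application the paragraph above describes.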

The benefit of making this change would be to simplify the semantics of the
language and make it easier to perform strong optimizations such as
loop-lifting.

Received on Wednesday, 10 August 2011 10:20:10 UTC