- From: <bugzilla@jessica.w3.org>
- Date: Wed, 10 Aug 2011 10:20:10 +0000
- To: public-qt-comments@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=13747
Summary: [XPath 3.0] Determinism of expressions returning constructed nodes
Product: XPath / XQuery / XSLT
Version: Working drafts
Platform: PC
OS/Version: Windows NT
Status: NEW
Severity: normal
Priority: P2
Component: XPath 3.0
AssignedTo: jonathan.robie@gmail.com
ReportedBy: mike@saxonica.com
QAContact: public-qt-comments@w3.org
This bug is prompted by bug #13494, which raises issues against XSLT 2.0, but
there are some more specific issues in the 3.0 specifications (XSLT and XQuery
more than XPath) that need to be addressed.
I think XQuery 1.0 tries to ensure that expressions whose value is a newly
constructed node construct a new node each time they are evaluated. This
prevents optimizations such as loop-lifting and
common-subexpression-refactoring, which replace multiple evaluations of such an
expression with a single evaluation. Such a restriction on the optimizer is
acceptable so long as it is possible to analyze statically where the
restriction applies; it becomes a serious problem once we have more dynamic
constructs (such as calls on function items) where the static analysis is not
possible. As bug #13494 demonstrates, the introduction of generate-id() to
XQuery also introduces a new set of challenges for the optimizer.
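To make the conflict concrete, here is a minimal sketch, written in Python
rather than XQuery, of how lifting a node constructor out of a loop changes
what an identity test can observe. It is only a model under stated
assumptions: Node, construct, unoptimized, loop_lifted and
distinct_identities are hypothetical names chosen for illustration, not part
of any specification or processor.

```python
class Node:
    """Hypothetical stand-in for a constructed node such as <a/>.

    Each Python object has its own identity, which models node identity.
    """

    def __init__(self, name):
        self.name = name


def construct():
    # Models a node constructor: a fresh node on every evaluation.
    return Node("a")


def unoptimized(n):
    # Evaluates the constructor once per iteration, in the spirit of
    #   for $i in 1 to n return <a/>
    return [construct() for _ in range(n)]


def loop_lifted(n):
    # The rewrite an optimizer would like to make: evaluate the
    # constructor once outside the loop and reuse the result.
    node = construct()
    return [node for _ in range(n)]


def distinct_identities(nodes):
    # Counts nodes by identity, not by content.
    return len({id(node) for node in nodes})


if __name__ == "__main__":
    print(distinct_identities(unoptimized(3)))  # one node per iteration
    print(distinct_identities(loop_lifted(3)))  # a single shared node
```

The two results have identical content and differ only under an identity
test (the analogue of the "is" operator), so the rewrite is safe only if
nothing downstream can observe node identity.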
It's not clear what our rules are here. We've introduced the possibility of
type annotations to say that function items are non-deterministic, but this is
all implementation-defined territory. I suspect that the rules that require a
strict interpretation of constructed node identity have actually disappeared
with the formal semantics, and we are left with folklore as to what the
expected behaviour is, rather than a clear statement in the specs.
I'd like to suggest the possibility of abandoning the rule (that multiple
evaluations must return distinct nodes) entirely. That is, make it
implementation-dependent whether two evaluations of the same expression with
the same operands and the same context should return the same constructed node
or different constructed nodes. I find it hard to imagine a real application
that would be adversely affected by this change, other than the kind of
application discussed in bug #13494 that is deliberately taking advantage of
the rule in order to construct non-deterministic functions.
The benefit of making this change would be to simplify the semantics of the
language and make it easier to perform strong optimizations such as
loop-lifting.
Received on Wednesday, 10 August 2011 10:20:10 UTC