[Bug 25185] Usage absorption can take crawling expressions when TDU derives from xs:anyAtomicType

https://www.w3.org/Bugs/Public/show_bug.cgi?id=25185

--- Comment #7 from Michael Kay <mike@saxonica.com> ---
PROPOSAL
========

The proposal is that we distinguish atomization from other absorption
operations. For atomization, we permit the operand to be crawling.

Specifically, we introduce a fifth kind of operand usage, called atomization.
It differs from absorption in one respect: in the general streamability rules,
in the table in 1.b.iii.B, the entry for "Atomization/Crawling" is "Consuming"
rather than "Free-Ranging".

This operand usage would apply whenever the semantics of the operation invoke
atomization. For example: function calls where the required type is atomic; the
data() function; AVTs; the select attribute of xsl:value-of. It would also
apply to the small number of operations that get the string value of a node,
for example string() and string-length(). In fact, it would apply to most cases
where we currently use usage="absorption", with the exception of constructs
like xsl:for-each and xsl:apply-templates and xsl:iterate where the processing
of descendant elements is defined by user-written code rather than built-in
code.

I'm then inclined to rename the existing usage=absorption as usage=consumption,
to preserve distinct one-letter abbreviations for usages (atomization and
absorption would otherwise both abbreviate to A), and because there's a clear
link between usage=consumption and sweep=consuming.

A typical implementation will work as follows: when it encounters the start tag
of a selected node, it opens a buffer for the string value of that node, and
adds this buffer to the end of a queue. When it encounters a text node, it
copies the value to all currently open string-value buffers. When it encounters
the end tag for a selected node, it computes the atomized value of that node
and seals the buffer; it then delivers (and dequeues) the atomized values of
all buffers that are sealed and that are not queued behind one that is still
open. The number of open buffers on the queue is determined by the depth of
nesting of selected nodes in the crawling sequence, which in the vast majority
of practical cases will be one; if there are no nested nodes in the crawling
sequence, then each atomic value will be delivered as soon as the end tag for
the corresponding node is encountered.
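
The buffering scheme just described can be sketched roughly as follows. This
is only an illustration, not code from any actual processor: the event tuples,
the Buffer class, the atomize_crawling function and the is_selected predicate
are all hypothetical names, and the sketch assumes untyped nodes, so that the
atomized value of a node is simply its string value.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Buffer:
    """String-value buffer for one selected node (hypothetical helper)."""
    parts: list = field(default_factory=list)
    sealed: bool = False


def atomize_crawling(events, is_selected):
    """Yield the string values of selected elements in document order.

    events: iterable of ("start", name), ("text", value), ("end", name).
    is_selected: assumed predicate on the element name.
    """
    queue = deque()  # undelivered buffers, in order of their start tags
    stack = []       # one entry per open element: its Buffer, or None

    for kind, value in events:
        if kind == "start":
            buf = Buffer() if is_selected(value) else None
            if buf is not None:
                queue.append(buf)
            stack.append(buf)
        elif kind == "text":
            # Copy the text to every buffer that is still open.
            for buf in stack:
                if buf is not None:
                    buf.parts.append(value)
        elif kind == "end":
            buf = stack.pop()
            if buf is not None:
                buf.sealed = True
            # Deliver sealed buffers not queued behind an open one.
            while queue and queue[0].sealed:
                yield "".join(queue.popleft().parts)


# <a><p>x<p>y</p>z</p><p>w</p></a>, selecting every p element
events = [
    ("start", "a"),
    ("start", "p"), ("text", "x"),
    ("start", "p"), ("text", "y"), ("end", "p"),
    ("text", "z"), ("end", "p"),
    ("start", "p"), ("text", "w"), ("end", "p"),
    ("end", "a"),
]
print(list(atomize_crawling(events, lambda name: name == "p")))
# prints ['xyz', 'y', 'w']
```

In this example the inner p is sealed first, but its value is held back until
the enclosing p is sealed, so that results are delivered in document order. The
third p is not nested, so its value is delivered as soon as its end tag is
seen.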

We could extend the same mechanism to all absorption operations on crawling
sequences (for example, xsl:apply-templates and xsl:for-each). I don't propose
doing this, for two reasons: (a) the amount of data in each buffer is unbounded
(as it depends on user code), and (b) with operations like apply-templates, as
distinct from atomization, it is much more likely that the result of the
crawling expression will actually contain nested nodes.

Received on Friday, 16 May 2014 10:04:43 UTC