ACTION 2016-06-30-001: ABr to derive a concrete set of proposed changes to the spec from the table in the attachment to public-xsl-wg/2016Jun/0005.html

I have thought about this and, while we agreed in an earlier discussion that the spec is not necessarily broken, I believe some suggestions for improvements or clarifications are in order. I therefore propose all, or some, of the following, depending on how we feel about them.

To recap: the referenced document (which can be accessed at https://lists.w3.org/Archives/Public/public-xsl-wg/2016Jun/0005.html) lists every combination of ways to invoke a stylesheet, together with the expected outcome for each: an error is raised, processing proceeds as expected, or it is unclear how processing should proceed at all.

#1 (perhaps the most apparent): the paradox caused by "MUST" in the sentence "If a construct is guaranteed-streamable then it must be processed using streaming." should be lifted.

Rationale: the input node can be an in-memory copy, can be an atomic sequence etc. In such cases, there is nothing to stream. In other cases, the input tree may be so small that a processor deems it more efficient to load it in memory and process it the normal way.

I am not suggesting that we lift the requirement to check the stylesheet for streamability; I am only saying that we should clarify or update this statement.

See Bug 29690, which was accepted and marked RESOLVED; however, I do not see the changes in the latest draft.

Suggestion for a Note or mandatory text:

"If the user invokes a guaranteed-streamable construct with a grounded or copied node, or with an atomic item, the processor MUST process that item as if the initial posture and sweep of any invocation construct were set to *grounded* and *motionless*. This may have the effect that a stylesheet that would otherwise not be guaranteed-streamable becomes guaranteed-streamable."

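As a rough illustration of the intent of this wording (not proposed spec text), the rule can be sketched in Python. The function name and the string labels for the input kinds are made up for the sketch; only the terms grounded and motionless come from the suggestion above.

```python
from typing import Optional, Tuple

# Hypothetical labels for what the caller supplies; not spec terminology.
NON_STREAMED_KINDS = {"grounded-node", "copied-node", "atomic"}


def forced_posture_and_sweep(input_kind: str) -> Optional[Tuple[str, str]]:
    """Return the (posture, sweep) a processor would assume for the
    invocation construct, or None when the normal analysis applies."""
    if input_kind in NON_STREAMED_KINDS:
        # Nothing to stream: treat the input as grounded and motionless.
        return ("grounded", "motionless")
    if input_kind == "streamed-node":
        # A genuinely streamed node: the suggested rule does not apply.
        return None
    raise ValueError(f"unknown input kind: {input_kind!r}")
```

With these labels, forced_posture_and_sweep("atomic") yields the grounded/motionless pair, while a streamed node is left to the ordinary streamability analysis.
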
#2 Explain that a function, which is an invocation construct, can be called with an initially streamed node. 

Rationale: we say this for other invocation constructs, but not for a function.

#3 Clarify that the initial match selection, unlike the input to xsl:stream, can be a sequence of streamed, or mixed, nodes.

Rationale: I think we should say something, somewhere, about the initial match selection. While this is an API matter, it should be made clear that we allow the input sequence to be a sequence of streamed (document) nodes, or a mixed sequence (mixed with atomics or grounded/copied nodes).

#4 Clarify that we do not prevent the processor from providing an in-memory (or: grounded) copy of a node as input. This applies to both the initial match selection (IMS) and xsl:stream.

Rationale: even with xsl:stream, we are vague about what exactly streaming is (the buffer can be large enough to hold the whole document). Moreover, an API design may allow passing either a streamed or a non-streamed XDM instance. We should therefore make it very clear, perhaps even making it "implementation-defined" (that is, a processor MUST document how it does this), that processors can specify how a node is passed to the stylesheet, and that this does NOT mean that the node is necessarily streamed.

#5 Say something about (large) atomics, and that the XSL REC does not specify rules on how these are streamed, if at all.

Rationale: I couldn't find this, but I thought we had it in an earlier draft. It may be beneficial to add a Note that says:

"This document does not provide rules for streaming atomics, for instance streaming the result of a call to fn:unparsed-text-lines. Even though that function was added to allow streaming of external resources that are not XML documents, from the point of view of this specification, such documents are stable and any atomic sequence, however long, is considered grounded and motionless by definition. This does not prevent processors from streaming such input, however."

#6 On fn:unparsed-text-lines and fn:unparsed-text, lift the restriction that F&O places on these functions by allowing them to be non-deterministic (similar to fn:doc, where that restriction can be lifted at user option).

#7 Add a Note that explains that the specification leaves it entirely up to the processor what happens when a streamed node is given as input to a non-streaming construct. Allow the processor to raise an error in such a case.

Suggestion:

"If the input is a streamed node but the initial invocation construct is not guaranteed-streamable, then the processor MAY attempt processing using streaming, in the same way that it MAY attempt streaming of constructs that are not guaranteed-streamable. The processor MAY also raise an error, or MAY buffer the entire stream and process it in a non-streaming way."

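To show how I read this suggestion, here is a minimal Python sketch of the choices it leaves to the processor. The function name and the action labels are invented for illustration and are not proposed terminology; combinations other than "streamed input, construct not guaranteed-streamable" are deliberately left out of scope.

```python
from typing import Optional, Set


def permitted_actions(input_is_streamed: bool,
                      guaranteed_streamable: bool) -> Optional[Set[str]]:
    """Return the actions the suggested Note would allow, or None when
    the combination is not covered by this suggestion."""
    if input_is_streamed and not guaranteed_streamable:
        # All three are MAY options; the processor picks one.
        return {"attempt-streaming", "raise-error", "buffer-and-process"}
    # Hypothetical convention of this sketch: None means "not addressed here".
    return None
```

A processor that never streams would, under this reading, always pick either raise-error or buffer-and-process.
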
#8 Clarify that non-streaming processors may optionally process a streamed node by buffering it.

#9 Add a small table that lists the invocation constructs we currently have. A smaller version of the table in message 0005 of the June 2016 archive may help untangle the text that currently describes the myriad of required/supported invocation methods.

--------------------

I think that is all :). Not all of the above is equally important, of course, but this is what I could deduce from the message and the document, and from the discussion that followed.

Cheers,
Abel

Received on Thursday, 1 September 2016 15:08:42 UTC