ACTION: 2015-12-03-002 - bug 29146 - fn:transform options

> 
> ACTION: 2015-12-03-002: Mike Kay to respond to the comments made in
>         comments 4-7 in bug 29146.
> 



DL comments:

1. Don't forget to update the required type of package-version from xs:decimal to xs:string, with default "*".

Done.

2. I think initial-mode should be removed from the lists 2.d and 3.d, as I believe it is only relevant for apply-templates invocation (but correct me if I'm wrong).

Correct, done.

3. In 3.c.ii. (the call-template invocation under XSLT 3.0) I think initial-template should also be optional (not mandatory), with default "xsl:initial-template".

My first reaction was to make initial-template optional, and add the rule:

If none of initial-template, initial-mode, initial-match-selection, or initial-function is present, then call-template invocation is used with the initial template name defaulted to xsl:initial-template.

However, this is incompatible with the rule for 1.0 and 2.0 invocation, where

transform( map{ 'stylesheet-location': 'test.xsl', 'source-node': $input } )

causes apply-templates invocation. Since this is the simplest and most traditional way of invoking a transformation, I think this combination should continue to be treated as apply-templates invocation. So I have instead made the rule

The manner of invocation is determined as follows:

(a) if initial-function is present, then call-function invocation

(b) if initial-template is present, then call-template invocation

(c) if initial-match-selection or source-node is present, then apply-templates invocation

(d) if none of the above is present, then call-template invocation using the template name xsl:initial-template.
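
As a rough illustration only, rules (a) to (d) can be read as a first-match-wins decision. The sketch below models the options map as a plain Python dict; the function name invocation_kind and the tuple it returns are hypothetical and are not part of the spec text.

```python
def invocation_kind(options):
    """Return (kind, defaulted_template_name) for a transform options map.

    `options` is a plain dict standing in for the XPath map (an assumption
    made for illustration). The checks run in the order (a), (b), (c), (d).
    """
    if "initial-function" in options:                      # rule (a)
        return ("call-function", None)
    if "initial-template" in options:                      # rule (b)
        return ("call-template", None)
    if ("initial-match-selection" in options
            or "source-node" in options):                  # rule (c)
        return ("apply-templates", None)
    # rule (d): nothing relevant supplied, so default the template name
    return ("call-template", "xsl:initial-template")
```

Under this reading, the traditional call with only stylesheet-location and source-node still selects apply-templates invocation, which is the compatibility point made above.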

ABr comments:

1) What happens when "serialization" is chosen for the result and the serialized tree cannot be represented by the calling execution environment, for instance because XML 1.1 is serialized, JSON is chosen in xsl:output and contains values not representable directly as an xs:string, or an implementation-defined type is serialized (jpg, other)?

I note that there is no corresponding error for fn:serialize(). I don't think it can happen. I don't think any of the serialization methods can generate a character that's not present in the value being serialized, other than simple ASCII markup characters.

2) Should we allow users to specify additional stylesheets, i.e. for the ones referenced by xsl:import? If the stylesheet as a whole is generated dynamically, the base-stylesheet-uri option will not be sufficient.

I think we should not attempt to do this. Too complex.

3) Can we predefine names for other options of the dynamic and static context, i.e. timezone, date, collation, collection? Currently these would go into vendor-options, but standardizing these as key names for the map will make implementation-independent calls easier, even if such parts of the context are not supported (which could be an error, a default, or ignored).

The transformation APIs in common use do not provide the ability to set most of these things externally. I think we can live without this.

4) It would be nice if there's some way to distinguish the phase of the returned result in error scenarios. I.e., a dynamic error in a use-when expression (static compilation phase) should be distinguishable from a dynamic error during evaluation or priming. But I'm not sure how this could be made workable.

File this under "nice to have".

5) I'm slightly ambivalent about "stylesheet-location", since it is technically not a location.

The obvious alternative is stylesheet-uri, but then we have to go for package-uri, and it becomes unclear whether we are talking about the package name or the package location. So I prefer what we've got. Also remember we're dealing with an existing spec that's at CR status, so non-editorial changes need a very strong justification.

6) Under 2.d, "initial-mode" is repeated (it is already under 2.c)

Fixed.

7) the entry for "source-node" is under-defined. For XSLT 1.0 it must be a document node, for XSLT 2.0, it can be any node. The second part of this sentence refers only to XSLT 3.0, but this is not specified.

The XSLT 1.0 spec doesn't say that it must be a document node; it's easy to read the spec as assuming this, but there are 1.0 processors that don't impose this constraint and the spec seems to permit this. In particular, it's common practice under JAXP to supply a DOMSource that wraps the document element rather than the document node (and people are often surprised that this doesn't invoke the match="/" template rule).

I changed the explanation to read:

When <code>source-node</code> is supplied then the <code>global-context-item</code> (the context item
for evaluating global variables) is the root of the tree containing the supplied node. In addition,
for apply-templates invocation, the <code>source-node</code> acts as the <code>initial-match-selection</code>,
that is, stylesheet execution starts by applying templates to this node.
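
A minimal sketch of that wording, using a hypothetical Node class with a parent pointer in place of a real XDM node (the class and the function names root_of and contexts_for are illustrative assumptions):

```python
class Node:
    """Hypothetical stand-in for an XDM node: a name and an optional parent."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def root_of(node):
    """Walk parent links up to the root of the tree containing `node`."""
    while node.parent is not None:
        node = node.parent
    return node


def contexts_for(source_node, invocation):
    """Derive the context settings implied by supplying source-node."""
    ctx = {"global-context-item": root_of(source_node)}
    if invocation == "apply-templates":
        # Execution starts at the supplied node itself, not at its root.
        ctx["initial-match-selection"] = source_node
    return ctx
```

If the supplied node is a document element rather than the document node, the initial match selection is that element, which is why a match="/" rule is not the one invoked in the JAXP case described above.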

7a) the type for "source-node" can be "node()?", XSLT 2.0 (not sure about 1.0) allows it to be empty.

I think we should keep it as node(). If you don't want to supply a node, don't supply this option. Otherwise we have to consider whether empty means the same as absent, and it all gets more complicated.

8) typo "Theese" --> "These" (under serialization-params).

Fixed.

9) Serialization params requires a QName as key, should we specify that the xsl:output params exist in no namespace?

I have added that standard serialization params such as "method" and "indent" are supplied as QNames in no namespace.

10) If "saved" is the delivery format, is it an error if the document cannot be saved?

I have added FOXT0005 for this.

11) By the same token, the transformation may succeed, but errors may be raised after the transformation (i.e., saving the result). In such cases it may be beneficial to have a result set *and* the reported errors (as opposed to just blowing up).

I think we should stick to the general rule here: errors are fatal; though after a fatal error the state of the filestore is undefined.

12) XSLT 3.0 defaults to "xsl:initial-template" if no initial match selection or template is provided (i.e., with call-template invocation). I think this should be reflected by the options by allowing this as default if "initial-template" is not provided and no "source-node" or "initial-match-selection" is given.

Discussed above under Debbie's comments.

13) Is "function-params" required? With an initial function that takes no params, this could be optional, right?

We use the size of the function-params value to decide the arity, so I think it should be required. We could interpret an absent value as an empty sequence, but I think there's no harm asking the user to be explicit.
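
To make the arity point concrete, here is a small sketch. It assumes the options map is a Python dict and that function-params is a Python list; the function name and the use of ValueError are illustrative choices, not the error the spec would define.

```python
def initial_function_arity(options):
    """Derive the arity of the initial function from function-params.

    The value must be supplied explicitly, even for a zero-argument
    function, because its size is what selects the arity.
    """
    if "function-params" not in options:
        raise ValueError("function-params is required with initial-function")
    return len(options["function-params"])
```

A zero-arity function is then requested by passing an empty value explicitly, rather than by omitting the option.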

14) XSLT 3.0 with packages, it looks as if "package-version" is only allowed with "package-name". Is that intentional?

Yes. I guess in the other cases we could still verify that the actual package version matches the requested version, but the version plays no role in finding the package so I don't think it's much use.

15) XSLT 3.0: delivery format should probably not default to "document", since "raw" can be selected in xsl:output and otherwise that default won't work.

OK. Changing it to say "The default is document, unless the relevant xsl:output or xsl:result-document element specifies build-tree=no (applies to XSLT 3.0 only), in which case the default is raw".
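
Read as code, the revised default might look like the sketch below. The parameter names are assumptions: xslt_version is the requested version as a string, and build_tree is the build-tree value from the relevant xsl:output or xsl:result-document element, or None when it is not specified.

```python
def default_delivery_format(xslt_version, build_tree=None):
    """Return the delivery format used when the caller does not supply one."""
    # build-tree only exists in XSLT 3.0, so it is ignored for 1.0 and 2.0.
    if xslt_version == "3.0" and build_tree == "no":
        return "raw"
    return "document"
```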

16) XSLT 3.0: "static-params" is not mentioned as allowed optional entry

Fixed.

17) Determinism: I think fn:transform will by definition be non-deterministic with XSLT 3.0 and streaming. Though this is not possible with nodes in maps, it can be achieved with xsl:stream.

Agreed, and not only for streaming. We've said that it's imp-def whether the transform runs in the same execution scope; if it's a different execution scope then current-dateTime() can give different results. I've explained this.

18) Talking of streaming, perhaps we should mention this in the notes starting with "Where nodes are passed to or from the transformation"

I tried adding something along these lines, but it didn't seem to be saying anything very useful, so I scrunched it.

19) Currently, only an available node can be passed to the fn:transform function, I assume this is deliberate? If not, can we add "source-location" to the mix?

I think we have enough options available. You can call the transform with map{ 'source-node': doc('xyz.xml') } and that seems good enough to satisfy the use case.

20) A further improvement in the text structure could be to write it up as follows:

1. For invocation of an XSLT 1.0 processor, the supplied options must include all of the following and nothing else:
  a. A version of 1.0, by setting xslt-version to 1.0
  b. A source item, by setting source-node to a document node
  c. A stylesheet, by setting one of the following: [...]
  d. Other options, by providing zero or more of the following [...]

The idea being: naming each of a, b, c, d makes it a bit clearer, at least to me, what *must* be provided and what these things are.

OK, done.

PS: about determinism, if we want fn:transform to behave deterministically, we probably should say specifically what happens when repeated calls to fn:transform are made with the same arguments, but with xsl:stream / xsl:merge etc inside the stylesheet.

Likewise, we may need to say something if the global-context-item must be streamed, or the initial-match-selection must be streamed. Not an issue per se (if a supplied node is not streamed, a streaming transformation will succeed normally), but worth mentioning, I think.

I've added a bit on this.

21) base-output-uri: is optional, but we do not say what it defaults to. It makes sense to default it to stylesheet-base-uri, which is common with commandline processors dealing with relative uris in xsl:result-document.

That would cause the primary output to overwrite the stylesheet, wouldn't it? I think the default is "absent", and it's then up to the XSLT spec to say what happens when the base output URI is absent. I've added:

The effect of not supplying a base output URI is defined by the XSLT specification; the implementation
<rfc2119>may</rfc2119> supply a default, for example the directory containing the stylesheet, or the
current working directory.

22) stylesheet-base-uri, I thought/hoped we would make this relative. There can be situations where the stylesheet path is not known, but library stylesheets are located relative to the current stylesheet, i.e. "../imports". 

Yes, I think both stylesheet-base-uri and base-output-uri should be relative to the static base URI of the fn:transform call.
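
As an illustration of what "relative to the static base URI" would mean in practice, the sketch below resolves both options with standard URI resolution, using Python's urllib.parse.urljoin. The example URIs and the function name are made up for illustration.

```python
from urllib.parse import urljoin


def resolve_transform_uris(static_base_uri, options):
    """Resolve stylesheet-base-uri and base-output-uri against the
    static base URI of the calling expression, leaving other options alone."""
    resolved = dict(options)
    for key in ("stylesheet-base-uri", "base-output-uri"):
        if key in resolved:
            # Absolute URIs come back unchanged; relative ones are resolved.
            resolved[key] = urljoin(static_base_uri, resolved[key])
    return resolved
```

With a static base URI of http://example.com/queries/run.xq, a stylesheet-base-uri of "../imports/" would resolve to http://example.com/imports/, which covers the case where library stylesheets sit at a known relative path.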

23) initial-mode: we should perhaps mention it defaults to #unnamed if absent.

Done.

I will endeavour to upload an updated spec before this afternoon's meeting.

Michael Kay
Saxonica

Received on Thursday, 10 December 2015 12:06:37 UTC