Steps you can do in XSLT (was: Comments from the XSLT WG on the XProc Last Call Document)

Speaking only for myself...

Sharon Adler wrote:
> From: XSLT WG

Thank you for taking the time to read and comment on the XProc Last Call 
draft.

> 2. The "Required Steps" in the "Standard Step Library" includes a large
> number of specific tasks that typically have been addressed by XSLT:
> 
> For example:
>    - rename namespaces
>    - compare --> two documents
>    - insert --> insert data before/after some nodes
>    - delete --> a subtree
>    - rename --> a node
>    - replace --> replace a subtree with another subtree
>    - pack --> aka. merge two documents
>    - set attributes --> attributes manipulation
> 
> We note that one of the Required Steps also performs an XSLT
> Transformation.
> 
> The XSL WG has grave concerns about the duplication of functionality.  The
> WG would be extremely disappointed if this were the start of yet another
> transformation language.  We are not sure how these constructs would
> interact with XSLT.

Yes, XProc is a transformation language. It's hard to see how we could
fulfil our remit, which is to produce a language for specifying
(pipelines of) XML processing, *without* it being a transformation
language. After all, if you process XML and produce XML then you are,
in effect, *transforming* XML.

I'm not going to argue for any of the above steps in particular, but I
think it's misguided to say "you can't have a step that does X because
you can use XSLT to do X".

First, if we *did* have that rule, we'd get into constant disagreements
about what you can and can't do with XSLT, and therefore about what
XProc can and can't have in its step library. We might end up with
practically nothing, since (in the right hands) XSLT can do practically
everything.

Second, even if we had such a rule, it wouldn't stop people from
defining their own versions of steps like those above, most of which are
the kind of pre/post-processing "tweaks" that might be required before
transformation/validation. It's just that users would define them in
their own pipelines, using XSLT to implement them. This is more work for
the users (who have to write the stylesheets that implement the steps)
and leads to less efficient pipelines (a native implementation of one of
the transformations above is always going to be faster than an XSLT
implementation, whether it streams or not). Or implementers would define
implementation-specific versions, leading to interoperability problems
for users.

So, personally, I'd accept the argument "this step isn't worth having as
a standard step because it'll hardly ever be used", but I don't think we
can exclude steps from the standard library purely because they do
things that you *can* do in XSLT.

> We could understand if the tasks overlapping with XSLT functionality would
> have guaranteed streamability; however this is not the case.  For example,
> in section 7.1.18 Rename, "match" is an XSLT MatchPattern that, in the
> general case, is not streamable.

Our requirement on streamability is:

   "An XML Pipeline should allow for the existence of streaming pipelines
    in certain instances as an optional optimization." [1]

If I remember rightly, our decision was to support full expressions and
patterns, and treat the detection of streamability as an optional
optimisation.

There were three reasons for this. First, it would significantly delay
XProc if we had to define which patterns/expressions are streamable and
which aren't. Second, it would force users to learn which
patterns/expressions are usable in XProc and which aren't, raising the 
barrier to use. Third, it would mean XProc implementers would have
to specialise XPath libraries to only work on the streamable subset,
raising the barrier to implementation.
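
To make "detection as an optional optimisation" concrete, here is a minimal
sketch in Python of how an implementation might decide, per step, whether a
match pattern is simple enough to stream. The function name
`looks_streamable` and the rule it applies are my own assumptions, not
anything defined by the XProc or XSLT specifications: it conservatively
accepts only patterns built from plain element names joined by "/" or "//",
which can be decided at each start tag from the stack of ancestor names.
Anything else (predicates, functions, other axes, attributes) is treated as
"not known to be streamable".

```python
import re

# One step: a plain, optionally prefixed, element name or "*".
# Deliberately narrow: no predicates, no axes, no functions, no attributes.
_STEP = r"(?:[A-Za-z_][\w.-]*:)?(?:[A-Za-z_][\w.-]*|\*)"

# A path: optional leading "/" or "//", then steps joined by "/" or "//".
_SIMPLE_PATH = re.compile(rf"/{{0,2}}{_STEP}(?://?{_STEP})*")


def looks_streamable(pattern: str) -> bool:
    """Conservative, hypothetical check: True only for patterns that are
    unions ("|") of simple name paths. A False result means "don't know",
    so the caller falls back to a non-streaming implementation."""
    alternatives = [alt.strip() for alt in pattern.split("|")]
    return all(_SIMPLE_PATH.fullmatch(alt) for alt in alternatives)


if __name__ == "__main__":
    for p in ["para", "//section//para | title", "para[last()]"]:
        print(p, "->", looks_streamable(p))
```

The point of the sketch is that the check is optional: an implementation that
never calls it is still conformant, and users can write any pattern they like
without needing to know which subset a given processor happens to stream.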

Taking the same "Rename" example, surely it's a lot easier to detect
whether or not the <p:rename> step uses a match pattern that's
streamable than it is to detect that a given stylesheet that renames
particular nodes is streamable. And an implementation that purely does
renaming can be optimised in other ways because of its specialised
processing. The steps in the standard library do not have to be
streamable in all cases to provide a significant performance benefit
over using generic XSLT.
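
As an illustration of how small a specialised, streaming rename can be, here
is a sketch using the SAX filter classes from the Python standard library. It
is not how any XProc processor implements <p:rename>; the class
`RenameFilter` and the function `rename_elements` are hypothetical names, and
the sketch assumes the simplest possible case, where the match pattern is a
single un-namespaced element name.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class RenameFilter(XMLFilterBase):
    """Rename every element called old_name as SAX events stream past.
    No tree is built; each event is forwarded as soon as it arrives."""

    def __init__(self, parent, old_name, new_name):
        super().__init__(parent)
        self.old_name = old_name
        self.new_name = new_name

    def _map(self, name):
        return self.new_name if name == self.old_name else name

    def startElement(self, name, attrs):
        super().startElement(self._map(name), attrs)

    def endElement(self, name):
        super().endElement(self._map(name))


def rename_elements(xml_text, old_name, new_name):
    """Parse xml_text, rename matching elements, return the serialised result."""
    out = io.StringIO()
    filt = RenameFilter(xml.sax.make_parser(), old_name, new_name)
    filt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    filt.parse(io.StringIO(xml_text))
    return out.getvalue()


if __name__ == "__main__":
    print(rename_elements("<doc><p>hi</p></doc>", "p", "para"))
```

Calling `rename_elements("<doc><p>hi</p></doc>", "p", "para")` yields a
document containing `<para>hi</para>`. Because the step does nothing but
rename, the whole implementation is a name lookup per start and end tag,
which is the kind of specialisation a general-purpose stylesheet cannot
easily be reduced to.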


XProc is "yet another transformation language", but it's one with a very
different mode of operation from XSLT. And one of its main uses is going
to be for choreographing multiple XSLT transformations. I see it as a
support technology, not a competitor and certainly not a replacement.

Cheers,

Jeni

[1]: http://www.w3.org/TR/xproc-requirements/#req-streaming-pipes
-- 
Jeni Tennison
http://www.jenitennison.com

Received on Friday, 26 October 2007 19:11:26 UTC