Let's write extension functions in XSLT itself from Richard A. O'Keefe on 2001-07-20 (xsl-editors@w3.org from July to September 2001)

From: Richard A. O'Keefe <ok@atlas.otago.ac.nz>
Date: Fri, 20 Jul 2001 07:03:22 +0200
To: xsl-editors@w3.org
Message-Id: <200107190505.RAA359653@atlas.otago.ac.nz>
I downloaded the 12-Dec-2000 draft of XSLT 1.1 recently.
I notice that there is to be a standard way of defining functions
that can be called in patterns and expressions, provided those
definitions are in JavaScript or Java.  That's really really odd.
You can write part of a style sheet in a language which the processor
might NOT support, but you can't write it in a language which
the processor MUST support.

I enclose a proposal for defining extension functions in XSLT itself.
It would be absurd to call out to Javascript (non-portably!) just
because some calculation can't be expressed in the expression language
for want of an 'if' construct.

---------------- cut here ----------------
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 3.2//EN">
<!-- File   : xslt-funs.htm
     Author : Richard A. O'Keefe
     Updated: 2001.07.19
     Purpose: Propose a way to define extension functions in XSLT itself.
-->
<HTML>
<HEAD>
<TITLE>Defining Extension Functions in XSLT Itself</TITLE>
</HEAD>
<BODY>
<H1>Defining Extension Functions in XSLT Itself</H1>
<P>Looking at the 12 December 2000 version of the
XSL Transformations (XSLT) Version 1.1 specification,
we see that there is to be a standard way of defining functions
that can be called in patterns and expressions, provided those
definitions are in JavaScript or Java.</P>
<P>This is a very odd state of affairs.  You can write part of a style
sheet in a language which the processor may <EM>not</EM> support.  But
you can't write it in a language which the processor <EM>must</EM>
support.</P>
<P>Experience with XSLT has already shown that the inability to define
functions in the expression language can result in code duplication.
For example, you cannot say "DIV1, ..., DIV4 are processed
identically except for using H2, ..., H5 respectively".</P>
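<P>A sketch of the kind of duplication meant, in plain XSLT 1.0.  The
<CODE>title</CODE> attribute is a hypothetical stand-in for wherever a
division keeps its heading; it is not taken from any particular DTD.</P>
<PRE>
&lt;!-- Two templates that differ only in the heading element they emit. --&gt;
&lt;xsl:template match="div1"&gt;
  &lt;h2&gt;&lt;xsl:value-of select="@title"/&gt;&lt;/h2&gt;
  &lt;xsl:apply-templates/&gt;
&lt;/xsl:template&gt;
&lt;xsl:template match="div2"&gt;
  &lt;h3&gt;&lt;xsl:value-of select="@title"/&gt;&lt;/h3&gt;
  &lt;xsl:apply-templates/&gt;
&lt;/xsl:template&gt;
&lt;!-- ...and the same again for div3 (h4) and div4 (h5). --&gt;
</PRE>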
<H2>What we can do about it</H2>
<P>I propose a single new top level element in the XSLT namespace,
with three new elements that can appear in it.  First I present the
specification using DTD syntax, then I explain it.</P>
<PRE>
&lt;!ENTITY % type "(Boolean|number|string|node-set|object|any)"&gt;
&lt;!ELEMENT xsl:function (xsl:required*,((xsl:case+,xsl:else)|xsl:value-of))&gt;
&lt;!ATTLIST xsl:function
  name NMTOKEN #REQUIRED
  type %type;  #REQUIRED&gt;
&lt;!ELEMENT xsl:required EMPTY&gt;
&lt;!ATTLIST xsl:required
  name NMTOKEN #REQUIRED
  type %type;  #REQUIRED&gt;
&lt;!ELEMENT xsl:case EMPTY&gt;
&lt;!ATTLIST xsl:case
  test   CDATA #REQUIRED
  select CDATA #REQUIRED&gt;
&lt;!ELEMENT xsl:else EMPTY&gt;
&lt;!ATTLIST xsl:else
  select CDATA #REQUIRED&gt;
</PRE>
<DL>
<DT><CODE>xsl:function</CODE></DT>
<DD>defines an extension function that can
be used in the expression language.  It has a <CODE>name</CODE>, which
is a QName.  If the name has a prefix, that prefix must have been declared
as an extension prefix.  If the name does not have a prefix,
<CODE>#default</CODE> must have been declared as an extension prefix.
The result type says which of the XSLT types the function is supposed to
return.</DD>
<DT><CODE>xsl:required</CODE></DT>
<DD>defines a required formal parameter of the function.  The
association between formal parameters in a definition and actual
parameters in a call is by position.  In this draft, there are no
optional parameters; the number of parameters in a call must match the
number of parameters declared.  Each parameter has a <CODE>name</CODE>
and a <CODE>type</CODE>; each actual parameter is converted to the
required type the same way that arguments of built-in functions in the
expression language are converted to their expected types.  The
interpretation and scope
of argument names is the same as the interpretation and scope of
<CODE>xsl:param</CODE> names; however there are no optional or "keyword"
parameters for XSLT-in-XSLT extension functions.</DD>
<DT><CODE>xsl:case</CODE></DT>
<DD>is a construct we wouldn't need if only the designers of XPath had
included an "<CODE>if</CODE>" construction.  Each <CODE>xsl:case</CODE>
element is tested in document order until a <CODE>test</CODE> expression
comes out true, and then the value of the corresponding <CODE>select</CODE>
expression is the result of the function.</DD>
<DT><CODE>xsl:else</CODE></DT>
<DD>must appear at the end of a conditional construction; its
<CODE>select</CODE> attribute specifies what result to return if none of
the <CODE>xsl:case</CODE> tests succeeds.</DD>
<DT><CODE>xsl:value-of</CODE></DT>
<DD>is used when the body of the function is unconditional.</DD>
</DL>
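<P>A minimal sketch of the unconditional form, since the examples that
follow all use <CODE>xsl:case</CODE>.  The function name
<CODE>my:square</CODE> is hypothetical, and the <CODE>my</CODE> prefix is
assumed to have been declared as an extension prefix.</P>
<PRE>
&lt;!-- No tests are needed, so the body is a single xsl:value-of. --&gt;
&lt;xsl:function name="my:square" type="number"&gt;
  &lt;xsl:required name="x" type="number"/&gt;
  &lt;xsl:value-of select="$x * $x"/&gt;
&lt;/xsl:function&gt;
</PRE>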
<H2>Some examples</H2>
<P>Absolute value</P>
<PRE>
&lt;xsl:function name="my:abs" type="number"&gt;
  &lt;xsl:required name="x" type="number"/&gt;
  &lt;xsl:case test="$x &amp;lt; 0" select="-$x"/&gt;
  &lt;xsl:else select="$x"/&gt;
&lt;/xsl:function&gt;
</PRE>
<P>Converting DIV to H</P>
<PRE>
&lt;xsl:function name="my:div-to-h" type="string"&gt;
  &lt;xsl:required name="node" type="node-set"/&gt;
  &lt;xsl:case test="$node[name()='div1']" select="'h2'"/&gt;
  &lt;xsl:case test="$node[name()='div2']" select="'h3'"/&gt;
  &lt;xsl:case test="$node[name()='div3']" select="'h4'"/&gt;
  &lt;xsl:case test="$node[name()='div4']" select="'h5'"/&gt;
  &lt;xsl:else select="'h6'"/&gt;
&lt;/xsl:function&gt;
</PRE>
<P>Computing a label from a node</P>
<PRE>
&lt;xsl:function name="my:label" type="string"&gt;
  &lt;xsl:required name="node" type="node-set"/&gt;
  &lt;xsl:case test="$node/@id" select="concat('diff-',$node/@id)"/&gt;
  &lt;xsl:else select="concat('diff-',generate-id($node))"/&gt;
&lt;/xsl:function&gt;
</PRE>
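<P>Calling such a function</P>
<P>A sketch of how <CODE>my:div-to-h</CODE> might be called from a template
through an attribute value template, so that one template serves all four
division levels.  It assumes the <CODE>my</CODE> prefix is bound to a
namespace and declared as an extension prefix in the style sheet; the
<CODE>title</CODE> attribute is a hypothetical source for the heading
text.</P>
<PRE>
&lt;xsl:template match="div1|div2|div3|div4"&gt;
  &lt;!-- The element name is computed by the function: h2 ... h5. --&gt;
  &lt;xsl:element name="{my:div-to-h(.)}"&gt;
    &lt;xsl:value-of select="@title"/&gt;
  &lt;/xsl:element&gt;
  &lt;xsl:apply-templates/&gt;
&lt;/xsl:template&gt;
</PRE>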
<H2>Discussion</H2>
<P>This is basically XML syntax for wrapping "if" around a bunch of
expressions.  It does not appear to require any fundamentally new
constructions in an XSLT processor; an entire definition should be
translatable into the internal form used for XSLT expressions.</P>
<P>There was a note at the end of XSLT 1.0 that conditional expressions
should be considered for a future version of XSLT.  If that were done,
then these function definitions could be simplified by leaving out
<CODE>xsl:case</CODE> and <CODE>xsl:else</CODE>.</P>
<P>Having a type in the <CODE>xsl:function</CODE> element means that
even should the different cases have different "natural" types, they
will be converted to a common form, unless the function is explicitly
typed as "<CODE>any</CODE>".</P>
<P>The <CODE>data-type</CODE> attribute used in sorting has a sufficiently
different range of options that it did not seem wise to call the
<CODE>type</CODE> attribute used here <CODE>data-type</CODE>.</P>
<P>This feature would mean that style sheets could be more portable.
There are very strong reasons <EM>not</EM> to support Java; measurements
with existing free XSLT processors show Java versions running 2 to 5
times slower than a C version, and taking a great deal of memory for
what seem like simple tasks.  It would be as inappropriate for XSLT
in practice to <EM>demand</EM> Java support as it would be for XSLT
to <EM>forbid</EM> Java support.  It would be especially regrettable
if XSLT were to encourage people to write non-portable scripts simply
in order to perform calculations that should be within its own scope.</P>
<P>Whether <CODE>xsl:value-of</CODE> and <CODE>xsl:else</CODE> should be
distinct elements is arguable.</P>
<P>Optional parameters are not in the present design, just to keep it simple.
Another alternative would have</P>
<PRE>
&lt;!ELEMENT xsl:function (xsl:required*,xsl:optional*,
                        ((xsl:case+,xsl:else) | xsl:value-of))&gt;
&lt;!ATTLIST xsl:optional
  name   NMTOKEN #REQUIRED
  type   %type;  #REQUIRED
  select CDATA   #REQUIRED&gt;
</PRE>
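<P>A sketch of how that alternative might look in use.  It assumes, by
analogy with <CODE>xsl:param</CODE>, that the <CODE>select</CODE>
attribute of <CODE>xsl:optional</CODE> supplies the value used when the
caller omits the argument; the parameter name <CODE>prefix</CODE> is
hypothetical.</P>
<PRE>
&lt;xsl:function name="my:label" type="string"&gt;
  &lt;xsl:required name="node" type="node-set"/&gt;
  &lt;!-- Used as 'diff-' unless the call passes a second argument. --&gt;
  &lt;xsl:optional name="prefix" type="string" select="'diff-'"/&gt;
  &lt;xsl:case test="$node/@id" select="concat($prefix,$node/@id)"/&gt;
  &lt;xsl:else select="concat($prefix,generate-id($node))"/&gt;
&lt;/xsl:function&gt;
</PRE>
<P>Under that reading, <CODE>my:label(.)</CODE> and
<CODE>my:label(., 'diff-')</CODE> would give the same result.</P>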
<H2>The End</H2>
<P>If you want to discuss this, my e-mail address is
<A HREF="mailto:ok@cs.otago.ac.nz">ok@cs.otago.ac.nz</A>.</P>
</BODY>
</HTML>
---------------- cut here ----------------
Received on Friday, 20 July 2001 01:03:58 UTC