XSLT 2.0: Grouping proposal

Hi,

The XSLT 2.0 WD does a great job of describing the various types of
grouping that might be required, and I really like the flexibility
that you get with the current-group() function. However, the current
design of xsl:for-each-group reminds me of the old method of sorting
(as immortalised in MSXML), where you had an order-by attribute on
xsl:for-each.

Taking my lead from the way sorting was eventually handled, I'd like
to propose an alternative: using an xsl:group element in a similar
way to the way xsl:sort is used, within either xsl:for-each or
xsl:apply-templates.

I think this alternative would be easier to learn because of its
similarity to xsl:sort, and more flexible because it can be used with
xsl:apply-templates and because it means you can do several levels of
grouping at the same time (though admittedly the latter is not as
important as it is with xsl:sort).

The details aren't particularly important right now - it's grouping
with something like xsl:sort rather than like xsl:for-each that's the
main point - but to give a better idea of where I'm coming from...

Any xsl:for-each or xsl:apply-templates would contain zero or more
xsl:group elements followed by zero or more xsl:sort elements (the
xsl:sort elements control the order in which the groups are processed,
not their items).
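
To make the ordering point concrete, here is a rough model in Python
rather than XSLT. It is only a sketch: the function name
group_then_sort is made up for illustration, and it assumes that each
sort key is evaluated against the first item of its group, which the
proposal above does not spell out.

```python
def group_then_sort(items, group_key, sort_key):
    """Group items, then order the groups (not the items inside them)."""
    groups = {}  # dicts keep insertion order, so groups start in first-seen order
    for item in items:
        groups.setdefault(group_key(item), []).append(item)
    # Assumption: the sort key is computed from the first item of each group.
    return sorted(groups.values(), key=lambda group: sort_key(group[0]))


def first_letter(name):
    return name[0]


cities = ["Paris", "Milan", "Lyon", "Madrid", "Perth", "Lima"]
result = group_then_sort(cities, first_letter, first_letter)
print(result)
```

The groups come out in the order L, M, P, but within each group the
items stay in document order (Lyon before Lima, Milan before Madrid).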

The xsl:group element would be empty, with one of two sets of
attributes (though perhaps it would be better to have two distinct
elements instead):

  - select - expression - used as the grouping key for the items
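  - collate - yes/no - whether items with the same grouping key are
    gathered into one group wherever they occur ('yes', the default,
    giving the same effect as 'group-by' in xsl:for-each-group) or
    only when they are adjacent, so that each item's key is compared
    just with the preceding item's ('no')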
  - data-type \
  - lang       > These attributes as in xsl:sort
  - collation /

  - test - used to identify the first or last item in each group
  - break - before/after - used to indicate whether the group boundary
    falls before or after the item that tests true (defaults to
    'before', so the items that test true begin the group)

(I think a 'test' (boolean expression) is more general than a 'match'
(pattern); in particular it means that you could group sequences of
simple-typed values in the same way.)
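
The two attribute sets can be modelled roughly as follows. This is a
Python sketch of the intended semantics, not an implementation of
anything in the WD; the names group_by_key and group_by_test and the
break_at parameter are hypothetical stand-ins for the select/collate
and test/break attributes.

```python
def group_by_key(items, key, collate=True):
    """Model of select + collate."""
    if collate:
        # collate='yes': every item with the same key joins one group.
        groups = {}
        for item in items:
            groups.setdefault(key(item), []).append(item)
        return list(groups.values())
    # collate='no': compare each key only with the preceding item's key.
    groups, last_key = [], None
    for item in items:
        k = key(item)
        if groups and k == last_key:
            groups[-1].append(item)
        else:
            groups.append([item])
        last_key = k
    return groups


def group_by_test(items, test, break_at="before"):
    """Model of test + break."""
    groups, start_new = [], True
    for item in items:
        hit = test(item)
        if start_new or (break_at == "before" and hit):
            groups.append([item])    # this item begins a new group
        else:
            groups[-1].append(item)  # this item continues the current group
        # With break='after', the item that tests true closes its group.
        start_new = break_at == "after" and hit
    return groups


def is_list(name):
    return name in ("ul", "ol")


children = ["text", "ul", "ol", "text", "ul"]
adjacent = group_by_key(children, is_list, collate=False)
collated = group_by_key(children, is_list)

body = ["h2:A", "p:1", "p:2", "h2:B", "p:3"]
sections = group_by_test(body, lambda s: s.startswith("h2"))
sentences = group_by_test(["a", "b;", "c", "d;"],
                          lambda s: s.endswith(";"), break_at="after")
```

Because the test is an arbitrary boolean function here, the same code
groups plain strings as easily as nodes, which is the point about
sequences of simple-typed values.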

Some examples (as in the WD):

First example:

  <table>
    <tr>
      <th>Position</th>
      <th>Country</th>
      <th>City List</th>
      <th>Population</th>
    </tr>
    <xsl:for-each select="cities/city">
      <xsl:group select="@country" />
      <tr>
        <td><xsl:value-of select="position()"/></td>
        <td><xsl:value-of select="@country"/></td>
        <td>
          <xsl:value-of select="current-group()/@name" separator="," />
        </td>
        <td><xsl:value-of select="sum(current-group()/@pop)" /></td>
      </tr>
    </xsl:for-each>
  </table>

or:

  <xsl:template match="cities">
    <table>
      <tr>
        <th>Position</th>
        <th>Country</th>
        <th>City List</th>
        <th>Population</th>
      </tr>
      <xsl:apply-templates select="city">
        <xsl:group select="@country" />
      </xsl:apply-templates>
    </table>
  </xsl:template>

  <xsl:template match="city">
    <tr>
      <td><xsl:value-of select="position()"/></td>
      <td><xsl:value-of select="@country"/></td>
      <td>
        <xsl:value-of select="current-group()/@name" separator="," />
      </td>
      <td><xsl:value-of select="sum(current-group()/@pop)" /></td>
    </tr>
  </xsl:template>

Second example:

  <xsl:for-each select="cities/city">
    <xsl:group select="substring(@name,1,1)" />
    <xsl:sort select="substring(@name,1,1)"/>
    <h2>
      <xsl:value-of select="upper-case(substring(@name,1,1))"/>
      <xsl:text> (</xsl:text>
      <xsl:value-of select="count(current-group())"/>
      <xsl:text>)</xsl:text>
    </h2>
    <xsl:for-each select="current-group()">
      <p><xsl:value-of select="@name"/></p>
    </xsl:for-each>
  </xsl:for-each>

(and again you could do it through xsl:apply-templates)

Third example:

  <xsl:template match="body">
    <chapter>
      <xsl:apply-templates select="*">
        <xsl:group test="self::h2" />
      </xsl:apply-templates>
    </chapter>
  </xsl:template>

  <xsl:template match="h2">
    <section title="{.}">
      <xsl:apply-templates select="sublist(current-group(), 2)" />
    </section>
  </xsl:template>

  <xsl:template match="p">
    <para><xsl:value-of select="." /></para>
  </xsl:template>

Fourth example:
  
  <xsl:template match="p">
    <xsl:apply-templates>
      <xsl:group select="self::ul or self::ol" collate="no" />
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="ul | ol">
    <xsl:copy-of select="current-group()" />
  </xsl:template>

  <xsl:template match="node()">
    <p><xsl:copy-of select="current-group()" /></p>
  </xsl:template>

We commonly exhort people not to use xsl:for-each, especially with
document-oriented XML structures; I think the last two examples in
particular illustrate how much more in keeping with the template
style of stylesheet design an xsl:group element would be than
xsl:for-each-group.
  
Cheers,

Jeni
---
Jeni Tennison
http://www.jenitennison.com/

Received on Sunday, 23 December 2001 07:40:11 UTC