[Bug 5630] Tuples and maps

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5630





------- Comment #2 from vladimir@nesterovsky-bros.com  2008-04-07 09:57 -------
1. Tuples and maps supplement sequences and allow set-based operations to be
expressed in a more concise and natural way.

2. In principle, tuples can be emulated either with an XML tree or with a
sequence that uses "terminator" items to separate subsequences.

Both approaches share a weakness: 
neither clearly expresses the algorithm's intention, 
which is to operate over a collection of sequences.

Moreover, tuples that are part of the type system can be implemented 
more efficiently than tuple emulations.

3. To some extent, maps can be emulated with an XML tree and the 
key() function (in XSLT). However, a map is richer: it operates 
not only over XML nodes (as key() does), but over other types of items too.

A map can be used to group a sequence by some criterion, and to operate on
this grouping at a later stage.

A map can be used as a state bag, and may achieve the same results in XQuery
as tunnel parameters do in XSLT.

4. Map use case.

Suppose you want to group items by some condition and 
allow further processing over these groups.

Maps allow you to solve this task as in the example below:

<xsl:template match="/">
  <root>
    <xsl:variable name="cities" as="element()*">
      <city name="Jerusalem" country="Israel"/>
      <city name="London" country="Great Britain"/>
      <city name="Paris" country="France"/>
      <city name="New York" country="USA"/>
      <city name="Brasilia" country="Brazil"/>
      <city name="Moscow" country="Russia"/>
      <city name="Tel Aviv" country="Israel"/>
      <city name="St. Petersburg" country="Russia"/>
    </xsl:variable>

    <!-- This constructs a map of pairs (country, city).  -->
    <xsl:variable name="map" as="map()" select="
      f:map
      (
        for $city in $cities return
          ($city/string(@country),  $city)
      )"/>

    <!-- ... Some processing that leads to a call of f:process($map) ... -->

  </root>
</xsl:template>

<xsl:function name="f:process">
  <xsl:param name="map" as="map()"/>

  <xsl:for-each select="f:map-keys($map)">
    <xsl:variable name="key" as="xs:string" select="."/>

    <country name="{$key}">
      <xsl:sequence select="f:map-value($map, $key)"/>
    </country>
  </xsl:for-each>
</xsl:function>

At present, the only acceptable way to solve this task is to construct a
temporary tree; however, this does not preserve the identity of the items.
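
As a minimal sketch of the temporary-tree workaround mentioned above, the
stylesheet below groups a shortened city list with xsl:for-each-group. The
same-node element and the use of xsl:copy-of are illustrative assumptions,
not part of the original example. The "is" comparison is expected to yield
false, because the grouped city is a copy rather than the original node:

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <xsl:variable name="cities" as="element()*">
      <city name="Jerusalem" country="Israel"/>
      <city name="Tel Aviv" country="Israel"/>
      <city name="Moscow" country="Russia"/>
    </xsl:variable>

    <!-- Temporary tree: one country element per group, holding copies. -->
    <xsl:variable name="groups" as="element()*">
      <xsl:for-each-group select="$cities" group-by="string(@country)">
        <country name="{current-grouping-key()}">
          <xsl:copy-of select="current-group()"/>
        </country>
      </xsl:for-each-group>
    </xsl:variable>

    <root>
      <!-- Compares the first grouped city with the first original city. -->
      <same-node>
        <xsl:value-of select="$groups[1]/city[1] is $cities[1]"/>
      </same-node>
      <xsl:sequence select="$groups"/>
    </root>
  </xsl:template>

</xsl:stylesheet>

With a map type as proposed here, the grouped values would presumably be the
original items themselves, so later processing could still navigate from
them to their source documents.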

5. Tuple use case. "Java bean formatter".

After building a method's parameters, one needs to format them 
one way (compact) or the other (verbose), depending on a decision 
that can be made only when all parameters are already built 
(e.g. depending on the number of parameters).

At present, one option is to use a "terminator" to separate subsequences:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:t="http://www.nesterovsky-bros.com"
  exclude-result-prefixes="xs t">

  <xsl:output method="xml" indent="yes"/>

  <!-- Terminator token. -->
  <xsl:variable name="t:terminator" as="xs:QName"
    select="xs:QName('t:terminator')"/>

  <!-- New line. -->
  <xsl:variable name="t:crlf" as="xs:string" select="'&#10;'"/>

  <xsl:template match="/">
    <!--
      We need to manipulate a sequence of sequences of tokens.
      To do this we use $t:terminator to separate sequences.
    -->
    <xsl:variable name="short-items" as="item()*">
      <xsl:sequence select="t:get-param('int', 'a')"/>
      <xsl:sequence select="$t:terminator"/>

      <xsl:sequence select="t:get-param('int', 'b')"/>
      <xsl:sequence select="$t:terminator"/>

      <xsl:sequence select="t:get-param('int', 'c')"/>
      <xsl:sequence select="$t:terminator"/>
    </xsl:variable>

    <xsl:variable name="long-items" as="item()*">
      <xsl:sequence select="t:get-param('int', 'a')"/>
      <xsl:sequence select="$t:terminator"/>

      <xsl:sequence select="t:get-param('int', 'b')"/>
      <xsl:sequence select="$t:terminator"/>

      <xsl:sequence select="t:get-param('int', 'c')"/>
      <xsl:sequence select="$t:terminator"/>

      <xsl:sequence select="t:get-param('int', 'd')"/>
      <xsl:sequence select="$t:terminator"/>
    </xsl:variable>

    <result>
      <short>
        <xsl:value-of select="t:format($short-items)" separator=""/>
      </short>
      <long>
        <xsl:value-of select="t:format($long-items)" separator=""/>
      </long>
    </result>
  </xsl:template>

  <!--
    Returns a sequence of tokens that defines a parameter.
      $type - parameter type.
      $name - parameter name.
      Returns sequence of parameter tokens.
  -->
  <xsl:function name="t:get-param" as="item()*">
    <xsl:param name="type" as="xs:string"/>
    <xsl:param name="name" as="xs:string"/>

    <xsl:sequence select="$type"/>
    <xsl:sequence select="' '"/>
    <xsl:sequence select="$name"/>
  </xsl:function>

  <!--
    Format a sequence of sequences of tokens separated with $t:terminator.
      $tokens - sequence of sequences of tokens to format.
      Returns formatted sequence of tokens.
  -->
  <xsl:function name="t:format" as="item()*">
    <xsl:param name="tokens" as="item()*"/>

    <xsl:variable name="terminators" as="xs:integer+"
      select="0, index-of($tokens, $t:terminator)"/>
    <xsl:variable name="count" as="xs:integer"
      select="count($terminators) - 1"/>
    <xsl:variable name="verbose" as="xs:boolean"
      select="$count > 3"/>

    <xsl:sequence select="
      for $i in 1 to $count return
      (
        subsequence
        (
          $tokens,
          $terminators[$i] + 1,
          $terminators[$i + 1] - $terminators[$i] - 1
        ),
        if ($i = $count) then () 
        else
        (
          ',',
          if ($verbose) then $t:crlf else ' '
        )
      )"/>
  </xsl:function>

</xsl:stylesheet>

If we allow a tuple() type, this task can be solved as:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:t="http://www.nesterovsky-bros.com"
  exclude-result-prefixes="xs t">

  <xsl:output method="xml" indent="yes"/>

  <!-- New line. -->
  <xsl:variable name="t:crlf" as="xs:string" select="'&#10;'"/>

  <xsl:template match="/">
    <!--
      Use a sequence of tuples.
    -->
    <xsl:variable name="short-items" as="tuple()*">
      <xsl:sequence select="tuple(t:get-param('int', 'a'))"/>
      <xsl:sequence select="tuple(t:get-param('int', 'b'))"/>
      <xsl:sequence select="tuple(t:get-param('int', 'c'))"/>
    </xsl:variable>

    <xsl:variable name="long-items" as="tuple()*">
      <xsl:sequence select="tuple(t:get-param('int', 'a'))"/>
      <xsl:sequence select="tuple(t:get-param('int', 'b'))"/>
      <xsl:sequence select="tuple(t:get-param('int', 'c'))"/>
      <xsl:sequence select="tuple(t:get-param('int', 'd'))"/>
    </xsl:variable>

    <result>
      <short>
        <xsl:value-of select="t:format($short-items)" separator=""/>
      </short>
      <long>
        <xsl:value-of select="t:format($long-items)" separator=""/>
      </long>
    </result>
  </xsl:template>

  <!--
    Returns a sequence of tokens that defines a parameter.
      $type - parameter type.
      $name - parameter name.
      Returns sequence of parameter tokens.
  -->
  <xsl:function name="t:get-param" as="item()*">
    <xsl:param name="type" as="xs:string"/>
    <xsl:param name="name" as="xs:string"/>

    <xsl:sequence select="$type"/>
    <xsl:sequence select="' '"/>
    <xsl:sequence select="$name"/>
  </xsl:function>

  <!--
    Format a sequence of tuples of tokens.
      $tuples - sequence of tuples of tokens to format.
      Returns formatted sequence of tokens.
  -->
  <xsl:function name="t:format" as="item()*">
    <xsl:param name="tuples" as="tuple()*"/>

    <xsl:variable name="count" as="xs:integer"
      select="count($tuples)"/>
    <xsl:variable name="verbose" as="xs:boolean"
      select="$count > 3"/>

    <!-- Iterate by position so that the last tuple gets no separator. -->
    <xsl:sequence select="
      for $i in 1 to $count return
      (
        tuple-items($tuples[$i]),
        if ($i = $count) then () 
        else
        (
          ',',
          if ($verbose) then $t:crlf else ' '
        )
      )"/>
  </xsl:function>

</xsl:stylesheet>

The XSLT that uses tuples is more concise and operates in terms of the
algorithm of the task being solved, whereas
the XSLT that uses terminators exposes "lower level" methods to operate on a
sequence of sequences.

6. A combination of tuples and maps allows grouping by several items at
once.
Tuples in the xsl:sort element allow sorting data by several items at once,
which is a shorthand for several xsl:sort elements.
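
For comparison, a minimal sketch of how sorting by several items is written
today, with one xsl:sort element per key. The shortened city list and the
sorted wrapper element are illustrative assumptions. The tuple form in the
comment is hypothetical syntax, shown only to suggest what the shorthand
might look like:

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <xsl:variable name="cities" as="element()*">
      <city name="Tel Aviv" country="Israel"/>
      <city name="Moscow" country="Russia"/>
      <city name="Jerusalem" country="Israel"/>
    </xsl:variable>

    <sorted>
      <xsl:for-each select="$cities">
        <!-- Primary key first, then secondary key. -->
        <xsl:sort select="@country"/>
        <xsl:sort select="@name"/>
        <!--
          Hypothetical tuple shorthand (not valid XSLT 2.0):
          a single xsl:sort with select="tuple((@country, @name))".
        -->
        <xsl:sequence select="."/>
      </xsl:for-each>
    </sorted>
  </xsl:template>

</xsl:stylesheet>

The cities are ordered by country first and by name within each country, so
the two Israeli cities come before Moscow.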

Received on Monday, 7 April 2008 09:57:57 UTC