Semantics of p:string-replace

I was thinking about the semantics of p:string-replace in light of
some of the questions Henry asked. I think there's a tricky edge case
that might require an additional option.

The declaration for string-replace is:

  <p:declare-step type="p:string-replace">
    <p:input port="source" sequence="no"/>
    <p:output port="result" sequence="no"/>
    <p:option name="match" required="yes"/>
    <p:option name="replace" required="yes"/>
  </p:declare-step>

This step acts as a filter over the source document. For every node in
the source document, if it matches the XPath match pattern given in
the "match" option, then it is processed by the step. Otherwise, it is
passed through as if this were the p:identity step.

This step processes matched nodes in the following way:

1. The value of the "replace" option is interpreted as an XPath
expression with the matched node as the context node.

2. The string value of the result of evaluating that expression is
determined.

3. That string value is passed to the result document instead of
the matched node.

In the special case where the matched node is an attribute node, only
the attribute's value is replaced by the string value of the result.
The attribute node is preserved.
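The rules above can be sketched in Python. This is only a rough model under stated assumptions, not an implementation of the step: it uses the standard xml.etree.ElementTree module, matches by element or attribute name rather than by a real XPath match pattern, and takes a Python callable in place of the XPath expression in the "replace" option. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def string_value(elem):
    # XPath string value of an element: all descendant text, concatenated.
    return "".join(elem.itertext())


def replace_elements(root, tag, replace):
    """Replace each descendant element named `tag` with replace(string value).

    Assumption: matching is by tag name only, and the root element itself
    is never replaced because it has no parent to receive the text.
    """
    # Snapshot (parent, child) pairs so the tree is not mutated mid-iteration.
    pairs = [(parent, child)
             for parent in root.iter()
             for child in list(parent)
             if child.tag == tag]
    for parent, child in pairs:
        new_text = replace(string_value(child)) + (child.tail or "")
        index = list(parent).index(child)
        if index == 0:
            # No preceding sibling: the text joins the parent's leading text.
            parent.text = (parent.text or "") + new_text
        else:
            # Otherwise it joins the text that follows the previous sibling.
            previous = parent[index - 1]
            previous.tail = (previous.tail or "") + new_text
        parent.remove(child)


def replace_attributes(root, name, replace):
    """Replace only the value of each attribute `name`; the attribute stays."""
    for elem in root.iter():
        if name in elem.attrib:
            elem.set(name, replace(elem.get(name)))


# Whitespace-free version of the sample document, to keep the output short.
SOURCE = ('<div><p class="value1 value2">Some '
          '<a href="http://example.com/">linked</a> text.</p></div>')

# Attribute match: partition(" ")[2] stands in for substring-after(., ' ').
doc1 = ET.fromstring(SOURCE)
replace_attributes(doc1, "class", lambda v: v.partition(" ")[2])
attribute_result = ET.tostring(doc1, encoding="unicode")

# Element match: the lambda stands in for concat('[', ., ']').
doc2 = ET.fromstring(SOURCE)
replace_elements(doc2, "a", lambda s: "[" + s + "]")
element_result = ET.tostring(doc2, encoding="unicode")

print(attribute_result)
print(element_result)
```

Note that the element case discards the matched element along with its attributes, while the attribute case keeps the attribute and changes only its value.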

Given the following source document:

  <div>
    <p class="value1 value2">
      Some
      <a href="http://example.com/">linked</a>
      text.
    </p>
  </div>

Here are some examples:

1.

  <p:string-replace>
    <p:option name="match" value="@class"/>
    <p:option name="replace" value="substring-after(.,' ')"/>
  </p:string-replace>

produces

  <div>
    <p class="value2">
      Some
      <a href="http://example.com/">linked</a>
      text.
    </p>
  </div>

2.

  <p:string-replace>
    <p:option name="match" value="a"/>
    <p:option name="replace" value="concat('[',.,']')"/>
  </p:string-replace>

produces

  <div>
    <p class="value1 value2">
      Some
      [linked]
      text.
    </p>
  </div>

3.

  <p:string-replace>
    <p:option name="match" value="p"/>
    <p:option name="replace" value="concat('[',.,']')"/>
  </p:string-replace>

produces

  <div>
    [
      Some
      linked
      text.
    ]
  </div>

4.

  <p:string-replace>
    <p:option name="match" value="p/text()"/>
    <p:option name="replace" value="concat('[',.,']')"/>
  </p:string-replace>

produces

  <div>
    <p class="value1 value2">[
      Some
      ]<a href="http://example.com/">linked</a>[
      text.
    ]</p>
  </div>

5.

  <p:string-replace>
    <p:option name="match" value="p/node()"/>
    <p:option name="replace" value="concat('[',.,']')"/>
  </p:string-replace>

produces

  <div>
    <p class="value1 value2">[
      Some
      ][linked][
      text.
    ]</p>
  </div>

I don't think there's any way to produce the result:

  <div>
    <p class="value1 value2">[
      Some
      linked
      text.
    ]</p>
  </div>

unless we add an explicit option to preserve the wrapper:

  <p:string-replace>
    <p:option name="match" value="p"/>
    <p:option name="replace" value="concat('[',.,']')"/>
    <p:option name="preserve-element-wrapper" value="yes"/>
  </p:string-replace>
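To make the intent of that option concrete, here is a rough Python sketch of what preserving the wrapper might mean. It is only one reading of the proposal: the element and its attributes are kept, and its content is replaced by the computed string. As assumptions, it matches by tag name rather than by a real match pattern, uses a Python callable in place of the "replace" expression, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def replace_content_keep_wrapper(root, tag, replace):
    """Keep each element named `tag` (and its attributes), but replace its
    content with replace(string value of the element)."""
    # Snapshot the matches so the tree is not mutated while iterating.
    for elem in list(root.iter(tag)):
        new_text = replace("".join(elem.itertext()))
        # Drop all children; the element's own tail and attributes are untouched.
        for child in list(elem):
            elem.remove(child)
        elem.text = new_text


# Whitespace-free version of the sample document, to keep the output short.
SOURCE = ('<div><p class="value1 value2">Some '
          '<a href="http://example.com/">linked</a> text.</p></div>')

doc = ET.fromstring(SOURCE)
# The lambda stands in for concat('[', ., ']').
replace_content_keep_wrapper(doc, "p", lambda s: "[" + s + "]")
wrapped_result = ET.tostring(doc, encoding="unicode")
print(wrapped_result)
```

With the wrapper preserved, the p element and its class attribute survive and only the content collapses to the bracketed string.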

I'm not opposed to that, but I don't feel strongly about it either.

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | Everything should be made as simple as
http://nwalsh.com/            | possible, but no simpler.

Received on Thursday, 26 April 2007 16:58:55 UTC