- From: John Lumley <john@saxonica.com>
- Date: Fri, 19 Jul 2013 10:47:01 +0100
- To: public-expath@w3.org
- Message-ID: <51E90B15.9090703@saxonica.com>
In commenting on the original draft proposal, Mike Kay suggested that a
find function, something like:
bin:find($in as xs:base64Binary?, $offset as xs:integer, $pattern as
xs:base64Binary) as xs:integer?
would be useful for searching through binary data and as a complement to
bin:decode-string(). In putting together some changes to the spec
proposal and a trial implementation for this, a few points arose that may
be worth a little discussion.
At first I thought: this is an equivalent of fn:index-of($seq,$search),
which returns a sequence of indices of members of $seq that are equal
to $search. But what we are proposing is /not quite the same/: we're
looking for occurrences of a sequence of pattern bytes in the input byte
sequence, whereas fn:index-of only matches single items. In our case,
if we decided to return all the 'matches', we could have overlap:
bin:find((3,4,4,4,4,5),0,(4,4)) => (1,2,3)
(I have used the octet representation, as I think most of us can't read
base64 directly ;-). $offset is zero-based.)
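
To make the overlap concrete, here is a small Python model. It is only an illustration, not part of the proposal: the name all_matches is hypothetical, and Python bytes stand in for xs:base64Binary. It lists every zero-based offset at which the pattern occurs, allowing matches to overlap.

```python
from typing import List


def all_matches(data: bytes, pattern: bytes) -> List[int]:
    """Return every zero-based offset where pattern occurs in data, overlaps included."""
    if not pattern:
        # Assumption for this sketch only: an empty pattern matches nowhere.
        return []
    width = len(pattern)
    # Slide a window of the pattern's width across the data one byte at a time.
    return [i for i in range(len(data) - width + 1) if data[i:i + width] == pattern]


print(all_matches(bytes([3, 4, 4, 4, 4, 5]), bytes([4, 4])))  # [1, 2, 3]
```

A scan that skipped past each match before resuming would report only offsets 1 and 3 here, which is why the choice between overlapping and non-overlapping results matters.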
That led me to think that it had more in common with substring
matching, but then I realised there isn't any function fn:find-substring()
- the closest would be to build some compound function using
fn:substring-before() or fn:tokenize($in,$pattern) and examine the
return values.
My suggestion is that we stay with the bin:find() function as declared
above, just returning the index of the /first/ occurrence, or empty if
none. Those who want /all/ can build a compound iterative/recursive
function using bin:find():
<xsl:function name="bin:find-all" as="xs:integer*">
  <xsl:param name="data" as="xs:base64Binary?"/>
  <xsl:param name="offset" as="xs:integer"/>
  <xsl:param name="pattern" as="xs:base64Binary"/>
  <!-- Test with exists(): a match at offset 0 has an effective boolean
       value of false, so if($found) would wrongly treat it as no match. -->
  <xsl:sequence
    select="let $found := bin:find($data,$offset,$pattern)
            return
              if (exists($found)) then ($found,
                if ($found + 1 lt bin:length($data)) then
                  bin:find-all($data,$found + 1,$pattern) else ())
              else ()"/>
</xsl:function>
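
The same construction can be sketched in Python as an executable model. This is an illustration under stated assumptions, not the proposed implementation: find and find_all are hypothetical stand-ins for bin:find and bin:find-all, Python bytes stand in for xs:base64Binary, and None plays the role of the empty sequence. The model tests "found is not None" rather than truthiness, since offset 0 is a valid result.

```python
from typing import List, Optional


def find(data: bytes, offset: int, pattern: bytes) -> Optional[int]:
    """Model of bin:find: offset of the first match at or after offset, else None."""
    index = data.find(pattern, offset)
    return None if index == -1 else index


def find_all(data: bytes, offset: int, pattern: bytes) -> List[int]:
    """Collect all (possibly overlapping) match offsets by repeatedly calling find."""
    results: List[int] = []
    found = find(data, offset, pattern)
    while found is not None:  # 0 is a valid offset, so do not test truthiness
        results.append(found)
        if found + 1 >= len(data):
            break  # nothing left to search after this match
        found = find(data, found + 1, pattern)
    return results


print(find_all(bytes([3, 4, 4, 4, 4, 5]), 0, bytes([4, 4])))  # [1, 2, 3]
print(find_all(bytes([4, 4, 4]), 0, bytes([4, 4])))           # [0, 1]
```

Restarting the search at $found + 1, rather than past the end of the match, is what lets overlapping occurrences be reported.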
John
--
*John Lumley* MA PhD CEng FIEE
john@saxonica.com
on behalf of Saxonica Ltd
Received on Friday, 19 July 2013 09:47:20 UTC