Discussion for Comment JB-7

This comment needs discussion.

On 02/08/11 01:24, Jeen Broekstra wrote:
> 1. String functions
>
> The current set of built-in functions on strings seems rather
> arbitrarily chosen, with little evident use case requirements backing
> them up.
>
> For example, while both fn:string-length and fn:substring are included,
> fn:substring-before and fn:substring-after are not, nor is there any
> form of 'indexOf'-function. This makes it currently not possible in
> SPARQL to determine the substring of a string based on a character match.
>
> My comment is not that these functions should or should not be included
> per se, but rather a question: what criteria did the WG use to decide
> which functions 'make the cut'?

fn:index-of applies to sequences, not strings.  As far as I can see,
there are no index-returning operations on strings in F&O; the closest
is fn:string-length.

Note that string positions in F&O are 1-based.  This is going to be
confusing when many languages use 0-based indexing.
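
To make the 1-based point concrete, here is a rough Python sketch of how the
two F&O functions behave, plus one possible shape for an index function.
The name str_index_of and its "0 means not found" convention are assumptions
for illustration only; nothing like it is defined in F&O or in the current
draft.  Collations are ignored.

```python
def substring_before(s: str, match: str) -> str:
    """Mimics fn:substring-before: text before the first match, else ""."""
    i = s.find(match)          # Python positions are 0-based
    return s[:i] if i >= 0 else ""


def substring_after(s: str, match: str) -> str:
    """Mimics fn:substring-after: text after the first match, else ""."""
    i = s.find(match)
    return s[i + len(match):] if i >= 0 else ""


def str_index_of(s: str, match: str) -> int:
    """Hypothetical STRINDEXOF: 1-based position of the first match.

    Returns 0 when there is no match (an assumed convention).
    """
    return s.find(match) + 1   # find() gives -1 when absent, so this gives 0


before = substring_before("tattoo", "attoo")
after = substring_after("tattoo", "tat")
position = str_index_of("tattoo", "too")
```

With a 1-based index function, the result could be passed straight to
fn:substring, which is also 1-based; a 0-based one would need adjusting
on every use.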

Is anyone wishing to argue for including

STR?? = fn:substring-before
STR?? = fn:substring-after
STRINDEXOF = ????

or are we-the-WG satisfied with the current set of functions?

> 2. Hash functions
>
> Perhaps my strongest problem with the current Working Draft is the
> inclusion of 6 variations for calculating a hash. Arguably calculating a
> hash is a _very_ outlying use case that comes up rarely in practical
> applications of SPARQL. I'm not denying there are valid use cases for
> it, but adding six different varieties seems, frankly, outlandish.
>
> There is a practical consideration for me in this as well: on the Java
> platform, SHA-224 in particular is not supported by the default
> cryptography architecture. The fact that SPARQL includes it forces me to
> add a third-party dependency to my SPARQL implementation for a feature
> that very few users will ever need. I find this wasteful and an
> unnecessary burden, both on implementors and on users of the software.
>
> Given that the SPARQL specification supports the adding of custom
> functions, so that any vendor who needs to can extend the language, I
> would suggest that this kind of niche functionality has no place in the
> core spec and should be removed, or at the very least only a minimal set
> of hash functions (2 or 3, tops) should be required. In picking this
> subset, the WG should IMHO consider which algorithms are most commonly
> used and supported on various platforms.

I have some sympathy with this.  I don't have enough experience to know
which SHA-2 functions are commonly used.  Does anyone have some input here?

An alternative is to define a single function for SHA2 and name the variant.

SHA2("algName", ?string)


(Of course, URIs for names would be better, but it is a fixed set of 4
names.)
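
As a rough sketch of what the single-function form could look like on the
implementation side, here it is in Python using the standard hashlib module.
The function name sha2, the spelling of the four algorithm names, the UTF-8
encoding of the input and the lowercase hex result are all assumptions made
for illustration; none of them is settled.

```python
import hashlib

# Assumed spelling of the four variant names; the draft does not fix these.
_SHA2_VARIANTS = {
    "SHA-224": hashlib.sha224,
    "SHA-256": hashlib.sha256,
    "SHA-384": hashlib.sha384,
    "SHA-512": hashlib.sha512,
}


def sha2(alg_name: str, text: str) -> str:
    """Hash `text` (encoded as UTF-8) with the named SHA-2 variant.

    Returns the digest as a lowercase hex string.  Raises ValueError for
    an unknown name, which would correspond to an evaluation error.
    """
    try:
        constructor = _SHA2_VARIANTS[alg_name]
    except KeyError:
        raise ValueError(f"unsupported SHA-2 variant: {alg_name!r}")
    return constructor(text.encode("utf-8")).hexdigest()


digest_256 = sha2("SHA-256", "abc")
```

An implementation that cannot supply one variant would then fail on that
name only, rather than being missing a whole named function.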

SHA-224 and SHA-384 are not simply truncated versions of SHA-256 and
SHA-512 because they have different initial values.
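
This is easy to check.  A small Python sketch using the standard hashlib
module (the input "abc" is just an arbitrary test value) compares each
shorter digest with the truncated longer one:

```python
import hashlib

data = b"abc"  # arbitrary test input

# SHA-224 produces 28 bytes; compare with the first 28 bytes of SHA-256.
sha224 = hashlib.sha224(data).digest()
sha256_truncated = hashlib.sha256(data).digest()[:28]

# SHA-384 produces 48 bytes; compare with the first 48 bytes of SHA-512.
sha384 = hashlib.sha384(data).digest()
sha512_truncated = hashlib.sha512(data).digest()[:48]

# Both comparisons come out different, so an implementation cannot derive
# the shorter hashes by cutting down the longer ones.
sha224_differs = sha224 != sha256_truncated
sha384_differs = sha384 != sha512_truncated
```

So a platform that lacks SHA-224 cannot get it from its SHA-256 support
without implementing the variant itself.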

Thoughts?

http://en.wikipedia.org/wiki/SHA-2

>
> Regards,
>
> Jeen Broekstra
>
>

	Andy

Received on Tuesday, 13 September 2011 10:13:12 UTC