Re: Discussion for Comment JB-7 from Steve Harris on 2011-09-13 (public-rdf-dawg@w3.org from July to September 2011)

From: Steve Harris <steve.harris@garlik.com>
Date: Tue, 13 Sep 2011 12:20:39 +0100
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-Id: <18362EB1-5422-4DBA-9542-196BE447E1E2@garlik.com>

On 2011-09-13, at 11:12, Andy Seaborne wrote:

> This comment needs discussion.
> 
> On 02/08/11 01:24, Jeen Broekstra wrote:
>> 1. String functions
>> 
>> The current set of built-in functions on strings seems rather
>> arbitrarily chosen, with little evident use case requirements backing
>> them up.
>> 
>> For example, while both fn:string-length and fn:substring are included,
>> fn:substring-before and fn:substring-after are not, nor is there any
>> form of 'indexOf'-function. This makes it currently not possible in
>> SPARQL to determine the substring of a string based on a character match.
>> 
>> My comment is not that these functions should or should not be included
>> per se, but rather a question: what criteria did the WG use to decide
>> which functions 'make the cut'?
> 
> fn:index-of applies to sequences, not strings.  There are no index-returning operations in F&O except fn:string-length, at least that I can see.
> 
> Note that XSD strings are 1-based.  This is going to be confusing when many languages use 0-based strings.
> 
> Is anyone wishing to argue for including
> 
> STR?? = fn:substring-before
> STR?? = fn:substring-after
> STRINDEXOF = ????
> 
> or are we-the-WG satisfied with the current set of functions?

We're unhappy with the 1-based strings; they have caused much confusion at Garlik, and some among 4store users.

STRINDEXOF would be useful; I've not seen any demand for the rest.
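
For illustration only, here is a rough Python sketch of how the operations under discussion behave when positions count from 1. The helper names (xsd_substring, substring_before, substring_after, str_index_of) are made up for this example. Returning 0 from str_index_of when there is no match is an assumption, not something F&O or the WG defines, and xsd_substring only handles integer arguments.

```python
from typing import Optional


def xsd_substring(s: str, start: int, length: Optional[int] = None) -> str:
    """fn:substring-style slicing, integer arguments only: the first character is position 1."""
    end = None if length is None else start + length
    # Keep characters whose 1-based position p satisfies start <= p < start + length.
    return "".join(
        ch
        for pos, ch in enumerate(s, start=1)
        if pos >= start and (end is None or pos < end)
    )


def substring_before(s: str, needle: str) -> str:
    """fn:substring-before: text before the first match, "" if no match or empty needle."""
    if needle == "":
        return ""
    head, sep, _ = s.partition(needle)
    return head if sep else ""


def substring_after(s: str, needle: str) -> str:
    """fn:substring-after: text after the first match, "" if no match, s if empty needle."""
    if needle == "":
        return s
    _, sep, tail = s.partition(needle)
    return tail if sep else ""


def str_index_of(s: str, needle: str) -> int:
    """Hypothetical STRINDEXOF: 1-based position of the first match, 0 if absent (assumed)."""
    return s.find(needle) + 1  # str.find is 0-based and returns -1 when absent


if __name__ == "__main__":
    text = "motor car"
    print(repr(xsd_substring(text, 6)))   # 1-based start
    print(repr(text[6:]))                 # the 0-based slice most languages would give
    print(str_index_of(text, "car"))
```

The two slices in the usage lines differ by one character, which is the off-by-one confusion described above.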

>> 2. Hash functions
>> 
>> Perhaps my strongest problem with the current Working Draft is the
>> inclusion of 6 variations for calculating a hash. Arguably calculating a
>> hash is a _very_ outlying use case that comes up rarely in practical
>> applications of SPARQL. I'm not denying there are valid use cases for
>> it, but adding six different varieties seems, frankly, outlandish.
>> 
>> There is a practical consideration for me in this as well: on the Java
>> platform, SHA-224 in particular is not supported by the default
>> cryptography architecture. The fact that SPARQL includes it forces me to
>> add a third-party dependency to my SPARQL implementation for a feature
>> that very few users will ever need. I find this wasteful and an
>> unnecessary burden, both on implementors and on users of the software.
>> 
>> Given that the SPARQL specification supports the adding of custom
>> functions, so that any vendor who needs to can extend the language, I
>> would suggest that this kind of niche functionality has no place in the
>> core spec and should be removed, or at the very least only a minimal set
>> of hash functions (2 or 3, tops) should be required. In picking this
>> subset, the WG should IMHO consider which algorithms are most commonly
>> used and supported on various platforms.
> 
> I have some sympathy with this.  I don't have enough experience to know which SHA2 functions are commonly used.  Does anyone have some input here?

Same feelings, but no input. We use SHA1 and MD5 only.
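
As a point of comparison on platform support, a small Python sketch that computes all six digests with the standard library alone. It assumes the six variants in the draft are MD5, SHA1, SHA224, SHA256, SHA384 and SHA512. The helper names hash_hex and all_hashes are invented for this example, and encoding the input as UTF-8 before hashing is also an assumption.

```python
import hashlib

# Assumed to be the six hash variants under discussion.
ALGORITHMS = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


def hash_hex(algorithm: str, text: str) -> str:
    """Return the lowercase hex digest of text (encoded as UTF-8) for the named algorithm."""
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


def all_hashes(text: str) -> dict:
    """Compute every digest in ALGORITHMS for the same input."""
    return {name: hash_hex(name, text) for name in ALGORITHMS}


if __name__ == "__main__":
    for name, digest in all_hashes("abc").items():
        print(f"{name:7s} {digest}")
```

This only shows that one platform ships all six out of the box; it says nothing about how often each is used in practice.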

- Steve

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD

Received on Tuesday, 13 September 2011 11:21:16 UTC