Re: Discussion for Comment JB-7 from Matthew Perry on 2011-09-13 (public-rdf-dawg@w3.org from July to September 2011)

From: Matthew Perry <matthew.perry@oracle.com>
Date: Tue, 13 Sep 2011 08:25:40 -0400
To: public-rdf-dawg@w3.org
Message-ID: <4E6F4BC4.4040004@oracle.com>

I also feel the same way about the number of hash functions. We only use SHA1 and MD5.

- Matt
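
For anyone weighing up the size of the set, here is a rough Python sketch of what the six variants amount to. It assumes the six in the Working Draft are MD5, SHA1, SHA224, SHA256, SHA384 and SHA512 (the thread itself only says "6 variations"); `ALGORITHMS` and `hex_digests` are illustrative names, not anything from the spec.

```python
import hashlib

# Assumed list of the six hash variants under discussion; the thread
# does not enumerate them, so treat this tuple as an assumption.
ALGORITHMS = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


def hex_digests(text: str) -> dict:
    """Return the lowercase hex digest of `text` for each algorithm.

    The input is encoded as UTF-8 before hashing (an assumption of
    this sketch).
    """
    data = text.encode("utf-8")
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


if __name__ == "__main__":
    for name, digest in hex_digests("abc").items():
        print(f"{name:7s} {len(digest) * 4:4d} bits  {digest}")
```

The variants differ only in algorithm and digest length, so trimming the required set to two or three would leave the rest easy to supply as extension functions.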

On 9/13/2011 7:20 AM, Steve Harris wrote:
> On 2011-09-13, at 11:12, Andy Seaborne wrote:
>
>> This comment needs discussion.
>>
>> On 02/08/11 01:24, Jeen Broekstra wrote:
>>> 1. String functions
>>>
>>> The current set of built-in functions on strings seems rather
>>> arbitrarily chosen, with little evident use case requirements backing
>>> them up.
>>>
>>> For example, while both fn:string-length and fn:substring are included,
>>> fn:substring-before and fn:substring-after are not, nor is there any
>>> form of 'indexOf'-function. This makes it currently not possible in
>>> SPARQL to determine the substring of a string based on a character match.
>>>
>>> My comment is not that these functions should or should not be included
>>> per se, but rather a question: what criteria did the WG use to decide
>>> which functions 'make the cut'?
>> fn:index-of applies to sequences, not strings.  There are no index-returning operations in F&O except fn:string-length, at least that I can see.
>>
>> Note that XSD strings are 1-based.  This is going to be confusing when many languages use 0-based strings.
>>
>> Is anyone wishing to argue for including
>>
>> STR?? = fn:substring-before
>> STR?? = fn:substring-after
>> STRINDEXOF = ????
>>
>> or are we-the-WG satisfied with the current set of functions?
> We're unhappy with the 1-based strings, it's caused much confusion in Garlik, and some amongst 4store users.
>
> STRINDEXOF would be useful, the rest I've not seen any demand for.
>
>>> 2. Hash functions
>>>
>>> Perhaps my strongest problem with the current Working Draft is the
>>> inclusion of 6 variations for calculating a hash. Arguably calculating a
>>> hash is a _very_ outlying use case that comes up rarely in practical
>>> applications of SPARQL. I'm not denying there are valid use cases for
>>> it, but adding six different varieties seems, frankly, outlandish.
>>>
>>> There is a practical consideration for me in this as well: on the Java
>>> platform, SHA-224 in particular is not supported by the default
>>> cryptography architecture. The fact that SPARQL includes it forces me to
>>> add a third-party dependency to my SPARQL implementation for a feature
>>> that very few users will ever need. I find this wasteful and an
>>> unnecessary burden, both on implementors and on users of the software.
>>>
>>> Given that the SPARQL specification supports the adding of custom
>>> functions, so that any vendor who needs to can extend the language, I
>>> would suggest that this kind of niche functionality has no place in the
>>> core spec and should be removed, or at the very least only a minimal set
>>> of hash functions (2 or 3, tops) should be required. In picking this
>>> subset, the WG should IMHO consider which algorithms are most commonly
>>> used and supported on various platforms.
>> I have some sympathy with this.  I don't have enough experience to know which SHA2 functions are commonly used.  Does anyone have some input here?
> Same feelings, but no input. We use SHA1 and MD5 only.
>
> - Steve
>
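
To make the string-function discussion above concrete, here is a minimal Python sketch of the behaviour being debated: fn:substring-before, fn:substring-after, a 1-based substring, and a possible STRINDEXOF. All four function names are hypothetical, the 0-for-no-match convention in `str_index_of` is an assumption (the thread leaves STRINDEXOF undefined), and `substr_1based` only handles integer arguments with start >= 1 rather than the full F&O rounding rules.

```python
def substring_before(s: str, sep: str) -> str:
    """Text before the first occurrence of sep, or "" if sep is absent."""
    i = s.find(sep)
    return s[:i] if i >= 0 else ""


def substring_after(s: str, sep: str) -> str:
    """Text after the first occurrence of sep, or "" if sep is absent."""
    i = s.find(sep)
    return s[i + len(sep):] if i >= 0 else ""


def str_index_of(s: str, sep: str) -> int:
    """Hypothetical STRINDEXOF: 1-based position of sep, 0 if absent.

    The return convention is an assumption of this sketch.
    """
    return s.find(sep) + 1


def substr_1based(s: str, start: int, length=None) -> str:
    """Simplified 1-based substring for integer start >= 1."""
    begin = start - 1  # convert the 1-based position to Python's 0-based index
    end = None if length is None else begin + length
    return s[begin:end]


if __name__ == "__main__":
    mbox = "foo@example.org"
    print(substring_before(mbox, "@"))
    print(substring_after(mbox, "@"))
    print(str_index_of(mbox, "@"))
    print(substr_1based(mbox, str_index_of(mbox, "@") + 1))
```

Without substring-before/after or an index function, the last line cannot be expressed: there is no way to find where "@" sits, which is the gap raised in the original comment. The off-by-one conversion in `substr_1based` is the 1-based versus 0-based mismatch mentioned above.
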
Received on Tuesday, 13 September 2011 12:26:26 UTC