Re: Proposed RAND() defn from Andy Seaborne on 2010-12-13 (public-rdf-dawg@w3.org from October to December 2010)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Mon, 13 Dec 2010 13:50:21 +0000
To: Steve Harris <steve.harris@garlik.com>
CC: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4D06249D.4060509@epimorphics.com>

Removed RAND(expr) from the grammar.

I will be adding the ability to seed the random number generator
externally to the query execution plan. This covers the main user need
that I foresee, repeatable queries, and does so without any change to
the query, which is an advantage as well.
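
A minimal Python sketch of the idea of seeding outside the query, for
illustration only. It is not taken from any implementation; run_query,
its seed parameter, and the toy FILTER(RAND() < 0.5) it stands in for
are all hypothetical names and assumptions.

```python
import random


def run_query(rows, seed=None):
    """Toy stand-in for an engine evaluating FILTER(RAND() < 0.5).

    The seed is supplied by the caller, outside the query text.
    All names here are hypothetical.
    """
    rng = random.Random(seed)  # one generator per query execution
    # Each RAND() call draws a fresh number, so every row is kept or
    # dropped on its own draw.
    return [row for row in rows if rng.random() < 0.5]


rows = list(range(20))
first = run_query(rows, seed=42)
second = run_query(rows, seed=42)
print(first == second)  # True: same seed, unchanged query, same result
```

With seed=None the generator is seeded from the system, so repeated
runs would normally differ; only the caller's configuration changes
between the repeatable and non-repeatable cases.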

	Andy

On 12/12/10 23:35, Steve Harris wrote:
> FYI, after some more discussion with Andy, and some experiments, my preference is now to not specify any seeding mechanism.
>
> There are at least half a dozen different ways it works in different SQL systems, and they all have peculiar quirks, especially when you try to define them with SPARQL terminology.
>
> When people have had a chance to use RAND() in production systems for a year or more it will hopefully be more obvious how you'd want a seeding function to behave in SPARQL.
>
> - Steve
>
> On 2010-12-02, at 13:53, Steve Harris wrote:
>
>> On 2010-12-02, at 12:12, Andy Seaborne wrote:
>>
>>>> Maybe I don't understand RAND for SQL well enough but I thought that
>>>> RAND() returns different numbers in
>>>>
>>>> FILTER(RAND() > 0.5 && RAND() < 0.6)
>>>>
>>>> (if you want the same number assign it in some way)
>>>>
>>>> As RAND() returns different numbers, so
>>>
>>> MySQL:
>>>
>>> select rand() as A , rand() as B from T ;
>>>
>>> +-------------------+-------------------+
>>> | A                 | B                 |
>>> +-------------------+-------------------+
>>> | 0.231994651474054 | 0.353741641485823 |
>>> +-------------------+-------------------+
>>> 1 row in set (0.00 sec)
>>
>> Yes, my draft def'n of RAND() specifies that behaviour, but:
>>
>> select rand(1) as A , rand(1) as B from foo;
>> +------------------+------------------+
>> | A                | B                |
>> +------------------+------------------+
>> | 0.40540353712198 | 0.40540353712198 |
>> | 0.87161418038571 | 0.87161418038571 |
>> +------------------+------------------+
>> 2 rows in set (0.00 sec)
>>
>> That's important behaviour for usability, due to the bottom-up execution of SQL (and SPARQL).
>>
>> - Steve
>>
>> --
>> Steve Harris, CTO, Garlik Limited
>> 1-3 Halford Road, Richmond, TW10 6AW, UK
>> +44 20 8439 8203  http://www.garlik.com/
>> Registered in England and Wales 535 7233 VAT # 849 0517 11
>> Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
>>
>>
>

Received on Monday, 13 December 2010 13:50:59 UTC