- From: Steve Harris <steve.harris@garlik.com>
- Date: Thu, 2 Dec 2010 13:51:05 +0000
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
On 2010-12-02, at 11:38, Andy Seaborne wrote:

>>> Maybe we can specify RAND(seed) by simply saying that it will generate a pseudorandom sequence with the suggestion ("SHOULD") generate the same sequence on each run as a debugging aid. This decouples it from solution sequences.
>>
>> A "SHOULD" is probably a good idea. It's not just a debugging aid though, it's for repeatability generally.
>>
>>> An implementation can be simply a random number generator like srand(N).
>>
>> I'm not sure whose / which srand(n) you're referring to.
>
> This one:
>
> http://www.gnu.org/s/libc/manual/html_node/ISO-Random.html
>
>> The key thing is that you get the same return value twice if you do something like:
>> FILTER(RAND(1) > 0.5 && RAND(1) < 0.6)
>
> For me, that's not necessary. For predictability, all I require is that each call of RAND(seed) returns the same number at the same point in execution across runs.
>
> Maybe I don't understand RAND for SQL well enough but I thought that RAND() returns different numbers in
>
> FILTER(RAND() > 0.5 && RAND() < 0.6)

It does, but not if you provide a seed number; the seed gives you a new number per row.

> (if you want the same number assign it in some way)

SQL doesn't have per-row assignment, and it's going to be problematic in SPARQL (see below).

> As RAND() returns different numbers, so
>
> FILTER(RAND(1) > 0.5 && RAND(1) < 0.6)
>
> should, just the same numbers at the same invocation count every run.

That doesn't make me comfortable. The implementation in SQL is something like: [in very naive terms, obviously]

    srand(row_num + seed);
    return (double)rand() / ((double)RAND_MAX + 1.0);

Otherwise you have issues about execution order, which might not be stable between executions, or even execution phases.

Also,

    OPTIONAL { ?x :a ?y FILTER(RAND(1) < 0.5) }
    OPTIONAL { ?s :b ?z FILTER(RAND(1) < 0.5) }

is going to have both undesirable and unpredictable behaviour.

    BIND(RAND(1) AS ?r)
    OPTIONAL { ?x :a ?y FILTER(?r < 0.5) }
    ...

won't work, because of the scoping, right? You could do something with nested OPTIONALs, but anyone who's familiar with SQL's behaviour is not going to be very happy.

- Steve

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203 http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Thursday, 2 December 2010 13:51:41 UTC
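
Editorial sketch, not part of the original message: the per-row seeding described in the message can be written out as a small C function. It assumes the naive scheme quoted there (reseed with row_num + seed, then take one value). The names seeded_rand and row_passes_filter are hypothetical and do not come from any SQL or SPARQL implementation.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a value in [0, 1) that depends only on
 * (seed, row_num), following the naive scheme from the message. */
double seeded_rand(unsigned int seed, unsigned int row_num)
{
    /* Reseed on every call so the result does not depend on how many
     * times, or in what order, the function was called before. */
    srand(row_num + seed);
    /* Dividing by RAND_MAX + 1.0 keeps the result strictly below 1. */
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Mirrors FILTER(RAND(1) > 0.5 && RAND(1) < 0.6) for a single row:
 * both calls yield the same value, so the range test is meaningful. */
int row_passes_filter(unsigned int seed, unsigned int row_num)
{
    return seeded_rand(seed, row_num) > 0.5
        && seeded_rand(seed, row_num) < 0.6;
}
```

Because the value is a function of the seed and the row only, evaluation order does not affect the result. The cost is that each call resets the global rand() state, which a real engine would likely avoid by using a private generator.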