- From: Steve Harris <steve.harris@garlik.com>
- Date: Wed, 1 Dec 2010 12:56:35 +0000
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
On 2010-11-29, at 21:49, Andy Seaborne wrote:

> On 29/11/10 12:38, Steve Harris wrote:
>> N.B. I'm not sure that a SQL-style definition of RAND(seed) is really
>> practical to define in a SPARQL context without changing a lot of other
>> things.
>>
>> Though there's a Solution Sequence, there's nothing that requires the
>> SPARQL engine to execute FILTER expressions in any particular order so
>> far as I can tell. We could either drop this feature (not my
>> preference), or relax the wording - if this is an issue. Relaxing the
>> wording would make it hard to test. Thoughts?
>>
>> - Steve
>>
>> ----
>>
>> RAND
>>
>> The RAND function returns an xsd:double in the range [0,1), i.e. 0 ≤
>> RAND() < 1. The return value may be generated using some stochastic
>> process, or a pseudorandom sequence.
>>
>> If RAND() is called with no arguments, then it returns a potentially
>> different random/pseudorandom value for each invocation.
>>
>> If RAND() is called with a numeric argument, then the argument is used
>> as a seed value, returning a consistent value in [0,1) for each solution
>> in the solution sequence for which it is evaluated. Such that, for a
>> given seed RAND(seed) will return the same value whenever it's invoked
>> for evaluation of the first solution in the solution sequence, and a
>> possibly different consistent value for the second solution, and
>> so on.
>
> Maybe we can specify RAND(seed) by simply saying that it will generate a
> pseudorandom sequence with the suggestion ("SHOULD") that it generate the
> same sequence on each run as a debugging aid. This decouples it from
> solution sequences.

A "SHOULD" is probably a good idea. It's not just a debugging aid though, it's for repeatability generally.

> An implementation can be simply a random number generator like srand(N).

I'm not sure whose / which srand(n) you're referring to.

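One way to read the per-solution wording quoted above can be sketched in Python. This is only an illustration under assumptions, not proposed spec text: `seeded_rand` and `solution_index` are hypothetical names, and Python's `random.Random` stands in for whatever generator an engine would actually use. The value depends only on the seed and the position of the solution, so the order in which an engine evaluates FILTER expressions does not matter.

```python
import random


def seeded_rand(seed: float, solution_index: int) -> float:
    """Return a value in [0, 1) determined only by (seed, solution_index).

    Hypothetical helper: a fresh generator is built on every call, so the
    result does not depend on how many times, or in what order, the
    function has been called before.
    """
    # Seeding with a string is deterministic across runs in CPython.
    rng = random.Random(f"{seed}:{solution_index}")
    return rng.random()


# Evaluate the same five solutions in two different orders.
forward = [seeded_rand(1, i) for i in range(5)]
backward = [seeded_rand(1, i) for i in reversed(range(5))]
```

Here `forward` and `backward` hold the same five values, one list being the reverse of the other. The cost of this design is that the engine has to know each solution's position in the sequence at the point where the expression is evaluated.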
The key thing is that you get the same return value twice if you do something like:

    FILTER(RAND(1) > 0.5 && RAND(1) < 0.6)

If an implementation can't be consistent with its RAND(?n) results from execution run to execution run, that's maybe OK. I've only used RAND(s) in SQL a handful of times, so I'm not familiar with all the uses.

> Obviously, an implementation that changes execution plan based on external
> factors (e.g. load, RAM available or something smart like that) could
> potentially change the number of calls to RAND but that's a bit
>
> (it already has to worry about RAND not being a strict function

- Steve

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203
http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Wednesday, 1 December 2010 12:57:11 UTC