- From: Sampo Syreeni <decoy@iki.fi>
- Date: Wed, 2 Sep 2009 00:49:45 +0300 (EEST)
- To: Brian Stamper <stamper.10@osu.edu>
- cc: SW-forum <semantic-web@w3.org>
On 2009-09-01, Brian Stamper wrote:

> How does the number of triples correlate to the processing time of
> ORDER BY; e.g. is it logarithmic, quadratic, exponential, etc.? Is
> this an issue of the sort algorithm, or the structure of the data,
> or..?

In the optimum, in general, and in the average case it is n*log(n) in the number of input bytes to be sorted. If the number of returned tuples/triples can be bounded to within a constant, hashing can bring that down to n.

In general, and in the worst case, it depends very much on the storage organization, and in particular the constant in front of the expression can change drastically.

Finally, if there is some serious optimization going on, you can bring the partial cost down to the amount of data that needs to be inspected in order to make a decision -- which can sometimes be sublinear -- plus the linear cost of the data delivered, plus or minus compressibility (which in the offline case can yield sublinear costs as well, in the optimum, but never on average unless your data is certain to be unevenly distributed).

The precise query plan also sometimes yields surprises, because the asymptotically optimal plan rarely coincides with the optimal plan against a finite set of data (and the optimizer statistics). This means that in the subasymptotic regime, changes of plan can drastically alter the total complexity, and so cause serious discontinuities/kinks in the partial costs, until the total cost approaches the asymptotic regime in all parameters simultaneously.
--
Sampo Syreeni, aka decoy - decoy@iki.fi, http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
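[A minimal sketch of the bounded-result case above, added editorially; the n, k, and data values are hypothetical. A full sort for ORDER BY costs O(n log n), but if the returned rows are bounded by a constant k (e.g. ORDER BY with LIMIT k), a bounded selection runs in O(n log k), i.e. linear in n for fixed k:]

    import heapq
    import random

    n = 1_000_000
    k = 10  # hypothetical constant bound on the returned rows

    rows = [random.random() for _ in range(n)]

    # Full sort: O(n log n), orders all n rows just to keep k of them.
    full_sort = sorted(rows)[:k]

    # Bounded selection: O(n log k), maintains only a k-element heap
    # while scanning, so the cost is linear in n for constant k.
    top_k = heapq.nsmallest(k, rows)

    assert full_sort == top_k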
Received on Tuesday, 1 September 2009 21:50:35 UTC