- From: Dave Reynolds <der@hplb.hpl.hp.com>
- Date: Wed, 15 Jan 2003 17:47:24 +0000
- To: Bob MacGregor <macgregor@ISI.EDU>
- CC: www-rdf-interest@w3.org, tar@ISI.EDU, hans@ISI.EDU
Bob,

> >What is the space complexity of the algorithm like?
>
> The contexts are organized into a hierarchy, and numbered so that a
> parent's number is always smaller than the numbers of its descendants.
> Some sort of data structure is needed that can quickly determine, given
> two contexts C1 and C2, whether C1 inherits C2. One could imagine that
> the space requirements for this would be superlinear in the worst case,
> but they should be linear (in the number of contexts) for the average
> case.
>
> Each statement/triple points to a list of contexts where it has
> appeared within an assertion. If a triple [S P O] is asserted to be
> true in contexts C1, C2, and C3, then there is a triple object that
> points to a list containing C1, C2, and C3 (ordered according to the
> numbers they are assigned, largest number first).

So you have structure sharing in the sense that there is only one triple
object, which can appear in multiple contexts (a rough sketch of one way
these structures could fit together is in the P.S. below).

In the case where there are lots of triples (N) and lots of contexts (M),
the cost is dominated by those membership lists, which would be O(NxM) in
the worst case. However, I imagine the lists also structure-share, and
that the ordering helps with that, so the average case will be *much*
better than that.

Presumably there is also an additional index structure of some sort to
give you the constant-time truth value lookup - that also needs to have
good scaling properties.

> If we were using a quad representation, then memory is linear in
> the number of quads.

Yes ... but presumably the number of quads is NxM, not just N. In your
case it sounds like M could grow quite large, which might be a problem.

Dave
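P.S. To make the numbering idea concrete, here is a rough Python sketch
of one structure with the properties you describe: preorder interval
labelling, which makes every parent's number smaller than its
descendants' numbers and gives a constant-time inheritance test in
space linear in the number of contexts. All the names and details below
are my guesses at what such a thing might look like, not necessarily
what you have actually implemented:

    class Context:
        def __init__(self, name, parent=None):
            self.name = name
            self.children = []
            self.num = None      # preorder number: parent.num < descendant.num
            self.maxnum = None   # largest preorder number in this subtree
            if parent is not None:
                parent.children.append(self)

    def number_contexts(ctx, counter=0):
        """Assign preorder numbers; record each subtree's largest number."""
        ctx.num = counter
        counter += 1
        for child in ctx.children:
            counter = number_contexts(child, counter)
        ctx.maxnum = counter - 1
        return counter

    def inherits(c1, c2):
        """True iff c1 is c2 or a descendant of c2, i.e. c1 sees c2's
        assertions. A constant-time interval test, linear total space."""
        return c2.num <= c1.num <= c2.maxnum

    class Triple:
        def __init__(self, s, p, o):
            self.spo = (s, p, o)
            self.contexts = []   # one shared membership list, largest num first

        def assert_in(self, ctx):
            if ctx not in self.contexts:
                self.contexts.append(ctx)
                self.contexts.sort(key=lambda c: c.num, reverse=True)

        def true_in(self, ctx):
            """Holds in ctx if asserted there or in any context that ctx
            inherits. Linear in the membership list, not constant time."""
            return any(inherits(ctx, c) for c in self.contexts)

    # Example: C2 inherits C1, so an assertion in C1 is visible in C2.
    root = Context("root")
    c1 = Context("C1", parent=root)
    c2 = Context("C2", parent=c1)
    number_contexts(root)

    t = Triple("S", "P", "O")
    t.assert_in(c1)
    assert t.true_in(c2) and not t.true_in(root)

Note that true_in() above is linear in the membership list; to get your
constant-time truth lookup you would need the extra index I speculated
about, and materialising that per (triple, context) pair is exactly the
NxM quad blow-up.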
Received on Wednesday, 15 January 2003 12:47:57 UTC