- From: Philip Jägenstedt <philipj@opera.com>
- Date: Fri, 10 Jun 2011 09:07:43 +0200
On Thu, 09 Jun 2011 19:01:24 +0200, Ian Hickson <ian at hixie.ch> wrote: > On Thu, 9 Jun 2011, Philip J?genstedt wrote: >> >> As for the spec, I don't think it can or needs to define the algorithm >> on a >> form suitable form implementation. Something along these lines would be >> much >> clearer for reference: >> >> 1. create a (possibly disconnected) graph of all the items in the >> document (or >> subtree) >> 2. find the strongly connected components >> 3. create a list of "loopy" items: those that are in the same component >> as any >> other item >> >> The traversal would remain mostly as before, but whenever an item is >> encountered, one checks if it is in the list of "loopy" items and if so >> ignores it. Since "loopy" is a global property, you'll see the same >> properties >> regardless of the path taken to reach it, which may or may not be the >> case >> with the current spec. (In any case, it's a nice feature.) > > The main reason I didn't do something like this with the current spec was > that I was trying to minimise the work needed when implementing the API > in > a dynamic situation. The above would imply that any time anything in the > document changed in a way that could affect microdata, you'd have to redo > the whole document before the next time the API was invoked. That seems > expensive. (Consider the WHATWG spec, which has microdata in it and is > about 5MB. Do you really want to crawl the whole document looking for > microdata each time the API is invoked?) > > What I had tried to do when implementing the spec is start at whatever > point in the DOM the API call was related to, and then search for loops > from that point, and drop anything loopy. That's still expensive, but it > at least minimises the total amount of work. > > Does that make sense? > > If the expense isn't a big deal then I don't mind doing it the other way > too, but this seems like an API that we're going to have enough trouble > getting implemented in the first place without giving implementors a > reason to avoid doing it at all. I don't think the spec needs to be giving suggestions for efficient implementation for live collections, because we inevitable won't implement exactly that algorithm anyway. It is a problem if the algorithm doesn't clearly map to some simpler higher-level behavior, as we certainly don't want to emulate some edge-cases just to follow the letter of the spec. But, let's disregard the exact algorithm for a moment and only consider the actual requirement we're suggesting: "any item which can reach itself by following itemrefs should be removed" It seems to me that it's quite possible to check this criteria while traversing using an algorithm of similar structure to what is already in the spec. One issue is that one must first find all the loopy items and then remove them in one step, rather than interleaving the checking/removing. However, it seems that this could be solved by simply flagging them instead of actually removing them, so that they will still take part in later loop-checks. Am I missing something significant about the spec'd algorithm that would make it more efficient than the above? If we just go ahead and show an efficient (enough) implementation of loop removal using the suggested criteria, I assume that would be convincing enough? Or do you really think that itemref is useless and complicated enough that it would be better to throw it overboard? -- Philip J?genstedt Core Developer Opera Software
Received on Friday, 10 June 2011 00:07:43 UTC