RE: adding addressing capabilities to the DOM from keshlam@us.ibm.com on 2000-04-21 (www-dom@w3.org from April to June 2000)

From: <keshlam@us.ibm.com>
Date: Fri, 21 Apr 2000 18:29:15 -0400
To: www-dom@w3.org, "Xml-Dev@Xml. Org" <xml-dev@xml.org>
Message-ID: <852568C8.007B8030.00@D51MTA03.pok.ibm.com>
> It seems to me that Joe is speaking more to the case where the DOM is
> completely in memory.

I admit I'm probably slightly biased in that direction, but I think I'm
keeping the other models in mind. The DOM is only an API; what's behind it
is up to the individual implementation.

> As someone who has implemented a DOM on top of a
> database, I have to agree with Francis here and say that providing a way
to
> delegate responsibility for evaluating XPaths (or other types of queries
in
> the future) into the implementation is vital.

I'm not disagreeing, actually. For performance reasons, it's generally
desirable to push query-type operations as close to the raw data as
possible. But the problem is probably not as simple as it appears at first
glance.

* If you're concerned about performance, you probably want to be able to
store "compiled" forms of your queries. Something could be done with
caching, but something more intelligent could be done under management of
higher-level code.

* What's the basic nature of queries -- meaning simple things like "are the
results presented in document order" -- and what does that imply for how
they mesh with the currently available return types in the DOM?

* Does the result have to be "live", as the existing filtered DOM views are
-- ie, if I relocate or remove a node in my document, does the query result
update itself or do I have to repeat the query to get the new status?

There may be others. I'm not sure that saying "only XPath is supported"
really simplify the query API significantly; seems to me that at best it
allows some of the questions to be swept under the carpet, very
temporarily... and leaves us with a call which is at best a convenience
routine, and at worst may have not-completely-compatable behavior with
whatever the final answer turns out to be.



Note that a stopgap might be doable using existing DOM APIs. It might be
possible to create a Filter class for XPaths, and then special-case the
createNodeIterator/createTreeWalker to recognize that specific filter and
instantiate Traversal objects which perform the XPath query at a lower
level. In fact, you could have a whole subclass of Filters which operate by
translating their queries into database-compatable form, and Traversal
objects which recognize them, issue those queries, walk their results, and
(if necessary) reissue the queries or otherwise resynchronize when the
document is mutated.

One cute possibility about this approach is that the same filter-factory
interface could also produce "normal" filters for evaluating these queries
in DOMs that don't have the internal query engine, so they would still
produce reasonable traveral results (albeit more slowly).

This is one of the directions the Query Editorial Team was investigating,
before it decided that queries weren't well enough understood yet and just
getting a good definition of traversal was going to be difficult enough.
I'm not sure whether there was any consensus on this being a _recommended_
direction, but it's one of the approaches to implementing traversal filters
that we kept in mind. In fact, one of the reasons you can't change filters
in mid-traversal is to enable this sort of architecture.



I do think that having a Query API would be a Good Thing. But I worry that
there are enough design issues here that an off-the-cuff design is unlikely
to make anyone very happy for very long.

______________________________________
Joe Kesselman  / IBM Research
Received on Friday, 21 April 2000 18:29:33 UTC