Re: Pondering RDF Path from Damian Steer on 2003-09-05 (www-rdf-rules@w3.org from September 2003)

From: Damian Steer <pldms@mac.com>
Date: Fri, 05 Sep 2003 14:33:18 +0100
To: Graham Klyne <gk@ninebynine.org>
Cc: "Sean B. Palmer" <sean@mysterylights.com>, Libby Miller <Libby.Miller@bristol.ac.uk>, www-rdf-rules <www-rdf-rules@w3.org>
Message-ID: <m2he3rtl5t.fsf@evila.danbri.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Graham Klyne <gk@ninebynine.org> writes:

> FWIW, one of my *perceived* advantages for path (and tree) style
> queries is that for many practical applications they seem to generate
> a fairly efficient triple query sequence.  Which is what I read into
> your "pruning away branches" statement.  But I don't have better than
> anecdotal evidence to back that up, and I'm pretty sure that could
> break down in pathological cases.
>
> (The efficiency consideration here is due to the order in which one
> does triple-matching:  with a simple, flat triple-based query, the
> order of triple matching isn't specifically defined, though I expect
> one would typically use the order in which they are presented.  Using
> a "sensible" ordering means that the number of possible
> variable-bindings considered is reduced.  My intuition is that a
> path-based approach naturally yields such an ordering for much
> "real-world" data, particularly where it is approximately
> tree-structured.)

That's a good point. However, there's nothing stopping you optimising
the triple match. For example, you might sort by number of variables,
e.g. (rdf:type http://example.com foaf:document) before
(dc:creator http://example.com ?creator) before (foaf:knows ?creator
?friend), then keep triples that share variables together, so you can
check quickly if a binding fails, etc. That's relatively trivial.
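
The ordering heuristic described above can be sketched roughly as
follows. This is only an illustration, not code from any existing
Squish implementation: the tuple layout (predicate, subject, object)
follows the notation in this message, and the names is_var,
variable_count, order_patterns and query are hypothetical.

```python
# Assumption: a pattern is a (predicate, subject, object) tuple of strings,
# and any term starting with "?" is a variable.

def is_var(term):
    return term.startswith("?")

def variable_count(pattern):
    return sum(1 for term in pattern if is_var(term))

def order_patterns(patterns):
    """Order patterns so the most constrained ones are matched first.

    Start from fewest variables, then greedily pick the pattern with the
    fewest still-unbound variables, preferring one that shares a variable
    with the patterns already chosen.
    """
    remaining = sorted(patterns, key=variable_count)
    ordered, bound = [], set()

    def cost(pattern):
        variables = {t for t in pattern if is_var(t)}
        unbound = len(variables - bound)
        shares_bound = 0 if variables & bound else 1
        return (unbound, shares_bound)

    while remaining:
        best = min(remaining, key=cost)
        remaining.remove(best)
        ordered.append(best)
        bound |= {t for t in best if is_var(t)}
    return ordered

# The three patterns from the example, deliberately given in a poor order.
query = [
    ("foaf:knows", "?creator", "?friend"),
    ("dc:creator", "http://example.com", "?creator"),
    ("rdf:type", "http://example.com", "foaf:document"),
]
ordered_query = order_patterns(query)
```

With this input the fully ground rdf:type pattern comes first, then
dc:creator (which binds ?creator), then foaf:knows, so a failed
binding is detected before the widest pattern is tried.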

There's also an issue with optional triples, which may make the match
order-specific (but I think sanity checks may stop this). Something I
need to write up.

> [later]
>
> I've just noted Damian's later posting, which I think confirms
> something I suspected.  I think there are two related but different
> notions of RDF path here:  I (and also Sean, I think) are
> contemplating an expression that operates directly on the RDF graph
> structure to select graph content, where I understand Damian's work to
> be an exposure of RDF as if it were coded in some canonical XML format
> such that an XPath expression can be used to select graph content
> based on that format.  Am I wrong?

That's correct. However, the mapping to Squish (I hope) shows that
the path could work directly on the RDF graph. It was easier, though,
to implement as a dynamically created DOM tree. (NB the tree is
created on the fly, so it's essentially stepping through the graph.)
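
To illustrate what "stepping through the graph" might look like, here
is a minimal sketch that follows a path of property names over a list
of triples, computing each node's children only when they are asked
for. It is not the DOM-based implementation discussed in this thread;
the (subject, predicate, object) layout, the blank node labels and the
names property_children, follow, graph and friends are all assumptions
made for the example.

```python
def property_children(graph, subject):
    """Yield (predicate, object) pairs for subject, computed on demand."""
    for s, p, o in graph:
        if s == subject:
            yield p, o

def follow(graph, start, path):
    """Follow a list of predicate names from start; return the nodes reached."""
    current = [start]
    for name in path:
        current = [
            o
            for node in current
            for p, o in property_children(graph, node)
            if p == name
        ]
    return current

# Hypothetical data; note the foaf:knows cycle between _:a and _:b.
graph = [
    ("http://example.com", "rdf:type", "foaf:document"),
    ("http://example.com", "dc:creator", "_:a"),
    ("_:a", "foaf:knows", "_:b"),
    ("_:b", "foaf:knows", "_:a"),
]

# Roughly the path dc:creator/foaf:knows, starting at the document.
friends = follow(graph, "http://example.com", ["dc:creator", "foaf:knows"])
```

Because children are only produced per step, the cycle in the data
never has to be unfolded into a full tree; a fully materialised tree
over the same graph would be unbounded.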

There are places where my approach breaks down: it can't currently
spoof unstriped syntax (parseType="Resource" and <property
anotherProperty="foo">), and there are (maybe) issues with the
descendant axis, as I mentioned. But I'm pleasantly surprised that the
spoofing makes sense to me as an RDF person who has written virtually
no RDF/XML :-)

Damian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (Darwin)
Comment: Processed by Mailcrypt 3.5.6 and Gnu Privacy Guard <http://www.gnupg.org/>

iD8DBQE/WJCdAyLCB+mTtykRArQgAJ4ga/uk1XW0hAxQzW3VmHL6e+b4bACeMNi7
WmWHdK7dGRryGbdI/f5pNIo=
=0uoV
-----END PGP SIGNATURE-----

Received on Friday, 5 September 2003 09:33:39 UTC