Re: NoSQL and triple stores from Dan Brickley on 2010-02-19 (semantic-web@w3.org from February 2010)

From: Dan Brickley <danbri@danbri.org>
Date: Fri, 19 Feb 2010 09:26:57 +0100
To: Sandro Hawke <sandro@w3.org>
Cc: semantic-web@w3.org
Message-ID: <eb19f3361002190026g59a6b0a6h915b203776a8f563@mail.gmail.com>

Hi Sandro,

On Fri, Feb 19, 2010 at 5:04 AM, Sandro Hawke <sandro@w3.org> wrote:
>
> It looks to me like the NoSQL [1] and RDF communities could gain a lot
> from some collaboration, but I'm no expert on NoSQL or its community.
> I have been curious about it for a while, and there's a meeting in Boston
> in a few weeks [2], so now is the time to get up to speed.  Has anyone
> been working on bridging here?  Like, used Cassandra as an RDF store, or
> something?  Failing that, does anyone else think this is important?

This is interesting. I think you'll find all kinds of people and
attitudes under the NoSQL banner. Some will see us with our 'SELECT
blahblah WHERE' query language, and our result-set binding tables, as
just 'Mo(re)SQL'; others will hopefully see the scruffy pragmatism
behind RDF's formal-looking exterior, and explore RDF as a way of
bridging different data systems. I think it'd be great if you and
others in Boston could attend that meeting [2].

If you do go, I think one thing to try to communicate is the range of
options RDF gives you: we have numerous stores implemented on top of
plain unaltered SQL, stores that are a hybrid of SQL and SPARQL
(Virtuoso), various things built on top of Berkeley DB, Lucene (e.g.
https://dev.deri.ie/confluence/display/SIREn/SIREn+Presentation),
Prolog (http://www.swi-prolog.org/pldoc/package/semweb.html); pure
in-memory systems; systems with/without fulltext search, geo
additions, inference/logic; projects to encapsulate normal SQL tables
behind an RDF view (e.g. http://www4.wiwiss.fu-berlin.de/bizer/d2rq/
and the new Working Group at W3C
http://www.w3.org/2009/08/rdb2rdf-charter.html ); efforts targeting
massive scalability, e.g. http://blog.larkc.eu/?p=1761, or JavaScript
life-in-the-browser
http://www.arielworks.net/works/codeyard/hercules/demo/index.html ...
and the thing that ties it all together is a data model a kid can
understand (http://www.flickr.com/photos/ldodds/2381025770/) and the
use of Web identifiers to join and link.
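
To make the "join and link" point concrete, here is a minimal sketch
in plain Python, with no RDF library involved. The two stores and the
example.org URIs are invented for illustration; the two FOAF property
URIs are real. Each store is just a set of (subject, predicate,
object) tuples, and merging them is set union, because both sides use
the same Web identifier for the same person.

```python
# Triples are (subject, predicate, object) tuples.
# The example.org identifiers are hypothetical; the FOAF properties are real.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
ALICE = "http://example.org/people#alice"
BOB = "http://example.org/people#bob"

# Two independent systems, each holding part of the picture.
store_a = {(ALICE, FOAF_NAME, "Alice")}
store_b = {(ALICE, FOAF_KNOWS, BOB), (BOB, FOAF_NAME, "Bob")}

# Merging needs no schema mapping: shared URIs do the joining.
merged = store_a | store_b


def names_of_friends(graph, person):
    """Follow foaf:knows from person, then read each friend's foaf:name."""
    friends = {o for s, p, o in graph if s == person and p == FOAF_KNOWS}
    return sorted(o for s, p, o in graph if s in friends and p == FOAF_NAME)
```

Calling names_of_friends(merged, ALICE) walks across data that came
from both stores, which neither store could answer on its own.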

I think it's also worth emphasising that SPARQL, while important, is
not the only or last word in accessing RDF data, and that this
layering of concerns is by design; we enjoy and encourage
experimentation since the shared data model allows information to
flow freely between systems. So, for example, the Sindice/SIREn stuff
might emphasise scaling and offer property-based identity reasoning
(i.e. inverse-functional lookups) rather than full SPARQL; or the
Gremlin/Neo4J Linked Data stuff at
http://wiki.github.com/tinkerpop/gremlin/linkeddata-sail is an example
of a system that can eat RDF data but offers a query system based
around path navigation. RDF is well specified and clean enough that
you can always export from any of these systems in triples, load them
into another, and explore a different set of tools and tradeoffs. Only
innocents would expect the same of SQL, sadly.
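
As a rough illustration of that export-and-reload round trip, the
sketch below dumps a set of triples to a line-based text format and
reads it back. It is modelled on N-Triples but covers only a small
subset (URIs and plain string literals; no blank nodes, language tags
or datatypes), and the function names and term encoding are my own
assumptions, not any particular store's API.

```python
import re

# A term is ("uri", value) or ("lit", value); a triple is three terms.


def to_ntriples(triples):
    """Serialise triples, one per line, in an N-Triples-like subset."""
    def term(t):
        kind, value = t
        if kind == "uri":
            return "<%s>" % value
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        return '"%s"' % escaped
    return "".join("%s %s %s .\n" % (term(s), term(p), term(o))
                   for s, p, o in sorted(triples))


LINE = re.compile(
    r'^<([^>]*)> <([^>]*)> (?:<([^>]*)>|"((?:[^"\\]|\\.)*)") \.$')


def unescape(text):
    """Undo the backslash escapes produced by to_ntriples."""
    return re.sub(r"\\(.)",
                  lambda m: "\n" if m.group(1) == "n" else m.group(1),
                  text)


def from_ntriples(text):
    """Parse the subset written by to_ntriples back into a set of triples."""
    triples = set()
    for line in text.split("\n"):
        if not line.strip():
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError("unsupported line: %r" % line)
        s, p, o_uri, o_lit = match.groups()
        obj = ("uri", o_uri) if o_uri is not None else ("lit", unescape(o_lit))
        triples.add((("uri", s), ("uri", p), obj))
    return triples
```

The point is only that the interchange format is small enough to
write by hand; a real system would use a full N-Triples or Turtle
parser.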

Also related is http://www.w3.org/TR/IndexedDB/ (and
http://dev.w3.org/html5/webstorage/ ) which seems to be a 'databases
in the browser' spec for those who don't need or want full SQL in the
browser. RDF should be of interest to the NoSQL scene as a standard
that could be used to make lots of otherwise-incompatible non-SQL
systems talk to each other, without necessarily imposing a common
querying or data access interface on them all. SPARQL will also be
interesting to some, particularly those whose disillusionment with SQL
is around schema evolution and extensibility rather than scalability.
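
A minimal sketch of what that could look like, assuming flat records
and an invented example.org vocabulary (both are illustrative, not
taken from any real NoSQL product): each system keeps its own access
interface and only agrees on how a record turns into triples.

```python
# Hypothetical vocabulary namespace and subject URI, for illustration only.
VOCAB = "http://example.org/vocab#"
ALICE = "http://example.org/people#alice"


def record_to_triples(subject_uri, record, vocab=VOCAB):
    """Flatten one flat record (scalars or lists of scalars) into triples."""
    triples = set()
    for key, value in record.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.add((subject_uri, vocab + key, item))
    return triples


# A record as a document store might hold it.
document = {"name": "Alice", "tags": ["rdf", "nosql"]}

# A row as a wide-column store might hold it, including a column
# the document store has never heard of.
row = {"name": "Alice", "homepage": "http://example.org/~alice"}

# Both exports land in one graph; the new property needs no schema change.
merged = record_to_triples(ALICE, document) | record_to_triples(ALICE, row)
```

The shared "name" value collapses into a single triple, and the extra
"homepage" column simply shows up as one more property, which is the
schema-evolution story in miniature.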

Re SQL in the browser, we should also be aware of
http://dev.w3.org/html5/webdatabase/ - related to the tug-of-love
W3C/WHATWG HTML story:

"This specification has reached an impasse: all interested
implementors have used the same SQL backend (Sqlite), but we need
multiple independent implementations to proceed along a
standardisation path. Until another implementor is interested in
implementing this spec, the description of the SQL dialect has been
left as simply a reference to Sqlite, which isn't acceptable for a
standard. Should you be an implementor interested in implementing an
independent SQL backend, please contact the editor so that he can
write a specification for the dialect, thus allowing this
specification to move forward."

cheers,

Dan

>     -- Sandro
>
> [1] your web searching is likely to be as good as mine :-)
> [2] http://nosqlboston.eventbrite.com/
Received on Friday, 19 February 2010 08:27:30 UTC