RE: some more RDF thoughts from W.M. Jaworski on 2001-06-22 (www-rdf-dspace@w3.org from June 2001)

From: W.M. Jaworski <wmj@gen-strategies.com>
Date: Fri, 22 Jun 2001 13:05:15 -0400
To: "www-rdf-dspace" <www-rdf-dspace@w3.org>, <dspace-code@MIT.EDU>, "Peter Breton" <pbreton@MIT.EDU>
Message-ID: <NFBBJIDFKDEGAKLIOEDOKEJLCBAA.wmj@gen-strategies.com>

[Peter Breton]
The beauty of the triple store mechanism is that you can accommodate ALL
the data in it.

[wmj]
Other store mechanisms (for example, relational) also "accommodate ALL
the data in it."

[Peter Breton]
the roundtrips from client/middleware to RDBMS are likely to be some of the
most expensive operations

[wmj]
If the 'client/middleware' is not RDF, then is it another notation? Is RDF
notation a technology looking for applications?

[Peter Breton]
3) Another road: since queries on an unbounded triple store ....

[wmj]
It seems that the problem is solved by the "Associative Model of Data"™:
http://www.lazysoftware.com

BTW, I am not an opponent of RDF; I am an outsider looking for insights and
knowledge.

Respectfully,

WMJ


-----Original Message-----
From: www-rdf-dspace-request@w3.org
[mailto:www-rdf-dspace-request@w3.org]On Behalf Of Peter Breton
Sent: Friday, June 22, 2001 10:24 AM
To: dspace-code@MIT.EDU; www-rdf-dspace
Subject: some more RDF thoughts


1) On storage:

The beauty of the triple store mechanism is that you can accommodate ALL
the data in it. You don't need separate mechanisms to store schema
information, taxonomies, and the like: it's all triples, all the way
down. (apologies to those who find this excruciatingly obvious!).
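
A minimal sketch of the "all triples" idea, assuming a toy in-memory store.
The ex: names, the sample values, and the match() helper are hypothetical
and only illustrate that schema statements and instance data can sit in the
same structure and be queried the same way:

```python
# Hypothetical in-memory triple store: a set of (subject, predicate, object).
triples = {
    # Schema information, stored as ordinary triples.
    ("ex:Thesis", "rdfs:subClassOf", "ex:Document"),
    ("ex:author", "rdfs:domain", "ex:Document"),
    # Instance data, stored in exactly the same form.
    ("ex:item1", "rdf:type", "ex:Thesis"),
    ("ex:item1", "ex:author", "A. Author"),
}


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# The same query function answers a schema question and a data question.
schema_rows = match(triples, p="rdfs:subClassOf")
data_rows = match(triples, s="ex:item1")
```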

2) On scalability:

It's difficult for me to see how SQL queries on an unbounded generic
triple store will _ever_ scale.

Scalability in an RDBMS is generally achieved precisely by non-generic
methods: pulling data into well-known columns which can be indexed.
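
To make the contrast concrete, here is a rough sketch using Python's
built-in sqlite3 module. The table names, column names, and sample rows are
all assumptions made for illustration. The generic triple table needs one
self-join per extra property, while the non-generic table answers the same
question from a single indexed column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Generic layout: every fact is a row in one three-column table.
    CREATE TABLE triples (s TEXT, p TEXT, o TEXT);
    -- Non-generic layout: well-known columns that can be indexed.
    CREATE TABLE items (id TEXT PRIMARY KEY, title TEXT, author TEXT);
    CREATE INDEX items_author ON items (author);
""")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("item1", "title", "Thesis A"), ("item1", "author", "smith"),
    ("item2", "title", "Report B"), ("item2", "author", "jones"),
])
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", [
    ("item1", "Thesis A", "smith"),
    ("item2", "Report B", "jones"),
])

# Generic store: "titles by author" joins the triple table to itself.
generic = conn.execute("""
    SELECT t.o
    FROM triples AS a
    JOIN triples AS t ON t.s = a.s
    WHERE a.p = 'author' AND a.o = ? AND t.p = 'title'
""", ("smith",)).fetchall()

# Non-generic store: the same question is one lookup on an indexed column.
specific = conn.execute(
    "SELECT title FROM items WHERE author = ?", ("smith",)
).fetchall()
```

Both queries return the same rows; the difference is in how much work the
database has to do as the triple table grows.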

There are some tricks that might help, however:

* Perhaps use graph-oriented indexes on databases which support them?
* Somehow offload the RDF processing to the database, since the
  roundtrips from client/middleware to RDBMS are likely to be some of the
  most expensive operations.

I think Postgres could be hacked to do one or both of these.

3) Another road: since queries on an unbounded triple store may always
be problematic, separate the triple stores. One possibility running
through my head is a "double triple store". The first triple store
serves as a cache for the second one (and there could also be an
in-memory cache, like the Collego folks have). The first triple store
could include (as triples, natch!) a description of all the info it has.
"Efficiency" (really, efficient queries) could thus be achieved by
simply migrating or copying data to one or more triple store caches. And
since the mapping data is tiny, the overhead of figuring out which cache
to query should be minimal (and could be done in-memory).

I say "cache" above, but it could also be migration.
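
One possible reading of the "double triple store", sketched in Python under
several assumptions: both stores are in-memory sets, data is copied per
predicate, and the cache describes its own contents with a made-up
ex:coversPredicate triple. The class and method names are hypothetical:

```python
COVERS = "ex:coversPredicate"  # hypothetical predicate for the mapping data


class TwoTierStore:
    """A front (cache) triple store sitting in front of a backing store."""

    def __init__(self, backing):
        self.backing = set(backing)  # the large, slow store
        self.cache = set()           # the small, fast store

    def promote(self, predicate):
        """Copy all triples with this predicate into the cache and record,
        as a triple in the cache itself, that the cache now covers it."""
        self.cache |= {t for t in self.backing if t[1] == predicate}
        self.cache.add(("ex:cache", COVERS, predicate))

    def covers(self, predicate):
        """Check the cache's self-description: a cheap membership test."""
        return ("ex:cache", COVERS, predicate) in self.cache

    def query(self, predicate):
        """Return (source, triples), preferring the cache when it covers
        the requested predicate."""
        if self.covers(predicate):
            source, store = "cache", self.cache
        else:
            source, store = "backing", self.backing
        return source, sorted(t for t in store if t[1] == predicate)


store = TwoTierStore({
    ("ex:item1", "ex:author", "smith"),
    ("ex:item2", "ex:author", "jones"),
    ("ex:item1", "ex:title", "Thesis A"),
})
before = store.query("ex:author")   # served by the backing store
store.promote("ex:author")
after = store.query("ex:author")    # same triples, now served by the cache
```

For migration rather than caching, promote() would also remove the copied
triples from the backing store.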

Peter

Received on Friday, 22 June 2001 12:53:27 UTC