- From: Jan Algermissen <jalgermissen@topicmapping.com>
- Date: Wed, 14 Sep 2005 08:13:09 +0200
- To: Mailing Lists <list@thirdstation.com>
- Cc: semantic-web@w3.org
Hi Mark,

On Sep 13, 2005, at 10:46 PM, Mailing Lists wrote:

> Hi all,
>
> Does anyone on the list have some real-world stories to share about
> using RDF and its tools as a backend technology? The company I
> work for maintains a database of metadata. I'd like to explore
> using RDF instead of our current schemas.

I use redland+MySQL as the backend for a mid-size configuration management database. It works well (though I am still missing transactions) and is supposed to scale. I also store literals the size of whole HTML pages in the database without problems.

> For example: I have a lot of data about books. I'd like to
> translate the data into RDF/XML and dump it into an RDF database.
> Then, taking a particular book, I'd like to query the database to
> extract related information like: other books by the same author,
> other books with the same subject code, etc.

All that should work fine, as long as you do not search the literals by anything other than exact string match. Any other kind of search will usually result in a full scan of the literals table, giving you bad performance for reasonably large data sets.

> My concerns relate to:
> 1) Performance -- Right now we query the database using SQL.
> Sometimes it is _very_ slow. That's mainly because the data is
> distributed across tables and there are a lot of joins going on.
> It seems like using RDF would allow us to use simple queries.

That would be an interesting case for RDF. Can you expand on that a bit?

> 2) Scalability -- Our triplestore would be HUGE. I'd estimate
> 10-20 Million triples. Is that small or large in RDF circles?

redland+MySQL is said to scale to millions of triples.

> 3) Productivity -- It's usually easier for me to envision creating
> RDF from our source data than massaging the data to fit into our
> database schema. The same goes for when I'm extracting data - it
> seems like it would be much easier to express my query as a triple
> using wildcards for the data I want.
>
> Any information will be helpful. I'm interested in learning from
> other peoples' experiences.

IMHO, the issue of search in RDF databases is not so critical, because if you consistently apply Web technologies to enterprise IT, you very likely end up in a situation where you have multiple triple databases anyhow. Instead of a distributed search over all the databases, the Web-style solution would be a central crawling+search service (a search engine). Search engines are optimized for the kinds of queries that are poorly served by RDF triple stores. IOW, the whole issue of performance suddenly disappears.

HTH,

Jan

> Thanks,
> Mark
>
> ..oO Mark Donoghue
> ..oO e: mark@ThirdStation.com
> ..oO doi: http://dx.doi.org/10.1570/m.donoghue

________________________________________________________________________
Jan Algermissen, Consultant & Programmer
http://jalgermissen.com
Tugboat Consulting, 'Applying Web technology to enterprise IT'
http://www.tugboat.de
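To make the two query styles discussed in this thread concrete, here is a minimal Python sketch. It is not Redland's API and not how redland+MySQL stores data; the class `TinyTripleStore`, the helper `related_by`, and the sample book identifiers and property names are all hypothetical, chosen only for illustration. It shows a triple pattern with wildcards, an exact-match lookup on a literal served by an index, and a substring search that has to scan every literal.

```python
from collections import defaultdict


class TinyTripleStore:
    """Toy in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self):
        self.triples = []
        # Index from (predicate, object) to subjects: serves exact-match lookups.
        self.by_po = defaultdict(list)

    def add(self, s, p, o):
        self.triples.append((s, p, o))
        self.by_po[(p, o)].append(s)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

    def subjects_exact(self, p, o):
        """Exact string match on the object: a single index lookup."""
        return list(self.by_po.get((p, o), []))

    def subjects_containing(self, p, fragment):
        """Substring match on the object: has to scan every triple."""
        return [s for (s, pred, o) in self.triples
                if pred == p and fragment in o]


def related_by(store, book, predicate):
    """Other subjects sharing a value of `predicate` with `book`."""
    related = []
    for (_, _, value) in store.match(s=book, p=predicate):
        for s in store.subjects_exact(predicate, value):
            if s != book and s not in related:
                related.append(s)
    return related


# Made-up sample data.
store = TinyTripleStore()
store.add("book:1", "dc:creator", "Jane Doe")
store.add("book:2", "dc:creator", "Jane Doe")
store.add("book:3", "dc:creator", "John Roe")
store.add("book:1", "dc:subject", "QA76")

print(related_by(store, "book:1", "dc:creator"))
print(store.subjects_containing("dc:creator", "Doe"))
```

In this sketch, "other books by the same author" only ever touches the index, while the substring search walks the whole list. That is the same asymmetry as the full scan of the literals table described above, and the kind of query a separate search engine would be better suited to answer.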
Received on Wednesday, 14 September 2005 06:13:23 UTC