- From: Richard Newman <holygoat@gmail.com>
- Date: Tue, 13 Sep 2005 23:31:17 -0700
- To: Mailing Lists <list@thirdstation.com>
- Cc: semantic-web@w3.org
> Does anyone on the list have some real-world stories to share about
> using RDF and its tools as a backend technology? The company I
> work for maintains a database of metadata. I'd like to explore
> using RDF instead of our current schemas.

I do, but I'm not sure if I can talk about them, though :D

I will say I'm having great success.

> For example: I have a lot of data about books. I'd like to
> translate the data into RDF/XML and dump it into an RDF database.
> Then, taking a particular book, I'd like to query the database to
> extract related information like: other books by the same author,
> other books with the same subject code, etc.
>
> My concerns relate to:
> 1) Performance -- Right now we query the database using SQL.
> Sometimes it is _very_ slow. That's mainly because the data is
> distributed across tables and there are a lot of joins going on.
> It seems like using RDF would allow us to use simple queries.

Possibly... the queries might be simple, but that doesn't necessarily mean they'll be performant.

> 2) Scalability -- Our triplestore would be HUGE. I'd estimate
> 10-20 Million triples. Is that small or large in RDF circles?

That's fairly large. Leigh Dodds will have something to say on this topic, having used very large sets of triples.

The largest I've considered doing in my current work has been around 70 million records (IIRC), each expanding to probably 20-30 triples. I'm yet to actually _try_ it, though -- especially with an in-memory store :) (A reason to do so is that it might be the biggest ever!)

> 3) Productivity -- It's usually easier for me to envision creating
> RDF from our source data than massaging the data to fit into our
> database schema. The same goes for when I'm extracting data - it
> seems like it would be much easier to express my query as a triple
> using wildcards for the data I want.

The development cycle is vastly more flexible -- I'm able to do things with RDF that would take a lot of effort using a RDB.

Likewise for queries -- I use Wilbur's path expressions extensively, and I have SPARQL and standard triple-pattern queries to fall back on. Having done a lot of development, I would now find it hard to ever use a relational database again -- for flexibility of development, RDF wins, and for modelling the domain, extending, integrating, and complex queries it's also hands-down the best choice.

-R
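
As a rough illustration of the "triple using wildcards" idea discussed above, here is a minimal sketch in plain Python. It is not Wilbur, SPARQL, or any particular triple store: the `match` and `related_books` helpers, the `book:`/`author:`/`subject:` identifiers, and the sample data are all hypothetical, and a real store would use indexes rather than a linear scan.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical sample data: (subject, predicate, object) triples about books.
TRIPLES: List[Triple] = [
    ("book:1", "dc:creator", "author:a"),
    ("book:2", "dc:creator", "author:a"),
    ("book:3", "dc:creator", "author:b"),
    ("book:1", "dc:subject", "subject:FIC"),
    ("book:3", "dc:subject", "subject:FIC"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple matching the pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (
            o is None or t[2] == o
        ):
            yield t


def related_books(triples: List[Triple], book: str, predicate: str) -> List[str]:
    """Find other books sharing a value with `book` for the given predicate."""
    # Step 1: which values does this book have for the predicate?
    values = {o for _, _, o in match(triples, s=book, p=predicate)}
    # Step 2: which other subjects have the same predicate and value?
    related = set()
    for value in values:
        for s, _, _ in match(triples, p=predicate, o=value):
            if s != book:
                related.add(s)
    return sorted(related)


if __name__ == "__main__":
    print(related_books(TRIPLES, "book:1", "dc:creator"))  # same author
    print(related_books(TRIPLES, "book:1", "dc:subject"))  # same subject code
```

Each "related books" lookup is two pattern matches chained together, which is the kind of join a SQL schema would spread across tables. Whether this performs well at 10-20 million triples depends entirely on the store's indexing, as noted above.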
Received on Wednesday, 14 September 2005 06:31:36 UTC