- From: Chris Wilper <cwilper@cs.cornell.edu>
- Date: Wed, 14 Sep 2005 03:13:12 -0400
- To: "Mailing Lists" <list@thirdstation.com>, <semantic-web@w3.org>
- Message-ID: <772EF7E386FEDF4FA6E11A9DA703A4D24E01D9@EXCHVS1.cs.cornell.edu>
Hi Mark,

I'd suggest you take a good look at Kowari. It really excels at query performance and scalability compared to anything else I've seen in this space. My own testing has been in the 10-20M triple range. I've heard that Kowari can easily handle ten times that, but haven't tested the assertion for myself. Shoot me an email if you're interested and I'd be glad to share some more concrete data.

I think you are right on (if a bit bleeding edge) with your approach of moving metadata into a triplestore. Even without inferencing, there are some really nice advantages to going this route. As more people realize these advantages, I think this space will see an increased focus on achieving the kind of scale that the high-end relational databases have seen for years.

Cheers,
Chris Wilper

-----Original Message-----
From: semantic-web-request@w3.org on behalf of Mailing Lists
Sent: Tue 9/13/2005 4:46 PM
To: semantic-web@w3.org
Subject: RDF tools as workhorse

Hi all,

Does anyone on the list have some real-world stories to share about using RDF and its tools as a backend technology?

The company I work for maintains a database of metadata. I'd like to explore using RDF instead of our current schemas.

For example: I have a lot of data about books. I'd like to translate the data into RDF/XML and dump it into an RDF database. Then, taking a particular book, I'd like to query the database to extract related information like: other books by the same author, other books with the same subject code, etc.

My concerns relate to:

1) Performance -- Right now we query the database using SQL. Sometimes it is _very_ slow. That's mainly because the data is distributed across tables and there are a lot of joins going on. It seems like using RDF would allow us to use simple queries.

2) Scalability -- Our triplestore would be HUGE. I'd estimate 10-20 Million triples. Is that small or large in RDF circles?

3) Productivity -- It's usually easier for me to envision creating RDF from our source data than massaging the data to fit into our database schema. The same goes for when I'm extracting data - it seems like it would be much easier to express my query as a triple using wildcards for the data I want.

Any information will be helpful. I'm interested in learning from other peoples' experiences.

Thanks,
Mark

..oO Mark Donoghue
..oO e: mark@ThirdStation.com
..oO doi: http://dx.doi.org/10.1570/m.donoghue
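
The wildcard-style query described in the original message ("other books by the same author, other books with the same subject code") can be sketched in a few lines of Python. This is only a minimal in-memory illustration of triple-pattern matching, not Kowari's API or query language; the book identifiers, the predicate names, and the `match` and `related` helpers are all hypothetical.

```python
# Hypothetical sample data: (subject, predicate, object) triples.
# Identifiers and predicate names are made up for illustration.
TRIPLES = [
    ("book:1", "dc:creator", "author:A"),
    ("book:1", "dc:subject", "subj:FIC"),
    ("book:2", "dc:creator", "author:A"),
    ("book:2", "dc:subject", "subj:SCI"),
    ("book:3", "dc:creator", "author:B"),
    ("book:3", "dc:subject", "subj:FIC"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def related(triples, book, predicate):
    """Return other subjects that share a value of `predicate` with `book`."""
    # Step 1: (book, predicate, ?) -> the values attached to this book.
    values = {o for _, _, o in match(triples, s=book, p=predicate)}
    # Step 2: (?, predicate, value) -> every subject with one of those values.
    return sorted({
        s for v in values
        for s, _, _ in match(triples, p=predicate, o=v)
        if s != book
    })


same_author = related(TRIPLES, "book:1", "dc:creator")
same_subject = related(TRIPLES, "book:1", "dc:subject")
print(same_author)
print(same_subject)
```

Each lookup is two pattern matches rather than a multi-table join. A real triplestore would answer the same patterns from indexes instead of the linear scan used here, so this sketch says nothing about performance at the 10-20 million triple scale.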
Received on Wednesday, 14 September 2005 07:14:45 UTC