- From: Bijan Parsia <bparsia@cs.man.ac.uk>
- Date: Wed, 23 Jul 2008 21:06:34 +0100
- To: Damian Steer <pldms@mac.com>
- Cc: Olivier Rossel <olivier.rossel@gmail.com>, Semantic Web <semantic-web@w3.org>
Hi Damian.

On 23 Jul 2008, at 20:13, Damian Steer wrote:
[snip]
> For large numbers of triples, in my limited experience, the things
> that affect RDF load speed

Ooo, I got a bit sidetracked by the parsing bit.

> are:
>
> The speed of your disk.
> The size of your memory.
> Building indexes.
> Duplicate suppression (triple, node, whatever).
> BNode handling.
> IRI and datatype checks (if you do them).
> Parsing.
>
> Now parsing is a factor, but it's fairly minor compared with the
> basic business of storing the triples.

Indeed.

> Stores would probably get more benefit from simple processing
> instructions like 'this contains no dupes' and 'my bnode ids are
> globally unique'.

SWI Prolog had, IIRC, a mode to dump its internal structures so you would avoid all that overhead (kinda like an image in Smalltalk or Lisp). Obviously databases do this as well.

Hard to see that a common format would make a *ton* of sense. I guess you could suppress dups, reconcile bnodes, and a few other things. Indexes? I don't think so. That seems entirely proprietary and appropriately so.

Cheers,
Bijan "Binary XML 4 Ever!" Parsia.
Received on Wednesday, 23 July 2008 20:04:17 UTC