- From: Phil Dawes <pdawes@users.sourceforge.net>
- Date: Wed, 20 Oct 2004 10:45:34 +0000
- To: Chris Purcell <cjp39@cam.ac.uk>
- Cc: www-rdf-interest@w3.org
Hi Chris,

Chris Purcell writes:
> 
> How are you inputting the triples in the first place? This is where the
> MySQL limit bit me, and while I did some poking around to speed things
> up, I haven't yet put much time into it.
> [poking around]
> http://www.srcf.ucam.org/~cjp39/Current/KritTer:2004-08-02+WebLog
> 
> Cheers,
> Chris
> 

It depends on what is being input. If it's an insert/update of a small set of assertions, it just uses SQL INSERTs. If it's a large job (e.g. a batch import of millions of statements), it writes them to a file and then uses 'LOAD DATA LOCAL INFILE' to bulk import them.

I had a quick look at your weblog post, and I assume from it that you are bulk importing as well. I attempt to solve the duplicate id problem by pre-loading the existing ids into memory, along with hashes of their values. I can then check each asserted literal/URI value against the hashes to see whether it already exists in the database.

N.B. you need to lock the table for the duration of this, otherwise another writer can add values between the pre-load and the import and you can easily get consistency problems.

Cheers,

Phil
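For illustration, the pre-load-and-check step described above might be sketched roughly as follows. This is a minimal sketch and not the actual implementation: the function names, the choice of SHA-1 as the hash, and the (id, value) row layout are all assumptions made for the example. Database access is left out, so the existing rows are passed in as plain tuples; in practice the table would be write-locked from the pre-load until the bulk load finishes.

```python
import hashlib


def value_hash(value: str) -> bytes:
    # Hypothetical choice of hash; any stable digest of the value would do.
    return hashlib.sha1(value.encode("utf-8")).digest()


def build_index(existing_rows):
    """Map hash-of-value -> id for rows already in the table.

    existing_rows is an iterable of (id, value) pairs, e.g. the result
    of selecting every row while holding a write lock on the table.
    """
    return {value_hash(value): row_id for row_id, value in existing_rows}


def assign_ids(index, values, next_id):
    """Return the id for each value, allocating new ids only when needed.

    new_rows collects the (id, value) pairs that are not yet in the
    database; these are the rows that would be written to the bulk-load
    file.
    """
    ids = []
    new_rows = []
    for value in values:
        key = value_hash(value)
        row_id = index.get(key)
        if row_id is None:
            # Value not seen before: allocate an id and remember it so a
            # repeat within the same batch does not create a duplicate.
            row_id = next_id
            next_id += 1
            index[key] = row_id
            new_rows.append((row_id, value))
        ids.append(row_id)
    return ids, new_rows, next_id


# Usage with made-up data: two values already stored, three asserted.
index = build_index([(1, "http://example.org/a"), (2, "hello")])
ids, new_rows, next_id = assign_ids(
    index, ["hello", "http://example.org/b", "http://example.org/b"], 3
)
```

Keying the in-memory index on a fixed-size digest rather than the full value keeps memory use bounded for long literals, at the cost of treating a hash collision as a match.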
Received on Thursday, 21 October 2004 10:46:58 UTC