
Re: "loading multiple .rdf files into a local virtuoso server"

From: Barry Norton <barry.norton@ontotext.com>
Date: Tue, 12 Mar 2013 22:33:25 +0000
Message-ID: <513FAD35.3080903@ontotext.com>
To: public-lod@w3.org
On 12/03/13 21:49, Kingsley Idehen wrote:
> On 3/12/13 4:53 PM, Barry Norton wrote:
>>
>> Such questions really belong on the Virtuoso list, but don't most 
>> triplestores support the SPARQL Graph Store Protocol by now?
>
> Yes, but when you've got a massive collection of RDF files you still 
> need to bulk load from a local directory etc..

As below.

>
>>
>> Most of my (bash) load scripts look like this:
>>
>> for file in *; do curl -H "Content-Type: text/turtle" -T "$file" \
>> "your-server/your-database/rdf-graphs/service?graph=your-graph"; done
>
> Yes for small files, no for a massive collection of files or a few 
> very large files :-)

I presume your objection is (for small files) the set-up/shutdown 
overhead, and (for large files) that these are not necessarily passed in 
compressed form? Or have you other objections?

I've always meant to look into 'Content-Encoding: gzip', but I've always 
been happy to walk over an uncompressed split of NTriples/NQuads due to 
a combination of: the time of transfer relative to canonicalisation and 
indexing; and the desire to split very large files (e.g. Freebase) to 
localise errors.
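
A minimal sketch of the split-and-upload approach described above, under 
stated assumptions: the endpoint and graph URIs are placeholders, the 
helper names (split_ntriples, upload_chunk, load_all) and the DRY_RUN 
switch are hypothetical, and the server is assumed to accept the 
application/n-triples media type. It uses POST because, in the SPARQL 
Graph Store Protocol, POST adds to a graph whereas PUT replaces it.

```shell
#!/usr/bin/env bash
# Sketch: split one large N-Triples file into line-aligned chunks and
# POST each chunk to a Graph Store Protocol endpoint.
# ENDPOINT, GRAPH and CHUNK_LINES are placeholders, not real values.
ENDPOINT="${ENDPOINT:-http://your-server/your-database/rdf-graphs/service}"
GRAPH="${GRAPH:-http://example.org/your-graph}"
CHUNK_LINES="${CHUNK_LINES:-1000000}"

# split_ntriples FILE OUTDIR
# N-Triples holds one triple per line, so splitting on line boundaries
# leaves every chunk parseable on its own.
split_ntriples() {
  mkdir -p "$2" && split -l "$CHUNK_LINES" "$1" "$2/chunk-"
}

# upload_chunk FILE
# POST appends to the target graph. With DRY_RUN=1 the command is only
# printed. The graph URI is passed as-is (not URL-encoded) for brevity.
upload_chunk() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "curl -X POST --data-binary @$1 $ENDPOINT?graph=$GRAPH"
    return 0
  fi
  curl --fail --silent --show-error -X POST \
       -H "Content-Type: application/n-triples" \
       --data-binary @"$1" \
       "$ENDPOINT?graph=$GRAPH"
}

# load_all FILE OUTDIR
# Uploads every chunk and names the ones that fail, so a bad triple is
# narrowed down to one chunk instead of aborting the whole load.
load_all() {
  split_ntriples "$1" "$2" || return 1
  failed=0
  for chunk in "$2"/chunk-*; do
    upload_chunk "$chunk" || { echo "FAILED: $chunk" >&2; failed=1; }
  done
  return "$failed"
}
```

After sourcing the functions, something like 
`DRY_RUN=1 load_all dump.nt chunks` (dump.nt being a hypothetical file) 
prints the curl commands without contacting any server; dropping DRY_RUN 
performs the uploads. Any chunk reported as FAILED can then be inspected 
or split further.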

Barry



>>
>> On 12/03/13 20:36, Kalpa Gunaratna wrote:
>>> actually I tried that (I used the procedure to load DBpedia dump 3.8 
>>> which was in gz format as I remember.)
>>>
>>> But when I try to load now a dump of DBLP which has .rdf files as 
>>> the dump when I uncompress it, I do not know how to load the files. 
>>> Following is what I get running bulk load procedure.
>>>
>>> SQL> ld_dir(‘/home/kalpa/Virtuoso/data/datasets/DBLP-RKB/models’, 
>>> ‘*.*’, ‘http://dblp-rkb.org’);
>>> Connected to OpenLink Virtuoso
>>> Driver: 06.01.3127 OpenLink Virtuoso ODBC Driver
>>>
>>> *** Error 37000: [Virtuoso Driver][Virtuoso Server]SQ074: Line 1: 
>>> syntax error at '.' before '*'
>>> at line 1 of Top-Level:
>>> ld_dir(‘/home/kalpa/Virtuoso/data/datasets/DBLP-RKB/models’, ‘*.*’, 
>>> ‘http://dblp-rkb.org’)
>>>
>>>
>>>
>>> On Tue, Mar 12, 2013 at 8:27 PM, Francisco Cifuentes 
>>> <francisco.cifuentes@weso.es> wrote:
>>>
>>>     Take a look here:
>>>
>>>     http://www.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtBulkRDFLoader
>>>
>>>     Regards
>>>
>>>     Francisco.
>>>
>>>
>>>     2013/3/12 Kalpa Gunaratna <kalpagunaratna@gmail.com>
>>>
>>>         Hi,
>>>            I have an rdf dump that has data in the form of .rdf
>>>         files. I want to load them into a local Virtuoso server so
>>>         that I can query them using the local sparql endpoint. But I
>>>         see that it is possible to load one RDF/XML file at a time
>>>         using the command "DB.DBA.RDF_LOAD_RDFXML_MT". Since the
>>>         dump has many files, executing this command many times is
>>>         not going to work. What are the other alternatives I have in
>>>         loading them to the server? Thank you in advance for any help!
>>>
>>>         Regards
>>>         Kalpa Gunaratna
>>>
>>>
>>>
>>>
>>>     -- 
>>>     Francisco Cifuentes-Silva
>>>     ------------------------------------
>>>     WESO Research Group
>>>     Facultad de Ciencias
>>>     Universidad de Oviedo
>>>     Tel: +34 985103397
>>>     http://www.weso.es
>>>     http://twitter.com/fcifuentes
>>>
>>>
>>>
>>>
>>> -- 
>>> Regards
>>> Kalpa Gunaratna
>>
>
>
> -- 
>
> Regards,
>
> Kingsley Idehen	
> Founder & CEO
> OpenLink Software
> Company Web:http://www.openlinksw.com
> Personal Weblog:http://www.openlinksw.com/blog/~kidehen
> Twitter/Identi.ca handle: @kidehen
> Google+ Profile:https://plus.google.com/112399767740508618350/about
> LinkedIn Profile:http://www.linkedin.com/in/kidehen
>
>
>
>
Received on Tuesday, 12 March 2013 22:33:58 UTC
