Re: TriG being disjoint from Turtle from Jan Wielemaker on 2013-05-17 (public-rdf-comments@w3.org from May 2013)

From: Jan Wielemaker <J.Wielemaker@vu.nl>
Date: Fri, 17 May 2013 14:09:07 +0200
To: Sandro Hawke <sandro@w3.org>
CC: Andy Seaborne <andy.seaborne@epimorphics.com>, <public-rdf-comments@w3.org>
Message-ID: <51961DE3.6020301@vu.nl>

Hi Sandro,

On 05/17/2013 01:38 PM, Sandro Hawke wrote:
> On 05/17/2013 06:00 AM, Jan Wielemaker wrote:
>> On 05/17/2013 11:49 AM, Andy Seaborne wrote:
>>
>> [this fragment is from Charles Greer, not answered by Andy]
>>
>>> 1.  Could the spec be modified to allow TriG to be a superset of
>>> turtle?  Specifically, could the production rules be modified to allow
>>> a set of triples outside of any '{'  '}' to be the same as triples in a
>>> default anonymous graph?  It seems that even now, the rules allow
>>> multiple anonymous graph productions, whose union would be the unnamed
>>> graph.  It would be convenient if we could dispense with these anonymous
>>> curly braces altogether if possible.
>>
>> Having implemented TriG yesterday on top of the Turtle parser, I must
>> say that I was happily surprised that TriG does not allow for triples
>> outside {}.  This means you can detect whether a document is a Turtle
>> or TriG document at the first triple.
>
> Why do you want to do that?      I'm imagining a world where people load
> data by URL, not necessarily knowing if it's going to have named graphs
> in it.
>
> I'd think in a load_graph operation, you'd accept TriG as well, using
> the default graph as the output graph.   Maybe have a flag about whether
> to ignore or raise on error if there are some named graphs as well.
>
> And in a load_dataset operations, I'd think you'd accept Turtle as well,
> and just not get any named graphs out of it.

I am not yet sure.  Dealing with files whose loading can create or
extend multiple graphs is new in the design of SWI-Prolog's RDF store.
There are two things for which I do not yet have a good answer:
implementing `unloading' the data and dealing with the persistent
backup.

The system currently loads a source into a named graph named after the
source. After loading, the graph is saved in a fast and compact binary
format into a file named after the graph-name. Subsequent modifications
are saved in a `journal' file, also named after the graph-name.
Unloading a source finds the graph, removes all triples from memory and
deletes the backup files.
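
A minimal Python sketch of the one-graph-per-source scheme described
above, for illustration only.  It is not SWI-Prolog's code: the class
and method names (GraphStore, load_source, add_triple, unload_source),
the file suffixes and the JSON snapshot (standing in for the binary
format) are all assumptions.

```python
import json
import os
import tempfile
from urllib.parse import quote


class GraphStore:
    """Toy store: one named graph, one snapshot and one journal per source."""

    def __init__(self, backup_dir):
        self.backup_dir = backup_dir
        self.graphs = {}  # graph name -> set of (s, p, o) triples

    def _path(self, graph, suffix):
        # Backup file names are derived from the (percent-encoded) graph name.
        return os.path.join(self.backup_dir, quote(graph, safe="") + suffix)

    def load_source(self, source, triples):
        # The graph is named after the source it was loaded from.
        self.graphs[source] = set(triples)
        with open(self._path(source, ".snapshot"), "w") as f:
            json.dump(sorted(self.graphs[source]), f)

    def add_triple(self, graph, triple):
        # Later modifications are appended to a journal named after the graph.
        self.graphs[graph].add(triple)
        with open(self._path(graph, ".journal"), "a") as f:
            f.write(json.dumps(triple) + "\n")

    def unload_source(self, source):
        # Unloading: drop the graph from memory and delete its backup files.
        self.graphs.pop(source, None)
        for suffix in (".snapshot", ".journal"):
            path = self._path(source, suffix)
            if os.path.exists(path):
                os.remove(path)
```

Creating a store with GraphStore(tempfile.mkdtemp()), loading a source
and then unloading it leaves both the in-memory store and the backup
directory empty.  That only works because each source maps to exactly
one graph.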

This scheme won't fly easily with TriG files.  TriG files can create
multiple graphs and/or add triples to multiple graphs.  TriG files are
also likely to change the granularity of named graphs, which makes the
file-per-named-graph backup module inadequate.  I don't know yet how I'm
going to solve that, but I think it is likely that knowing beforehand
that I'm dealing with a TriG file will be useful information.
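
One way to obtain that information is to look at the first statement
after the directives, which works as long as TriG requires every triple
to sit inside {}.  The Python sketch below is a rough heuristic, not a
parser: the name sniff_format is hypothetical, graph labels are assumed
to be IRIs or prefixed names, and comments inside directives are not
handled.

```python
import re

# Whitespace and '#' comments; only valid between tokens.
_SKIP = re.compile(r"(?:\s+|#[^\n]*)*")
# '@prefix p: <iri> .' or '@base <iri> .' directives.
_DIRECTIVE = re.compile(r"@(?:prefix\s+[^\s:]*:|base)\s*<[^>]*>\s*\.")
# An IRI reference or a prefixed name that could be a graph label.
_LABEL = re.compile(r"<[^>]*>|[^\s{<\[(#]*:[^\s{#]*")


def sniff_format(text):
    """Return 'trig' or 'turtle' based on the first non-directive statement."""
    pos = _SKIP.match(text, 0).end()
    while True:
        m = _DIRECTIVE.match(text, pos)
        if m is None:
            break
        pos = _SKIP.match(text, m.end()).end()
    if text.startswith("{", pos):
        return "trig"      # anonymous (default graph) block
    m = _LABEL.match(text, pos)
    if m is not None:
        pos = _SKIP.match(text, m.end()).end()
        if text.startswith("{", pos):
            return "trig"  # 'label {' opens a named graph
    return "turtle"
```

If triples outside {} were allowed, a document that starts with bare
triples could no longer be classified at its first statement, and this
heuristic would report it as Turtle.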

 Cheers --- Jan

P.s. still hoping for an
         @format <http://www.w3.org/TR/2013/CR-turtle-20130219/> .
 or similar.

Received on Friday, 17 May 2013 12:09:40 UTC