Re: [Graphs] Proposal: RDF Datasets from Antoine Zimmermann on 2011-08-26 (public-rdf-wg@w3.org from August 2011)

From: Antoine Zimmermann <antoine.zimmermann@insa-lyon.fr>
Date: Fri, 26 Aug 2011 18:39:35 +0200
To: public-rdf-wg@w3.org
Message-ID: <4E57CC47.9060001@insa-lyon.fr>
Pierre-Antoine,


I am in total agreement with what Richard says below. However, I 
sympathise to some extent with your idea. I would be interested to see 
some people define a datatype for serialised graphs, say in Turtle. 
Then, they should brainstorm a few use cases and implement some tools 
around this proposal and see how things are going, gather experiences 
and come back in a few years with a report and possibly a proposal for 
standardisation.

Start by defining a datatype for Turtle graph literals:
  - lexical space is the set of valid Turtle documents;
  - value space is the set of RDF graphs;
  - L2V is the mapping from Turtle documents to RDF graphs, as defined in 
the Turtle spec.
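
As a rough illustration of the three parts above, here is a minimal Python 
sketch. It is not a real Turtle parser: it assumes a tiny subset of the 
syntax (absolute IRIs in angle brackets and plain quoted literals, one 
statement per "."), and the datatype IRI and the names l2v, Triple and 
Graph are hypothetical, chosen only for this example. The point is that 
the value is a set of triples, so two different lexical forms can denote 
the same graph, and that text outside the lexical space has no value.

```python
import re
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]

# Hypothetical datatype IRI, used for illustration only (not defined by any spec).
TURTLE_GRAPH_DATATYPE = "http://example.org/datatypes#turtleGraph"

# One statement of the assumed subset: <s> <p> <o> .   or   <s> <p> "literal" .
_STATEMENT = re.compile(r'\s*(<[^>]*>)\s*(<[^>]*>)\s*(<[^>]*>|"[^"]*")\s*\.')


def l2v(lexical: str) -> Graph:
    """Lexical-to-value mapping: turn a document of the subset into a set of triples."""
    triples = set()
    pos = 0
    while lexical[pos:].strip() != "":
        match = _STATEMENT.match(lexical, pos)
        if match is None:
            # The string is outside the lexical space: the literal is ill-typed.
            raise ValueError(f"not in the lexical space (offset {pos})")
        triples.add((match.group(1), match.group(2), match.group(3)))
        pos = match.end()
    return frozenset(triples)


doc_a = '<http://example.org/s> <http://example.org/p> <http://example.org/o> . ' \
        '<http://example.org/s> <http://example.org/p> "x" .'
doc_b = '<http://example.org/s> <http://example.org/p> "x" .\n' \
        '<http://example.org/s> <http://example.org/p> <http://example.org/o> .\n'

# Different lexical forms, same value: order and whitespace do not matter.
print(l2v(doc_a) == l2v(doc_b))  # True
```

A real implementation would delegate l2v to a full Turtle parser and would 
also have to decide how blank nodes are compared between two values, which 
this sketch leaves out entirely.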

Of course, you can do the same for other syntaxes, but I think Turtle 
is the best fit.

Then you may need to introduce a set of terms like rdf:Graph, 
rdf:serialisation, etc. This set of terms should be crafted in light of 
the experience that the group gathers while trying to deal with its 
use cases.

BUT, this is certainly not something that should be done within this 
working group.


AZ.

Le 22/08/2011 18:54, Richard Cyganiak a écrit :
> Pierre-Antoine,
>
> Thanks for picking this up again.
>
> There are several things I don't like about [2].
>
> 1. It is not an abstract syntax. It is a mix of concrete and abstract
> syntax. Thus it negates the benefits of having an abstract syntax in
> the first place. For example, one cannot really describe any
> operations over such a multigraph representation without appealing to
> the use of various syntax parsers. And one has to explain what
> happens if the serialized graph isn't valid in the respective syntax.
> Etc
>
> 2. It doesn't achieve the goal of standardisation. Different existing
> multigraph approaches (TriG, SPARQL, etc.) would all look different
> when expressed according to this proposal. Thus, it doesn't promote
> interoperability and doesn't actually make working with multiple
> graphs any easier.
>
> 3. I feel that it is actually more complex than the RDF Dataset
> proposal [1] because it requires the definition of one predicate for
> every RDF graph serialization, as well as additional vocabulary for
> every multigraph representation.
>
> 4. It is clear that actually storing or serializing anything in that
> way would be a bad idea. Instead, one wants to use optimized syntaxes
> that can serialize the graph literals without “double serialization”,
> and optimized storage schemes that can actually store and index the
> parsed form of the graph literals. But if that is the case, then why
> not define an abstract syntax that actually reflects these concrete
> syntaxes and storage schemes?
>
> 5. From a pure RDF modeling and semantics point of view, this
> proposal should use typed literals and not plain/xsd:string
> literals.
>
> Best, Richard
>
>
> On 22 Aug 2011, at 16:12, Pierre-Antoine Champin wrote:
>
>> As I promised Richard during the last TC, I'm reactivating the
>> thread on his proposal to "lift" the definition of RDF datasets
>> from SPARQL into RDF concepts [1].
>>
>> My main concern with this proposal is that it defines a somewhat
>> complex structure (the dataset) as a primitive concept in RDF. My
>> gut feeling is that we could instead define more basic concepts, on
>> top of which SPARQL datasets, SPARQL graph stores, and possibly
>> other structures, could be defined. In my understanding, this is
>> what the g-* terminology was aiming at.
>>
>> In this perspective, back in June, I made an alternate proposal [2]
>> for which I got almost no feedback. In a nutshell, it provides a
>> minimal vocabulary for reifying RDF graphs into standard RDF, and
>> sketches the semantics of such a reification. From there, it
>> illustrates how multi-graph syntaxes (such as TriG) and models
>> (such as SPARQL datasets) can be defined on top of it.
>>
>> I know that Richard was concerned that several multi-graph models
>> had slight differences (e.g. can a BNode be used as a graph name),
>> and his solution was to endorse one of them and wait for the others
>> to converge. My proposal is rather to provide the building blocks
>> for everyone to describe their model in RDF itself, and leave it
>> open for different models to coexist, which is ok as long as they
>> can all be expressed in plain RDF.
>>
>> pa
>>
>>
>> [1]
>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal
>> [2]
>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Quadless-Proposal
>
>


-- 
Antoine Zimmermann
Researcher at:
Laboratoire d'InfoRmatique en Image et Systèmes d'information
Database Group
7 Avenue Jean Capelle
69621 Villeurbanne Cedex
France
Tel: +33(0)4 72 43 61 74 - Fax: +33(0)4 72 43 87 13
Lecturer at:
Institut National des Sciences Appliquées de Lyon
20 Avenue Albert Einstein
69621 Villeurbanne Cedex
France
antoine.zimmermann@insa-lyon.fr
http://zimmer.aprilfoolsreview.com/
Received on Friday, 26 August 2011 16:40:05 UTC