Re: [Graphs] Proposal: RDF Datasets

Pierre-Antoine,

Thanks for picking this up again.

There are several things I don't like about [2].

1. It is not an abstract syntax. It is a mix of concrete and abstract syntax, which negates the benefits of having an abstract syntax in the first place. For example, one cannot really describe any operations over such a multigraph representation without appealing to the use of various syntax parsers. One also has to explain what happens if the serialized graph isn't valid in the respective syntax, and so on.

2. It doesn't achieve the goal of standardisation. Different existing multigraph approaches (TriG, SPARQL, etc.) would all look different when expressed according to this proposal. Thus, it doesn't promote interoperability and doesn't actually make working with multiple graphs any easier.

3. I feel that it is actually more complex than the RDF Dataset proposal [1] because it requires the definition of one predicate for every RDF graph serialization, as well as additional vocabulary for every multigraph representation.

4. It is clear that actually storing or serializing anything in that way would be a bad idea. Instead, one wants to use optimized syntaxes that can serialize the graph literals without “double serialization”, and optimized storage schemes that can actually store and index the parsed form of the graph literals. But if that is the case, then why not define an abstract syntax that actually reflects these concrete syntaxes and storage schemes?

5. From a pure RDF modeling and semantics point of view, this proposal should use typed literals and not plain/xsd:string literals.
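
To make points 1 and 4 concrete, here is a rough Python sketch. It is only an illustration under assumptions of mine: the names Dataset, parse_ntriples_subset and EX_AS_NTRIPLES are hypothetical and are not taken from [1] or [2], and the parser handles a toy subset of N-Triples (IRIs only). The first operation is defined directly on an abstract structure; the second has to call a parser and inherits its failure modes.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Dataset:
    """Abstract-syntax view: a default graph plus named graphs, all sets of triples."""
    default_graph: Set[Triple] = field(default_factory=set)
    named_graphs: Dict[str, Set[Triple]] = field(default_factory=dict)


def merge_named_graphs(ds: Dataset) -> Set[Triple]:
    """Union of all named graphs, defined purely on the abstract structure."""
    merged: Set[Triple] = set()
    for graph in ds.named_graphs.values():
        merged |= graph
    return merged


def parse_ntriples_subset(text: str) -> Set[Triple]:
    """Toy parser for lines of the form '<s> <p> <o> .' (IRIs only).

    Raises ValueError on anything else; a literal-based encoding has to
    say what that means for the enclosing graph.
    """
    triples: Set[Triple] = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split()
        well_formed = (
            len(parts) == 4
            and parts[3] == "."
            and all(p.startswith("<") and p.endswith(">") for p in parts[:3])
        )
        if not well_formed:
            raise ValueError(f"not valid in this syntax: {line!r}")
        triples.add((parts[0][1:-1], parts[1][1:-1], parts[2][1:-1]))
    return triples


# Hypothetical predicate linking a graph name to its serialization.
# One such predicate would be needed per serialization (cf. point 3).
EX_AS_NTRIPLES = "http://example.org/asNTriples"


def merge_graph_literals(triples: Iterable[Triple]) -> Set[Triple]:
    """Same operation over the literal-based encoding: it must parse first."""
    merged: Set[Triple] = set()
    for _subject, predicate, obj in triples:
        if predicate == EX_AS_NTRIPLES:
            merged |= parse_ntriples_subset(obj)
    return merged
```

Both functions return the same set for equivalent input, but only merge_graph_literals can fail on a syntactically invalid literal, and it would need another branch for every additional serialization predicate.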

Best,
Richard


On 22 Aug 2011, at 16:12, Pierre-Antoine Champin wrote:

> As I promised to Richard during the last TC, I'm reactivating the
> thread on his proposal to "lift" the definition of RDF datasets
> from SPARQL into RDF concepts [1].
> 
> My main concern with this proposal is that it defines a somewhat complex
> structure (the dataset) as a primitive concept in RDF. My gut feeling is
> that we could instead define more basic concepts, on top of which SPARQL
> datasets, SPARQL graph stores, and possibly other structures, could be
> defined. In my understanding, this is what the g-* terminology was
> aiming at.
> 
> In this perspective, back in June, I made an alternate proposal [2] for
> which I got almost no feedback. In a nutshell, it provides a minimal
> vocabulary for reifying RDF graphs into standard RDF, and sketches the
> semantics of such a reification. From there, it illustrates how
> multi-graph syntaxes (such as TriG) and models (such as SPARQL
> datasets) can be defined on top of it.
> 
> I know that Richard was concerned that several multi-graph models had
> slight differences (e.g. whether a BNode can be used as a graph name),
> and his solution was to endorse one of them and wait for the others to
> converge.
> My proposal is rather to provide the building blocks for everyone to
> describe their model in RDF itself, and leave it open for different
> models to coexist, which is ok as long as they can all be expressed in
> plain RDF.
> 
>  pa
> 
> 
> [1] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal
> [2] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Quadless-Proposal

Received on Monday, 22 August 2011 16:54:53 UTC