Re: [Graphs] Proposal: RDF Datasets

On 22 Aug 2011, at 20:43, Pierre-Antoine Champin wrote:
>> There are several things I don't like about [2].
>> 
>> 1. It is not an abstract syntax. It is a mix of concrete and abstract syntax. Thus it negates the benefits of having an abstract syntax in the first place. For example, one cannot really describe any operations over such a multigraph representation without appealing to the use of various syntax parsers. And one has to explain what happens if the serialized graph isn't valid in the respective syntax. Etc.
> 
> I can see your point, but I think you are being a bit harsh on that: the
> proposal is completely independent of any concrete syntax. It only
> requires that such a concrete syntax exists, which is not a strong
> requirement...

I disagree. “You can use any concrete syntax” doesn't make an abstract syntax.

>> 2. It doesn't achieve the goal of standardisation. Different existing multigraph approaches (TriG, SPARQL, etc) would all look different when expressed according to this proposal. Thus, it doesn't promote interoperability and doesn't actually make working with multiple graphs any easier.
> 
> Well, in RDF people can use multiple vocabularies to represent the same
> domain... While this is not ideal for interoperability, this is key to
> scalability. How is the domain of multi-graphs different from other
> domains?

Multi-graphs are not a domain. Our job is to make them part of the data model.

By analogy: If you are tasked to standardize a power socket and power plug, and you deliver a framework for describing different power sockets and power plugs, then you have failed.

>> 3. I feel that it is actually more complex than the RDF Dataset proposal [1] because it requires the definition of one predicate for every RDF graph serialization, 
> 
> it only *requires* one such predicate; it would probably lead to the
> definition of as many predicates as there will be recommended concrete
> syntaxes, which is still tractable.

I did not say that it's not tractable. I said that it is more complex than [1]. You said that you feel that [1] is too complicated. I point out your inconsistency.

>> as well as additional vocabulary for every multigraph representation.
> 
> This is indeed more complex in a way, as the goal is not to provide a
> single built-in multigraph representation, but the building blocks to
> describe such representations.

This may or may not be a good idea. It's not what this WG is chartered to do.

>> 4. It is clear that actually storing or serializing anything in that way would be a bad idea. 
> 
> You are probably right. However, this is not a problem created by my
> proposal: I'm sure there are several naive ways to implement RDF 2004
> which would prove to be bad ideas.

The naïve way of serializing RDF 2004 is N-Triples. It is surprisingly useful. The naïve way of serializing your proposed multigraph representation is not useful.
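To illustrate the N-Triples point, here is a minimal sketch in Python of how little machinery a naïve line-per-triple serializer needs. The tuple-based term representation and the function names are assumptions made for this example only; language tags and full IRI escaping are left out.

```python
def escape(s: str) -> str:
    # Escape the characters that must not appear raw inside an N-Triples string.
    return (s.replace("\\", "\\\\")
             .replace('"', '\\"')
             .replace("\n", "\\n")
             .replace("\r", "\\r"))


def term(t) -> str:
    # Hypothetical term encoding: ("iri", value), ("bnode", label),
    # or ("literal", lexical_form[, datatype_iri]).
    kind, value = t[0], t[1]
    if kind == "iri":
        return f"<{value}>"
    if kind == "bnode":
        return f"_:{value}"
    if kind == "literal":
        datatype = t[2] if len(t) > 2 else None
        lexical = f'"{escape(value)}"'
        return f"{lexical}^^<{datatype}>" if datatype else lexical
    raise ValueError(f"unknown term kind: {kind!r}")


def to_ntriples(triples) -> str:
    # One triple per line, each terminated by " ." and a newline.
    return "".join(f"{term(s)} {term(p)} {term(o)} .\n" for s, p, o in triples)
```

Each output line is self-contained, which is why such a dump can be split, concatenated, or filtered with ordinary line-oriented tools.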

I think the same holds for storage, but I won't go into detail making that case.

>> 5. From a pure RDF modeling and semantics point of view, this proposal should use typed literals and not plain/xsd:string literals.
> 
> Do you mean: with the content-type (RDF/XML, Turtle...) as their
> datatype? Why not, though I don't see how this is compellingly superior
> to using specialized properties...

It would work without changing RDF Semantics; you would just define new datatypes. But yeah, it still has all the flaws I mentioned above.

> To sum it up: I agree that relying on concrete syntaxes is not elegant
> from a theoretical point of view, nor practical for implementation.

Ok, so we're on the same page here.

> And
> if one had to "interpret" it in order to implement it correctly, one
> would probably end up with something that looks like a SPARQL dataset :)
> Not sure they would end up with "URIs only as graph names" nor with a
> "default graph", though...

I'm not sure either. But doing it that way has the advantage of being already implemented in every SPARQL store…
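For readers who want the shape of a SPARQL-style dataset spelled out, here is a rough sketch in Python: one default graph plus zero or more named graphs keyed by IRI. The class and method names are hypothetical, and terms are reduced to plain strings to keep the example short; it is not taken from either proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, Optional[str]]


@dataclass
class Dataset:
    # The single unnamed graph that every dataset has.
    default_graph: Set[Triple] = field(default_factory=set)
    # Named graphs, keyed by graph name (an IRI string in this sketch).
    named_graphs: Dict[str, Set[Triple]] = field(default_factory=dict)

    def add(self, triple: Triple, graph_name: Optional[str] = None) -> None:
        # No graph name means the triple goes into the default graph.
        if graph_name is None:
            self.default_graph.add(triple)
        else:
            self.named_graphs.setdefault(graph_name, set()).add(triple)

    def quads(self) -> Iterator[Quad]:
        # Flatten the dataset into quads; None marks the default graph.
        for s, p, o in self.default_graph:
            yield (s, p, o, None)
        for name, graph in self.named_graphs.items():
            for s, p, o in graph:
                yield (s, p, o, name)
```

Allowing blank nodes as graph names, or dropping the default graph, would change only the key type and the `None` case here, which is roughly where the models under discussion differ.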

Best,
Richard

> 
> I'll think about it.
> 
> thanks for your feedback
> 
>  pa
> 
>> Best,
>> Richard
>> 
>> 
>> On 22 Aug 2011, at 16:12, Pierre-Antoine Champin wrote:
>> 
>>> As I promised Richard during the last TC, I'm reactivating the
>>> thread on his proposal to "lift" the definition of RDF datasets
>>> from SPARQL into RDF concepts [1]
>>> 
>>> My main concern with this proposal is that it defines a somewhat complex
>>> structure (the dataset) as a primitive concept in RDF. My gut feeling is
>>> that we could instead define more basic concepts, on top of which SPARQL
>>> datasets, SPARQL graph stores, and possibly other structures, could be
>>> defined. In my understanding, this is what the g-* terminology was
>>> aiming at.
>>> 
>>> In this perspective, back in June, I made an alternate proposal [2] for
>>> which I got almost no feedback. In a nutshell, it provides a minimal
>>> vocabulary for reifying RDF graphs into standard RDF, and sketches the
>>> semantics of such a reification. From there, it illustrates how
>>> multi-graph syntaxes (such as TriG) and models (such as SPARQL
>>> datasets) can be defined on top of it.
>>> 
>>> I know that Richard was concerned that several multi-graph models had
>>> slight differences (e.g. can a BNode be used as a graph name), and his
>>> solution was to endorse one of them and wait for the others to converge.
>>> My proposal is rather to provide the building blocks for everyone to
>>> describe their model in RDF itself, and leave it open for different
>>> models to coexist, which is ok as long as they can all be expressed in
>>> plain RDF.
>>> 
>>> pa
>>> 
>>> 
>>> [1] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal
>>> [2] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Quadless-Proposal
>> 
> 
> 

Received on Tuesday, 23 August 2011 09:15:20 UTC