Why is the RDF-star working group standardising RDF 1.2 and SPARQL 1.2?

This is an email I have been wanting to write for a long time.
The subject is a rhetorical question; please do not answer it out of the
context of this email.

RDF-star is undeniably a success in terms of adoption by companies and 
various implementers. It is, for sure, improving data management in
triplestores. A community group crystallised the specification so that 
it could go to standardisation.

But instead of standardising RDF-star and SPARQL-star as new standards
on top of RDF and SPARQL, the working group is trying to squeeze them
into the core of RDF and SPARQL. I will not say much about SPARQL-star
and will concentrate on the RDF part.

Now, for me, RDF-star should play a role similar to what RDF datasets 
are playing. RDF datasets exist for better data management and query 
features. They do not interfere with the way RDF works, nor with any 
technologies that go on top of RDF. However, if RDF-star becomes the new 
RDF, it very explicitly interferes with everything on top of RDF.

Every implementation that relies on RDF processing would have to be
updated, lest it cease to be interoperable. I know, from the
RDF 1.1 working group, that there are implementations that rely on the
assumption that there will be only blank nodes or IRIs in subject
position. There are vendors who consider that adding support for a new
type of thing in subject position would have a dramatic cost. These
companies would need to be convinced that they'll get a lot of money
from RDF-star before they agree to pay that cost.
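
To illustrate the kind of assumption meant here, the following is a
minimal Python sketch. It is not taken from any real triplestore: the
classes IRI, BlankNode, Literal and Triple and the function subject_key
are all hypothetical names invented for this example. It shows code
written against RDF 1.1, where a subject is either an IRI or a blank
node, and what happens when it is handed a triple whose subject is
itself a triple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


@dataclass(frozen=True)
class Triple:
    subject: object   # RDF 1.1: IRI or BlankNode only
    predicate: IRI
    obj: object       # IRI, BlankNode or Literal


def subject_key(triple: Triple) -> str:
    """Build an index key the way an RDF 1.1-only store might (hypothetical)."""
    s = triple.subject
    if isinstance(s, IRI):
        return "<" + s.value + ">"
    if isinstance(s, BlankNode):
        return "_:" + s.label
    # Any other kind of subject was never anticipated by this code.
    raise TypeError("unsupported subject term: " + type(s).__name__)


plain = Triple(IRI("http://example.org/alice"),
               IRI("http://example.org/knows"),
               IRI("http://example.org/bob"))

# A triple used as the subject of another triple, as RDF-star allows.
annotated = Triple(plain,
                   IRI("http://example.org/source"),
                   Literal("survey"))
```

Calling subject_key(plain) succeeds, while subject_key(annotated) raises
TypeError. Every code path of this kind would have to be found and
extended before such a system could consume the new data.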

Then there are standards and community specifications that rely on RDF:
RDFa, SHACL, SWRL, N3, RIF, R2RML, etc. What is a Web Ontology Language
for RDF-star? If OWL can be used together with embedded triples, it
opens a lot of possibilities, has a lot of consequences, and raises a
lot of questions. The RDF-star semantics is not well crystallised. There
are people even in the community group itself who do not like the final
version of it, myself first of all. There isn't a consensus on how to
fix it. A consensus within the RDF-star community is, I expect, possible
to reach within the duration of this WG. However, we have in fact opened
an RDF 1.2 working group, albeit in disguise.

And there comes a big issue. With an RDF-star WG that is implicitly an
RDF 1.2 WG (and SPARQL 1.2 WG), we must accept that the RDF community at
large will join the discussion. Enrico, for instance, was not in the CG
and he genuinely wants to understand how things are supposed to work 
with quoted triples. The CG report is not clear about it, as it provides 
explanatory text that suggests something, while the formal semantics 
suggests otherwise. Of course, I understand Ted's frustration as the CG 
has indeed talked about the semantic issues ad nauseam. But if you want 
to standardise RDF 1.2, you have to get people from outside the CG. And 
you can't just expect that they read the 1,000 pages of discussions and 
listen to hours of audio recording of the calls (recordings that don't 
exist, by the way).

Ora, in one of the earlier meetings, said he wished the work to be done
in one year. This is, I suppose, doable if the RDF-star group
standardises RDF-star. However, at least 2 years are certainly needed to
standardise RDF 1.2 and SPARQL 1.2, no matter how limited the charter
is. And in this case, I have the impression that some people
underestimate how big the charter is. There seem to be people who think
that the CG report is the specification and simply needs a stamp of
approval. But even if there were a consensus on it, this specification
has a lot of consequences for everything on top of RDF, especially
semantically. There are open questions that are to be addressed by
research, not by W3C WGs.

For instance, RDF-star introduces a whole new set of entailment regimes.
Is there expertise in the group about how this is going to be handled in
the SPARQL 1.2 Entailment Regimes spec? How would SPARQL-star work under
an OWL 2 entailment regime based on RDF-star semantics?

There is also the issue of education and teaching. I teach a
Semantic Web course. Imagine I introduce RDF 1.2, with embedded triples.
Then I talk about ontologies, SHACL, and more, and there aren't embedded
triples anymore! I'd rather introduce RDF (1.1), SPARQL, OWL, SHACL,
then introduce the extension, RDF-star and SPARQL-star, if I want to go
deeper into the data management part.

Here is what I would like to see:
  - RDF-star, a data exchange model for RDF data management. It's not 
replacing RDF, it is complementary to it.
  - SPARQL-star, a query language for RDF-star and RDF-star datasets.
  - As far as the semantics of RDF-star is concerned, make it critically
minimal. Just interpret embedded triples as arbitrary resources. (*) If
someone wants to do more reasoning, they can just invent a semantic
extension.
This way, just like RDF datasets define a data model on top of RDF
without replacing it, with SPARQL as a query language for this model, we
would have RDF-star as a data model on top of RDF with a query language
for it. It would be mainly for data management purposes, but would not
preclude other usages, just like RDF datasets can be used for many
things beyond partitioning graphs. Then there would be no discrepancy
between SHACL and RDF, OWL and RDF, RDFa and RDF, and no need to define
SPARQL-star entailment regimes. SPARQL Service Description could be
updated to be able to mention that a system supports SPARQL-star.


Now, I am conscious that this does not help at all in moving forward in
the direction of the charter, but I needed to say it, and I'd better say
it early.


(*) RDF-star basic semantics would be defined on top of any RDF
entailment regime by adding a mapping IT from embedded triples to the
set of resources IR. Under this basic semantics, embedded triples simply
act as distinct names, as if they were IRIs. This does not preclude
extensions where the internal structure of the embedded triples makes a
difference.
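
The footnote can be made concrete with a small Python sketch. This is
only an illustration under stated assumptions, not a formal definition:
the dictionaries IS and IT, the function denote, the example IRIs and
the resource names r_kal, t1 and so on are all made up for the example.
IS plays the role of the usual mapping from IRIs to resources, and IT
maps each embedded triple, taken as a whole, to a resource.

```python
# Hypothetical interpretation: resources in IR are plain strings here.
IS = {
    ":superman": "r_kal",   # two IRIs denoting the same resource
    ":clark": "r_kal",
    ":type": "r_type",
    ":hero": "r_hero",
}

# Embedded triples are modelled as tuples of terms.
q1 = (":superman", ":type", ":hero")
q2 = (":clark", ":type", ":hero")

# IT assigns a resource to each embedded triple as an opaque name.
# Nothing constrains it with respect to IS; this is one allowed choice.
IT = {
    q1: "t1",
    q2: "t2",
}


def denote(term):
    """Return the resource denoted by an IRI or by an embedded triple."""
    if isinstance(term, tuple):
        # The triple is looked up as a whole; its parts are not inspected.
        return IT[term]
    return IS[term]


same_subject = denote(":superman") == denote(":clark")
same_embedded = denote(q1) == denote(q2)
```

In this interpretation same_subject is True while same_embedded is
False: knowing that the two IRIs denote the same thing tells us nothing
about the two embedded triples, exactly as it would tell us nothing
about two unrelated IRIs. A semantic extension could add constraints
tying IT to IS, and that is where the internal structure would start to
matter.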
-- 
Antoine Zimmermann
ISI - Institut Henri Fayol
École des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
https://www.emse.fr/~zimmermann/

Received on Friday, 27 January 2023 10:49:56 UTC