Re: Why is the RDF-star working group standardising RDF 1.2 and SPARQL 1.2? from Antoine Zimmermann on 2023-01-31 (public-rdf-star-wg@w3.org from January 2023)

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Tue, 31 Jan 2023 11:37:55 +0100
To: Pierre-Antoine Champin <pierre-antoine@w3.org>, public-rdf-star-wg@w3.org
Message-ID: <107debb6-6eba-b376-baca-dbaff0cba5f1@emse.fr>
On 27/01/2023 at 18:24, Pierre-Antoine Champin wrote:
> Antoine,
> 
> a few reactions to some of the points you make
> 
> On 27/01/2023 11:49, Antoine Zimmermann wrote:
>> This is an email I have been wanting to write for a long time.
>> The subject is a rhetorical question, please do not answer it out of 
>> the context of this email.
>>
>> RDF-star is undeniably a success in terms of adoption by companies and 
>> various implementers. It is, for sure, improving the data management 
>> in triplestores. A community group crystallised the specification so 
>> that it could go to standardisation.
>>
>> But instead of standardising RDF-star and SPARQL-star as new standards 
>> on top of RDF and SPARQL, it is trying to squeeze it into RDF and 
>> SPARQL core. I will not talk much about SPARQL-star and concentrate on 
>> the RDF part.
>>
>> Now, for me, RDF-star should play a role similar to what RDF datasets 
>> are playing. 
> And yet, datasets were not introduced as "a new standard on top of RDF", 
> they were added to RDF 1.1 as an integral part of it.

...added to RDF 1.1 *after* having been part of a separate standard 
(SPARQL) for 6 years. Besides, RDF datasets are defined on top of RDF 
graphs without affecting them. They do not change the syntax of RDF, nor 
their semantics, so they do not impact the layers on top of RDF.


>> RDF datasets exist for better data management and query features. They 
>> do not interfere with the way RDF works, nor with any technologies 
>> that go on top of RDF. However, if RDF-star becomes the new RDF, it 
>> very explicitly interferes with everything on top of RDF.
> I agree that we need to figure out a way to design RDF 1.2 with minimal 
> impact on everything that currently exists. And yes, this will probably 
> be more tricky than the addition of datasets in RDF 1.1.
>>
>> Every implementation that relies on RDF processing would have to be 
>> updated, lest they fail to be interoperable any more. I know, from the 
>> RDF 1.1 working group, that there are implementations that require the 
>> assumption that there will be only blank nodes or IRIs in subject 
>> position. There are vendors who consider that adding support for a new 
>> type of things in subject position would have dramatic cost. These 
>> companies would need to be convinced they'll get a lot of money from 
>> RDF-star before they agree to pay the cost.
>>
>> Then there are standards, or community specifications that rely on 
>> RDF. RDFa, SHACL, SWRL, N3, RIF, R2RML, etc. What is a Web Ontology 
>> Language for RDF-star? If OWL can be used together with embedded 
>> triples, it opens a lot of possibilities, causes a lot of 
>> consequences, and brings a lot of questions. The RDF-star semantics is 
>> not well crystallised. There are people even in the community group 
>> itself that do not like the final version of it. Me first. There isn't 
>> a consensus on how to fix it. An RDF-star community consensus is, I 
>> expect, possible to reach within the duration of this WG. However, 
>> we've opened the RDF 1.2 working group, albeit in disguise.
> 
> I resent the implication. Read the charter 
> <https://www.w3.org/2022/08/rdf-star-wg-charter/> again. There are many 
> things in it that leave no doubt, in my opinion, on the fact that the 
> goal was to update the RDF and SPARQL recommendations (if only the list 
> of deliverables, all starting with "RDF 1.2 ..." and "SPARQL 1.2 ...").

There is no doubt indeed, and thus, the RDF-star WG is effectively the 
RDF 1.2 and SPARQL 1.2 WG. I wanted to react to the charter before the 
group started (as I said "This is an email I have been wanting to write 
for a long time."), but this kind of opposing opinion has to be written 
carefully. I said to you, face to face, that standardising RDF 1.2 with 
RDF-star would be a huge challenge. I was too careful and nuanced in my 
wording, because what I was truly thinking was that it would be 
unreasonable.

> One reason to keep the name "RDF-star" for the group was to insist on 
> the fact that other new features were, for the moment, out of scope. In 
> hindsight, that was probably a bad move, creating wrong expectations on 
> the scope of the group.

I understand the rationale, and calling this group the RDF 1.2 WG, or 
RDF/SPARQL 1.2 WG would have been even worse, IMHO. Despite my concerns, 
I can understand a hopeful viewpoint from which one can think that the 
RDF community at large will accept the new version, and the whole RDF 
ecosystem will work fine on top of RDF-star. I can understand it, but I 
do not share this viewpoint, at least not until there are clear specs, 
and sufficient evidence that they do not disturb the other specs on top 
of them. What we have in the CG report does not show this at all.

>>
>> And there comes a big issue. With an RDF-star WG that is implicitly an 
>> RDF 1.2 WG (and SPARQL 1.2 WG), we must accept the RDF community at 
>> large to join the discussion.
> Yes, indeed!
>> Enrico, for instance, was not in the CG and he genuinely wants to 
>> understand how things are supposed to work with quoted triples. The CG 
>> report is not clear about it, as it provides explanatory text that 
>> suggests something, while the formal semantics suggests otherwise. Of 
>> course, I understand Ted's frustration as the CG has indeed talked 
>> about the semantic issues ad nauseam. But if you want to standardise 
>> RDF 1.2, you have to get people from outside the CG. And you can't 
>> just expect that they read the 1,000 pages of discussions and listen 
>> to hours of audio recording of the calls (recordings that don't exist, 
>> by the way).
> 
> I don't think that is what Ted implied either.
> 
> In my view, everything that is in the CG report is open for discussion. 
> But the efforts that the CG has put in it should not be neglected either.

I agree. I do not accuse Ted of shutting down useful discussions. In 
fact, during the calls, I was quite relieved when Ted intervened, 
because it was indeed a bit boring to hear discussions that mirrored 
what had already been discussed at length in the CG. But at the same time, the CG 
discussions on semantics did not reach a clear consensus. So how is an 
outsider supposed to know what can be brought to the debate still? It 
can be dangerous to block all opposition coming from the outside by 
simply saying "it has all been discussed already". What, exactly, has 
been discussed? Are we not missing something? Is everything Enrico (or 
anyone else) said already covered by the past CG discussions?

For the sake of better managing the time of our calls, I invite Enrico 
to investigate the CG discussions (from the mailing list archives or 
GitHub) and try to keep his reflections concise. But I would also like 
to see a summary of the salient points of these discussions.
I believe that https://github.com/w3c/rdf-star/issues/95 provides a good 
entry point for discussions about RDF-star semantics.

>>
>> Ora, in one of the earlier meetings, said he wished the work be done 
>> in one year. This is, I suppose, doable if the RDF-star group 
>> standardises RDF-star. However, 2 years at least are certainly needed 
>> to standardise RDF 1.2 and SPARQL 1.2, no matter how limited the 
>> charter is. And in this case, I have the impression that some people 
>> underestimate how big the charter is. There seem to be people 
>> thinking that the CG report is the specification that simply needs to 
>> get a stamp of approval. But even if there was a consensus on it, this 
>> specification has a lot of consequences on everything on top of RDF, 
>> especially semantically. There are open questions that are to be 
>> addressed by research, not by W3C WGs.
>>
>> For instance, RDF-star introduces a whole new set of entailment 
>> regimes. Is there expertise in the group about how this is going to be 
>> handled in the SPARQL 1.2 Entailment Regime spec? How does SPARQL-star 
>> with OWL 2 RDF-star-based semantics regime work?
> (I have ideas about that as well, but I'll keep them for a discussion 
> focused on semantics rather than the scope of this group)
>>
>> There is also the issue of the education and teaching. I teach a 
>> Semantic Web course. Imagine I introduce RDF 1.2, with embedded 
>> triples. Then I talk about ontologies, SHACL, and more, and there 
>> aren't embedded triples anymore!
> 
> How do you deal, today, with the fact that SPARQL and OWL are based on 
> RDF 1.0, then? And have this strange notion of "plain literal" that does 
> not exist in RDF 1.1? Also, RDF 1.1 has datasets (as part of the core 
> model, I insist), and yet OWL can only reason about a single graph...

RDF 1.1 does not change anything in the concrete syntaxes of RDF, so all 
RDF 1.1 data is working perfectly with SPARQL or OWL tools. RDF can only 
reason about a single graph too. Not all concepts of RDF must be used in 
every standard that sits on top of RDF. SHACL does not have to deal with 
named graphs, neither does RDFa 1.1, both standardised on top of RDF 
1.1. However, it seems weird to use the concept of RDF triples, and yet 
only deal with a subset of the possible RDF triples (at least, it should 
be mentioned if not all triples are managed by a new spec on top). 
"plain literal" is one glitch, but it hardly makes a difference.

But you are right: it's not that much of a big deal saying that SHACL, 
OWL, RDFa, etc. do not deal with embedded triples.

> My point is: the situation today is already messy, but we deal with it. 
> We can't evolve the whole semantic web stack in one go. We have to start 
> somewhere.

Indeed! I am precisely saying we can't evolve as big a chunk of the 
SemWeb stack as the charter is telling us to do! We could propose a 
gentler transition, as written just below.

>> I'd rather introduce RDF (1.1), SPARQL, OWL, SHACL, then introduce the 
>> extension, RDF-star and SPARQL-star, if I want to get deeper in the 
>> data management part.
>>
>> Here is what I would like to see:
>>  - RDF-star, a data exchange model for RDF data management. It's not 
>> replacing RDF, it is complementary to it.
>>  - SPARQL-star, a query language for RDF-star and RDF-star datasets.
>>  - As far as the semantics of RDF-star is concerned, make it 
>> critically minimal. Just interpret embedded triples as arbitrary 
>> resources.(*) If someone wants to do more reasoning, they can just 
>> invent a semantic extension.
>> This way, just like RDF datasets define a data model on top of RDF 
>> without replacing it, with SPARQL a query language for this model, we 
>> would have RDF-star as a data model on top of RDF with a query 
>> language for it, mainly for data management purposes, but which does 
>> not preclude other usages, just like RDF datasets can be used for many 
>> things beyond partitioning graphs. Then there would be no discrepancy 
>> between SHACL and RDF, OWL and RDF, RDFa and RDF, and no need to 
>> define SPARQL-star entailment regimes. SPARQL Service description 
>> could be updated to be able to mention that a system supports 
>> SPARQL-star.
>>
>>
>> Now, I am conscious that this does not help at all going forward in 
>> the direction of the charter, but I needed to say it, and I'd better 
>> say it early.
> 
> On the contrary, I think your concerns, about making the transition 
> painless for existing systems, are very valid and must be taken into 
> account.

Thank you.


--AZ

> 
>    pa
> 
>>
>>
>> (*) RDF-star basic semantics would be defined on any RDF-entailment 
>> regime by adding a mapping IT from embedded triples to the set of 
>> resources IR. Under this basic semantics, embedded triples simply act 
>> as distinct names, as if they were IRIs. This does not preclude 
>> extensions where the internal structure of the embedded triples makes 
>> a difference.
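
To make the footnote above a bit more concrete, here is a rough sketch of 
what such a basic semantics could look like. IR, IP, IEXT, IS and IL are 
the usual components of an interpretation in RDF 1.1 Semantics, and IT is 
the mapping named in the footnote. The symbol T (the set of embedded 
triples) and the exact formulation are only assumptions made for 
illustration, not text from the CG report or from any draft.

```latex
% Assumed notation: T is the set of embedded triples (hypothetical symbol).
% An interpretation I is extended with one extra mapping:
\[
  IT : T \to IR
\]
% Denotation of terms:
\[
  I(t) =
  \begin{cases}
    IS(t) & \text{if } t \text{ is an IRI} \\
    IL(t) & \text{if } t \text{ is a literal} \\
    IT(t) & \text{if } t \in T
  \end{cases}
\]
% Truth of a ground triple is unchanged:
\[
  I(s\ p\ o) = \mathrm{true}
  \iff I(p) \in IP \ \text{and}\ \langle I(s), I(o) \rangle \in IEXT(I(p))
\]
% No condition relates IT(<< s p o >>) to I(s), I(p) or I(o).
% In particular, I(a) = I(b) does not force
% IT(<< a p o >>) = IT(<< b p o >>).
```

Since IT is unconstrained, an embedded triple behaves like an opaque 
name: nothing follows from its internal structure unless a semantic 
extension adds conditions on IT. How blank nodes inside embedded triples 
would be handled is left open in this sketch.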

-- 
Antoine Zimmermann
ISI - Institut Henri Fayol
École des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
https://www.emse.fr/~zimmermann/
Received on Tuesday, 31 January 2023 10:38:21 UTC