[data-shapes] Packaging rules, shapes, and data with a defined execution order (#643) from liviorobaldo via GitHub on 2025-11-07 (public-shacl@w3.org from November 2025)

From: liviorobaldo via GitHub <noreply@w3.org>
Date: Fri, 07 Nov 2025 14:42:54 +0000
To: public-shacl@w3.org
Message-ID: <issues.opened-3600662305-1762526572-noreply@w3.org>

liviorobaldo has just created a new issue for https://github.com/w3c/data-shapes:

== Packaging rules, shapes, and data with a defined execution order ==
In the SHACL Inference Rules Task Force, we are defining the table of contents for the SHACL 1.2 Rules draft.

See [discussion 637](https://github.com/w3c/data-shapes/discussions/637). The provisional table of contents includes "7. Attaching Rules to Shapes", where we should explain how rules and shapes work together. Further down in that discussion, I propose moving this content to [2. Packaging SHACL](https://w3c.github.io/data-shapes/shacl12-profiling/#packaging). Perhaps "7. Attaching Rules to Shapes" could then just include a brief pointer to that section.

Concerning how rules and shapes should be executed together: in [discussion 603](https://github.com/w3c/data-shapes/discussions/603#discussioncomment-14727042), based on my experience with SHACL-SPARQL, I proposed associating _sets_ of shapes with _sets_ of rules. This can be done with a new kind of individual, provisionally called "cluster" (we should pick a better name, such as "bundle", "package", or "module"), which would group data with shapes, with rules, or with both (three options in total):

```
:cluster-3
    rdf:type srl:Cluster ;
    srl:data (
        ...
    ) ;
    srl:ruleSet (
        ...
    ) ;
    srl:shapeSet (
        ...
    ) .
```
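To make the three options concrete, here is a minimal Python sketch of how a library might mirror such a cluster in memory. Everything in it is an assumption made for illustration: the `Cluster` class, its field names, the `mode` labels, and the use of plain strings as stand-ins for graph, rule, and shape identifiers are hypothetical and not part of any SHACL draft.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Hypothetical in-memory mirror of the srl:Cluster sketch above."""

    data: List[str]                                       # stand-ins for data graph identifiers
    rule_set: List[str] = field(default_factory=list)     # stand-ins for rule identifiers
    shape_set: List[str] = field(default_factory=list)    # stand-ins for shape identifiers

    def __post_init__(self) -> None:
        # The issue lists three options: data with shapes, with rules, or with both.
        if not self.rule_set and not self.shape_set:
            raise ValueError("a cluster needs rules, shapes, or both")

    @property
    def mode(self) -> str:
        """Report which of the three options this cluster uses."""
        if self.rule_set and self.shape_set:
            return "rules-and-shapes"
        return "rules-only" if self.rule_set else "shapes-only"
```

Only the "rules-and-shapes" case raises the execution-order question discussed next.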

If both shapes and rules are present, we need to decide the execution order: should the shapes run first, or the rules? As I argued in my recent paper (link provided at the beginning of the discussion), shapes should be executed at least once _after_ the rules. This is because what we want to validate is the _inferred knowledge graph_, consisting of the initially asserted triples plus all triples that can be logically inferred from them. This aligns with the reasonable assumption that if something logically inferred is invalid, then the originally asserted triples must also be considered invalid. Rules then serve to "modularize the effort" when it is too complex, or even impossible, to write a single shape that internally computes all inferred triples before validation.

However, nothing prevents executing the shapes multiple times. For example, if the shapes already invalidate the initially asserted triples, there is no need to compute the inferred knowledge graph. Shapes could also be re-executed after each "round" of rule execution (rules are iteratively re-executed until no new triples are inferred), although this approach could be computationally expensive.

In light of this, perhaps the best approach is for the SHACL 1.2 recommendation to eventually specify that executing the shapes is _mandatory_ only once the rules have been executed to saturation, while libraries remain free to _optionally_ execute the shapes at the beginning or after each individual round. They could even let the programmer make this choice via special parameters in the relevant functions.
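The proposed ordering can be sketched in Python as follows. This is only an illustration under stated assumptions: rules and shapes are modelled as plain functions over a set of triples, and `run_cluster`, `validate_first`, and `validate_each_round` are hypothetical names, not an existing library API. The early exits follow the suggestion above and assume that a violation found early would still hold in the saturated graph.

```python
from typing import Callable, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]
Rule = Callable[[Set[Triple]], Iterable[Triple]]   # returns inferred triples
Shape = Callable[[Set[Triple]], List[str]]         # returns violation messages


def validate(graph: Set[Triple], shapes: List[Shape]) -> List[str]:
    """Run every shape against the graph and collect the violations."""
    return [msg for shape in shapes for msg in shape(graph)]


def run_cluster(data, rules, shapes, validate_first=False, validate_each_round=False):
    """Apply rules to saturation, then validate (the mandatory step)."""
    graph = set(data)
    if validate_first:                      # optional: check asserted triples only
        violations = validate(graph, shapes)
        if violations:
            return graph, violations        # assumption: no need to infer further
    while True:
        new = {t for rule in rules for t in rule(graph)} - graph
        if not new:                         # saturation: no new triples inferred
            break
        graph |= new
        if validate_each_round:             # optional and potentially expensive
            violations = validate(graph, shapes)
            if violations:
                return graph, violations
    return graph, validate(graph, shapes)   # mandatory validation at the end


# Toy example (hypothetical vocabulary): ancestry inferred from parenthood.
def parent_is_ancestor(graph):
    return {(s, "ancestor", o) for s, p, o in graph if p == "parent"}


def ancestor_is_transitive(graph):
    anc = [(s, o) for s, p, o in graph if p == "ancestor"]
    return {(x, "ancestor", z) for x, y in anc for y2, z in anc if y == y2}


def no_self_ancestor(graph):
    return sorted(f"{s} is its own ancestor"
                  for s, p, o in graph if p == "ancestor" and s == o)


DATA = {("a", "parent", "b"), ("b", "parent", "a")}
RULES = [parent_is_ancestor, ancestor_is_transitive]
SHAPES = [no_self_ancestor]

graph, violations = run_cluster(DATA, RULES, SHAPES)
```

In this toy example the asserted triples alone pass validation, and the violations only appear once the rules have run, which is the situation that motivates validating the inferred knowledge graph.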

PS. I see that in [2. Packaging SHACL](https://w3c.github.io/data-shapes/shacl12-profiling/#packaging), specifically [2.1 Motivation](https://w3c.github.io/data-shapes/shacl12-profiling/#packaging-motivation), there are already three issues listed. Can someone add this one? I’m not sure how to do it myself (and perhaps I _cannot_, as I don’t have the necessary permissions).

Please view or discuss this issue at https://github.com/w3c/data-shapes/issues/643 using your GitHub account

--
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Friday, 7 November 2025 14:42:55 UTC