- From: Vladimir Alexiev via GitHub <sysbot+gh@w3.org>
- Date: Wed, 12 Mar 2025 07:11:48 +0000
- To: public-shacl@w3.org
VladimirAlexiev has just created a new issue for https://github.com/w3c/data-shapes:
== consider computational complexity ==
It would be useful to have an estimate of the implementation and execution complexity of each new feature (or, more realistically, of each bundle of features, i.e. each Profile).
Consider this scenario:
- a database of 1B, 10B or 100B triples (data at rest), which are assumed valid (e.g. some parts have been validated, others come from a trusted source)
- a transaction of 1M, 10M or 100M triples (data in motion)
- a shapes graph of 1k or 10k shapes. Shapes refer both to data in motion and to data at rest.
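To get a feel for the scale of this scenario, here is a rough back-of-envelope sketch. The bytes-per-triple figure and the `estimate` helper are illustrative assumptions rather than measurements, and "shape-triple pairs" is only a crude upper bound on work (as if every shape had to look at every triple).

```python
# Back-of-envelope sizing for the largest variant of the scenario.
# ASSUMPTION: 50 bytes per in-memory triple is a made-up illustrative figure.
BYTES_PER_TRIPLE = 50


def estimate(rest_triples, motion_triples, shapes):
    """Return crude size and work figures for one variant of the scenario."""
    total = rest_triples + motion_triples
    return {
        # Memory needed if everything is loaded into one in-memory model.
        "memory_gb_if_all_loaded": total * BYTES_PER_TRIPLE / 1e9,
        # Crude upper bound: every shape inspects every triple.
        "naive_shape_triple_pairs": shapes * total,
        # Same bound if only the transaction had to be inspected.
        "motion_only_shape_triple_pairs": shapes * motion_triples,
    }


# 100B triples at rest, 100M triples in motion, 10k shapes.
largest = estimate(100_000_000_000, 100_000_000, 10_000)
for name, value in largest.items():
    print(f"{name}: {value:,.0f}")
```

Even under these generous assumptions, the difference between touching all the data and touching only the transaction is three orders of magnitude, which is why it matters whether a shape really needs the data at rest.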
How do you validate this scenario in a reasonable time?
That's a difficult question.
- Many implementations use in-memory models ("give me data files, give me shape files, I load them in memory and output a validation report").
  - I think all JS implementations are like this, which is not efficient in the large-data scenario described above (@bergos, I'd love it if you prove me wrong!)
- SPARQL opens up a "door" towards "unknown" complexity. At least I'm not aware of work on assessing the practical complexity of SPARQL queries
  - If you need to run, say, 1M queries coming from `SPARQLConstraint` (defined in SHACL-SPARQL, part of the SHACL 1.0 Recommendation), that will ruin all efficiency
  - So much so that a good piece of advice is to put the complex query in `SPARQLTarget` (defined in SHACL Advanced Features) and keep the `SPARQLConstraint` simple: https://github.com/Sveino/Inst4CIM-KG/tree/develop/shacl-improved#use-complex-sparqltarget-but-simple-sparqlconstraint
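The effect of that advice can be sketched with a toy cost model. It assumes an engine that runs the target query once and then executes the constraint query once per focus node; whether a real engine behaves this way is implementation-dependent. The node data, the `validate` helper and the lambdas standing in for SPARQL queries are all hypothetical.

```python
def validate(nodes, target, constraint):
    """Pick focus nodes with `target`, then check `constraint` per focus node.

    Returns (ids of violating nodes, number of simulated query executions).
    ASSUMPTION: one execution for the target, one per focus node for the constraint.
    """
    focus_nodes = [n for n in nodes if target(n)]
    executions = 1  # the single target query
    violations = []
    for node in focus_nodes:
        executions += 1  # one constraint query for this focus node
        if constraint(node):
            violations.append(node["id"])
    return violations, executions


# Hypothetical data: 1000 nodes, 10 of them "active", 4 of those lacking a weight.
nodes = [
    {"id": i, "active": i % 100 == 0, "weight": None if i % 300 == 0 else 1.0}
    for i in range(1000)
]

# Broad target, complex constraint: every node becomes a focus node.
broad = validate(
    nodes,
    target=lambda n: True,
    constraint=lambda n: n["active"] and n["weight"] is None,
)

# Complex (selective) target, simple constraint: only active nodes are checked.
selective = validate(
    nodes,
    target=lambda n: n["active"],
    constraint=lambda n: n["weight"] is None,
)

print("broad target:    ", broad)
print("selective target:", selective)
```

Both variants report the same violations, but in this model the selective target needs 11 executions instead of 1001, because the expensive filtering happens once instead of once per node.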
Relevant issues:
- https://github.com/w3c/data-shapes/issues/242, which gives an example of a constant-time shape and a linear-time shape.
  But I didn't define it just for its low complexity: it would also be useful in a SHACL profile for Modeling.
- https://github.com/w3c/data-shapes/issues/216 that asks for SHACL Profiles. Now I'll add "based on the complexity of features"
- https://github.com/w3c/data-shapes/issues/235 that removed SPARQL from Core
- https://github.com/w3c/data-shapes/issues/312 that asks for structuring the spec along profiles (and https://github.com/w3c/data-shapes/issues/312#issuecomment-2716744035 that comments on complexity)
Please view or discuss this issue at https://github.com/w3c/data-shapes/issues/321 using your GitHub account
Received on Wednesday, 12 March 2025 07:11:48 UTC