SHACL-based data extraction from a knowledge graph

Hello!

I am facing the following situation:

   - A large knowledge graph containing a large number of triples
   - A need to export multiple RDF datasets from this large knowledge
   graph, each containing a subset of the triples from the graph
   - Datasets are not limited to a flat list of entities with their
   properties; each one will contain a small piece of graph
   - The exact content of each dataset is specified in SHACL, using
   standard constraints: cardinalities (sh:minCount, sh:maxCount), sh:node,
   sh:datatype, sh:languageIn, sh:hasValue, etc. This SHACL will also be
   used as the source for documenting the exact content of each dataset,
   using [1]

And now the question: can we automate the extraction of data from the
large knowledge graph based on the SHACL definition of our datasets?
What we are looking for is a guarantee that the extraction process will
produce a dataset that conforms to its SHACL definition.

Has anyone done something similar? A naive approach would be to generate
a SPARQL query from the SHACL definition of the dataset, but I suspect
the query will quickly become too complicated.
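
To make the naive approach concrete, here is a minimal sketch in Python of
what such a query generation could look like. It is not an existing tool:
the NodeShape and PropertyShape classes and the shape_to_construct function
are hypothetical names, the shapes are hand-written in memory rather than
parsed from a SHACL file, and the schema.org IRIs are only illustrative.
Only simple predicate paths, sh:minCount, sh:datatype, sh:languageIn,
sh:hasValue and sh:node are covered.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional, Tuple


@dataclass
class PropertyShape:
    """Hypothetical, simplified stand-in for a SHACL property shape."""
    path: str                            # sh:path (single predicate IRI only)
    min_count: int = 0                   # sh:minCount; 0 means optional
    datatype: Optional[str] = None       # sh:datatype IRI
    language_in: Tuple[str, ...] = ()    # sh:languageIn tags
    has_value: Optional[str] = None      # sh:hasValue, as a serialized SPARQL term
    node: Optional["NodeShape"] = None   # sh:node (nested shape)


@dataclass
class NodeShape:
    """Hypothetical, simplified stand-in for a SHACL node shape."""
    target_class: Optional[str] = None   # sh:targetClass IRI
    properties: List[PropertyShape] = field(default_factory=list)


def _where_lines(shape, subject, counter, template):
    """Build WHERE patterns for a shape and collect CONSTRUCT triples."""
    lines = []
    if shape.target_class:
        triple = f"{subject} a <{shape.target_class}> ."
        template.append(triple)
        lines.append(triple)
    for prop in shape.properties:
        filters = []
        if prop.has_value is not None:
            # Simplification: export only the required value itself.
            obj = prop.has_value
        else:
            obj = f"?v{next(counter)}"
            if prop.datatype:
                filters.append(f"FILTER(datatype({obj}) = <{prop.datatype}>)")
            if prop.language_in:
                # Exact tag match, stricter than SHACL's language-range matching.
                tags = ", ".join(f'"{tag}"' for tag in prop.language_in)
                filters.append(f"FILTER(lang({obj}) IN ({tags}))")
        triple = f"{subject} <{prop.path}> {obj} ."
        template.append(triple)
        block = [triple] + filters
        if prop.node is not None:
            block.extend(_where_lines(prop.node, obj, counter, template))
        if prop.min_count == 0:
            # Optional properties must not prevent the focus node from matching.
            block = ["OPTIONAL {"] + ["  " + line for line in block] + ["}"]
        lines.extend(block)
    return lines


def shape_to_construct(shape, focus="?this"):
    """Return a SPARQL CONSTRUCT query string for the given shape."""
    template = []
    where = _where_lines(shape, focus, count(1), template)

    def body(lines):
        return "\n".join("  " + line for line in lines)

    return f"CONSTRUCT {{\n{body(template)}\n}}\nWHERE {{\n{body(where)}\n}}"


# Illustrative shapes: a person with a required name and an optional address.
address = NodeShape(properties=[
    PropertyShape("http://schema.org/addressLocality", min_count=1,
                  datatype="http://www.w3.org/2001/XMLSchema#string"),
])
person = NodeShape(
    target_class="http://schema.org/Person",
    properties=[
        PropertyShape("http://schema.org/name", min_count=1,
                      language_in=("en", "fr")),
        PropertyShape("http://schema.org/address", node=address),
    ],
)
query = shape_to_construct(person)
print(query)
```

Even on this small example the limits show up: sh:maxCount is not enforced
(the CONSTRUCT exports every matching value), and each level of sh:node
nesting adds another layer of patterns. A query of this kind would therefore
presumably still need a SHACL validation pass on the extracted dataset
before conformance can be claimed.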

Thanks!
Thomas

[1] SHACL Play documentation generator:
https://shacl-play.sparna.fr/play/doc


-- 

Thomas Francart - SPARNA
Web of data | Information architecture | Access to knowledge
blog: blog.sparna.fr, site: sparna.fr, linkedin:
fr.linkedin.com/in/thomasfrancart
tel: +33 (0)6.71.11.25.97, skype: francartthomas

Received on Tuesday, 8 March 2022 17:25:39 UTC