[dxwg] SHACL and DCAT profiling (#1387)

bertvannuffelen has just created a new issue for https://github.com/w3c/dxwg:

== SHACL and DCAT profiling ==
Hi,

I have observed that explicitly declaring Catalog a subclass of Dataset, as DCAT currently does, hinders the use of SHACL and also DCAT profiling.

The situation is as follows: many DCAT profiles place additional constraints on datasets. These constraints apply solely to the datasets in a catalog, not to the catalog entity itself. However, because Catalog is enforced to be a subclass of Dataset and this is expressed in the RDF representation, natural SHACL profiling breaks.

Consider the following example profile: 
   1. A catalog must have a title and a dataset.
   2. A dataset must have <http://publications.europa.eu/resource/authority/access-right/PUBLIC> as the value of dc:accessRights.
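
As an illustration, the two rules above could be written as SHACL shapes roughly as follows. This is a minimal sketch, not the content of the shapes file linked further down: the `ex:` shape names are hypothetical, and the `dc:` prefix is assumed to stand for the DCMI Terms namespace.

```
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
# Assumption: dc: is bound to DCMI Terms, where accessRights is defined.
@prefix dc:   <http://purl.org/dc/terms/> .
# Hypothetical namespace for the shapes of this example profile.
@prefix ex:   <http://example.com/shapes#> .

# Rule 1: a catalog must have a title and a dataset.
ex:CatalogShape a sh:NodeShape ;
    sh:targetClass dcat:Catalog ;
    sh:property [ sh:path dc:title ; sh:minCount 1 ] ;
    sh:property [ sh:path dcat:dataset ; sh:minCount 1 ] .

# Rule 2: a dataset must have PUBLIC as a value of dc:accessRights.
ex:DatasetShape a sh:NodeShape ;
    sh:targetClass dcat:Dataset ;
    sh:property [
        sh:path dc:accessRights ;
        sh:hasValue <http://publications.europa.eu/resource/authority/access-right/PUBLIC>
    ] .
```

Note that each rule is attached to its class with `sh:targetClass`, which is the natural way to write such a profile.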

The expectation from this profile is that the following example catalog is valid:
```
<http://example.com/catalog> a dcat:Catalog.
<http://example.com/catalog> dc:title "Example catalog"@en.
<http://example.com/catalog> dcat:dataset <http://example.com/dataset>.

<http://example.com/dataset> a dcat:Dataset.
<http://example.com/dataset> dc:accessRights <http://publications.europa.eu/resource/authority/access-right/PUBLIC>.
``` 

Unfortunately it isn't. According to SHACL the above RDF is invalid: because Catalog is a subclass of Dataset, the dataset constraint is also applied to `ex:catalog`, which does not have the value PUBLIC for `dc:accessRights`.
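
The mechanism, as far as I understand the SHACL specification, is that `sh:targetClass` selects every SHACL instance of a class, and that includes nodes typed with a subclass whenever the `rdfs:subClassOf` triple is present in the graph being validated. A sketch of the relevant triples, with the consequence written as comments:

```
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .

# Declared in the DCAT vocabulary:
dcat:Catalog rdfs:subClassOf dcat:Dataset .

# Stated in the example data:
<http://example.com/catalog> a dcat:Catalog .

# Consequence: the catalog counts as a SHACL instance of dcat:Dataset,
# so any shape with sh:targetClass dcat:Dataset is also applied to it,
# including the dc:accessRights constraint, which it does not satisfy.
```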

This means that every DCAT profile has to be redesigned to introduce a notion of Dataset that does not have Catalog as a subclass, in order to avoid propagating every constraint on a dataset to the catalog.

To test the case yourself, use https://www.itb.ec.europa.eu/shacl/any/upload with the following content:

- Content to validate:  https://gist.github.com/bertvannuffelen/33849112b3aa66ce558f77a119ec81a4/raw/c47a27af0bb7db49ecb6ac62c814d5b576cc5931/dcat-profile-ex.ttl
- External shapes: https://gist.github.com/bertvannuffelen/80422851bf44801f8493ab553b625374/raw/365f40d627e7da2ac7d4154022188597d61aa346/dcat-profile.ttl
- External shapes: https://github.com/SEMICeu/DCAT-AP/raw/2.1.0-draft/releases/2.1.0/dcat-ap_2.1.0_shacl_imports.ttl
- Load imports defined in the input?: check this option.

The last two bullets are important: together they load the [DCAT RDF file](https://www.w3.org/ns/dcat2.ttl) into the SHACL validator, and only then is the error triggered. If the DCAT RDF file is not loaded, validation passes.


For a vocabulary that is meant to be profiled, this is a very annoying situation, because any profile will add additional constraints on Datasets. In the current setting I also do not know how to express in a profile that the constraints on datasets do not apply to the catalog entity, without being forced to introduce a subclass MyProfileDataset that is not a Catalog. Any suggestions?

Please view or discuss this issue at https://github.com/w3c/dxwg/issues/1387 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Thursday, 24 June 2021 10:14:56 UTC