Re: [dxwg] referencing named graph of endpoint or RDF quad file (#1241) from mathib via GitHub on 2021-03-11 (public-dxwg-wg@w3.org from March 2021)

From: mathib via GitHub <sysbot+gh@w3.org>
Date: Thu, 11 Mar 2021 16:06:53 +0000
To: public-dxwg-wg@w3.org
Message-ID: <issue_comment.created-796847927-1615478812-sysbot+gh@w3.org>

Hi @andrea-perego !

I have not been following the DCAT progress too closely lately. Is there now a modeling approach for referencing named graphs in quadstores and RDF quad files? If this is not yet done but still wanted, I'll share below the final modeling approach I took in my research, using a self-made extension of DCAT, i.e. [CDC](https://w3id.org/cdc#). The modeling would be done as follows for default graph and specific named graph distributions (either served via an RDF quad file or a quadstore service):

```
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix cdc: <https://w3id.org/cdc#> .
# Hypothetical namespace for the example resources.
@prefix : <https://mydomain.org/data#> .

:dataset1 a dcat:Dataset ;
    dcat:distribution :dataset1_quadStoreDistr , :dataset1_quadFileDistr .

# The content of :dataset1 is available in the default graph of this SPARQL endpoint service.
:dataset1_quadStoreDistr a cdc:DefaultGraphDistribution ;
    dcat:accessService :mysparqlEndpoint .

# The content of :dataset1 is available in the named graph :myNamedGraph of the TriG file.
:dataset1_quadFileDistr a cdc:NamedGraphDistribution ;
    cdc:graphURI :myNamedGraph ;
    dcat:downloadURL <https://mydomain.org/file1.trig> ;
    dcat:mediaType <https://www.iana.org/assignments/media-types/application/trig> .

:mysparqlEndpoint a dcat:DataService ;
    dcat:endpointURL <http://mydomain.org/sparql> ;
    dct:conformsTo <https://www.w3.org/TR/sparql11-query> ;
    dcat:servesDataset :dataset1 .
```

`cdc:DefaultGraphDistribution` and `cdc:NamedGraphDistribution` are subclasses of `dcat:Distribution`. If no such specific class is used on a distribution and the RDF file or data service supports quads, the content of the dataset should be assumed to be located in all named graphs and the default graph.
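
To make the fallback rule concrete for the SPARQL endpoint case, here is a rough Python sketch of how a client might turn a distribution description into a query pattern. The `Distribution` class and the `where_clause` function are hypothetical helpers invented for this illustration and are not part of CDC or DCAT. The sketch also assumes that a `cdc:NamedGraphDistribution` always carries a `cdc:graphURI`.

```python
from dataclasses import dataclass
from typing import Optional

CDC = "https://w3id.org/cdc#"
DEFAULT_GRAPH_DISTRIBUTION = CDC + "DefaultGraphDistribution"
NAMED_GRAPH_DISTRIBUTION = CDC + "NamedGraphDistribution"


@dataclass
class Distribution:
    """Hypothetical in-memory view of a dcat:Distribution."""
    rdf_types: frozenset               # IRIs of the rdf:type values
    graph_uri: Optional[str] = None    # value of cdc:graphURI, if any


def where_clause(dist: Distribution) -> str:
    """Return a SPARQL WHERE clause covering the graphs that hold the dataset."""
    if NAMED_GRAPH_DISTRIBUTION in dist.rdf_types:
        # Assumption: a named graph distribution must state its graph.
        if dist.graph_uri is None:
            raise ValueError("cdc:NamedGraphDistribution without cdc:graphURI")
        return f"WHERE {{ GRAPH <{dist.graph_uri}> {{ ?s ?p ?o }} }}"
    if DEFAULT_GRAPH_DISTRIBUTION in dist.rdf_types:
        return "WHERE { ?s ?p ?o }"
    # No CDC class given: default graph plus every named graph.
    return "WHERE { { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } } }"
```

Note that some endpoints are configured to expose the union of all named graphs as their default graph, in which case the default graph pattern alone would already return more than the default graph's own triples.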

Thus, instead of creating different datasets, one `dcat:Dataset` with optionally different distributions suffices, as the content of the reflected RDF dataset remains the same. Each distribution then indicates whether the content of the dataset is spread over the entire triplestore/quadstore or RDF triple/quad file, or whether it is located in a specific named graph or the default graph of a quadstore or RDF quad file. I refrained from reusing SD (SPARQL Description) terminology as I wanted a solution that is applicable beyond SPARQL endpoint services, e.g. for RDF quad files and quad pattern fragment servers (triple pattern fragment servers with quad support, such as [this](https://github.com/LinkedDataFragments/Server.js/) one). The SD modeling patterns are also relatively awkward to use in combination with the concept of `dcat:Distribution`.

Personally, I would be in favor of seeing the concepts `cdc:DefaultGraphDistribution`, `cdc:NamedGraphDistribution` and `cdc:graphURI` included in DCAT, as I believe they can be useful for others who want to indicate the named/default graphs used in distributions as well!

--
GitHub Notification of comment by mathib
Please view or discuss this issue at https://github.com/w3c/dxwg/issues/1241#issuecomment-796847927 using your GitHub account

Received on Thursday, 11 March 2021 16:06:59 UTC