
Re: shapes-ISSUE-130 (rdf dataset assumption): SHACL should not assume that the data graph is in an RDF dataset [SHACL Spec]

From: Holger Knublauch <holger@topquadrant.com>
Date: Fri, 18 Mar 2016 09:41:52 +1000
To: Tom Johnson <johnson.tom@gmail.com>
Cc: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <56EB40C0.6000701@topquadrant.com>
On 18/03/2016 9:23, Tom Johnson wrote:
> This brings us back to a week ago up-thread: 
> https://lists.w3.org/Archives/Public/public-data-shapes-wg/2016Mar/0082.html 
>
>
> Per this comment, I don't think this is a purely conceptual problem. 
> If a dataset is a fundamental concept that can't be ignored in 
> defining SHACL, no problem. I'm somewhat skeptical of this, but I 
> agree with all your points that this isn't a hard implementation, it's 
> easily supported in all major existing RDF environments, etc...
>
> The problem I see exists closer to the SHACL language. When authoring 
> a shapes graph, how can I ensure that I will be able to run it against 
> the data graphs of my choice?

I don't see the connection. Shapes graphs are typically authored 
independently of any specific data graph. The SHACL engine is invoked 
with certain parameters. These would be:
- the dataset
- the IRI of the data graph (or empty, for the default graph in the dataset)
- the shapes graph (hopefully in the same dataset)
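
To make the three parameters concrete, here is a minimal sketch in Python. It assumes a dataset can be modelled as a plain dict from graph name to a set of triples, with None standing for the default graph. The names ValidationInput, wrap_graph and select_graphs are hypothetical and do not come from the SHACL draft, and no validation is actually performed.

```python
from typing import Dict, NamedTuple, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]
# Assumed model: graph name -> graph; the key None is the default graph.
Dataset = Dict[Optional[str], Graph]


class ValidationInput(NamedTuple):
    """The two graphs an engine would end up working with."""
    data_graph: Graph
    shapes_graph: Graph


def wrap_graph(data: Graph, shapes: Graph, shapes_iri: str) -> Dataset:
    """Build a transient dataset holding only the data graph (as the
    default graph) and the shapes graph (as a named graph)."""
    return {None: data, shapes_iri: shapes}


def select_graphs(dataset: Dataset,
                  data_graph_iri: Optional[str],
                  shapes_graph_iri: str) -> ValidationInput:
    """Resolve the three engine parameters to the graphs to use.
    Passing data_graph_iri=None selects the default graph."""
    if data_graph_iri not in dataset:
        raise KeyError(f"no data graph named {data_graph_iri!r}")
    if shapes_graph_iri not in dataset:
        raise KeyError(f"no shapes graph named {shapes_graph_iri!r}")
    return ValidationInput(dataset[data_graph_iri], dataset[shapes_graph_iri])
```

A stand-alone graph (say, one fetched via HTTP GET or held in memory) would first go through wrap_graph and then be selected with an empty data graph IRI, so the dataset need only exist for the duration of the call.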

How these parameters are collected before the engine is invoked is 
outside of the spec. Some applications may look at sh:shapesGraph 
triples in the data graph, others may have the shapes graph hard-coded.
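
As a rough illustration of the two strategies just mentioned, the Python sketch below looks for sh:shapesGraph triples whose subject is the graph resource and otherwise falls back to a hard-coded shapes graph. The function name find_shapes_graphs, the graph_resource parameter and the tuple-based triple model are assumptions made for this example only; the spec does not define them.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
SH_SHAPES_GRAPH = "http://www.w3.org/ns/shacl#shapesGraph"


def find_shapes_graphs(data_graph: Set[Triple],
                       graph_resource: str,
                       hard_coded: Optional[str] = None) -> List[str]:
    """Return the shapes graph IRIs an application might hand to the engine.

    graph_resource is the IRI used as the subject of sh:shapesGraph
    triples; for a default graph it has to be supplied by the caller.
    """
    linked = sorted(o for s, p, o in data_graph
                    if s == graph_resource and p == SH_SHAPES_GRAPH)
    if linked:
        return linked
    # No link in the data graph: use the application's own choice, if any.
    return [hard_coded] if hard_coded is not None else []
```

Either way the result is just a parameter collected before the engine is invoked, which is why this step can stay outside the spec.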

Holger



>
> - Tom
>
> On Thu, Mar 17, 2016 at 4:16 PM, Holger Knublauch 
> <holger@topquadrant.com <mailto:holger@topquadrant.com>> wrote:
>
>     A dataset is a conceptual entity that can be implemented in many
>     ways. One implementation may retrieve certain graphs on-the-fly
>     via an LDP server. That's an implementation detail to me. But
>     SPARQL requires datasets, so how can we pretend they don't exist?
>
>     Holger
>
>
>
>     On 18/03/2016 9:06, Tom Johnson wrote:
>>     Datasets seem fine to me, as an implementation detail, but my
>>     main context for using SHACL would be to validate graphs from one
>>     of two contexts:
>>
>>       - Linked Data Platform RDFSources; i.e. I have an existing
>>     shapes graph, and I want to validate a number of graphs grabbed
>>     via HTTP GET.
>>       - Existing in-memory graphs. (in this case, serializing to a
>>     file is always an option).
>>
>>     In either of these cases specifying the Dataset construct at the
>>     SHACL level seems unwieldy. When writing a shapes graph, I need
>>     to know how the specific engine will handle my
>>     URI/File/native-graph input.
>>
>>     I don't have a specific proposal to improve things, but the idea
>>     that SHACL's behavior is undefined except when my shapes graph
>>     and data graph already exist within the same SPARQL Dataset is
>>     concerning.
>>
>>     - Tom
>>
>>     On Thu, Mar 17, 2016 at 3:49 PM, Holger Knublauch
>>     <holger@topquadrant.com <mailto:holger@topquadrant.com>> wrote:
>>
>>         On 18/03/2016 1:49, Dimitris Kontokostas wrote:
>>>
>>>
>>>         On Thu, Mar 17, 2016 at 3:20 PM, Peter F. Patel-Schneider
>>>         <pfpschneider@gmail.com <mailto:pfpschneider@gmail.com>> wrote:
>>>
>>>             The diff seems to indicate that functions still work
>>>             only over datasets,
>>>             although there is a TODO indicated. 
>>>
>>>
>>>         I was waiting for Holger to wake up and clarify this but you
>>>         are right
>>
>>         SPARQL in general relies on datasets, e.g. in the GRAPH
>>         keyword. As soon as we talk about arbitrary SPARQL queries
>>         (e.g. in sh:sparql constraints or functions), datasets need
>>         to be in the picture.
>>
>>         I maintain my position that we would be making our lives
>>         easier if we were simply talking about datasets,
>>         acknowledging that these datasets may only exist for the
>>         duration of a SHACL execution and contain the data graph (and
>>         the shapes graph) only. There is nothing conceptually
>>         difficult here, nor is it difficult to implement.
>>
>>         Holger
>>
>>
>>
>>>             Two "or dataset"s were also removed -
>>>             making the dataset optional is probably benign.
>>>
>>>             As far as $shapesGraph goes, wording along the lines of
>>>             "can be used to access
>>>             the shapes graph" would seem to work, but would need
>>>             some explanation along
>>>             the lines of "If the shapes graph is a named graph in
>>>             the same dataset as the
>>>             data graph then it can be accessed using its name in the
>>>             dataset.  Otherwise a
>>>             SHACL engine would need to provide an alternative way to
>>>             access the shapes graph."
>>>
>>>
>>>         tried to fix all your comments in a second commit here
>>>         https://github.com/w3c/data-shapes/commit/a0a52a4472d807683abf7cf104079f35272f27df
>>>
>>>
>>>         this is a merge of both commits but also includes a few
>>>         other editorial fixes
>>>         https://github.com/w3c/data-shapes/compare/editorial-dk
>>>
>>>         apologies for sending commits but I am not yet comfortable
>>>         to commit to the main branch
>>>
>>>         Best,
>>>         Dimitris
>>>
>>>
>>>
>>>
>>>
>>>             peter
>>>
>>>
>>>             On 03/17/2016 02:04 AM, Dimitris Kontokostas wrote:
>>>             > I removed all mentions of dataset in the text that
>>>             imply that validation works
>>>             > on RDF datasets
>>>             >
>>>             https://github.com/w3c/data-shapes/commit/fa2d99fd61a473d14eef9cee57c0db1c61e03684
>>>             >
>>>             > One thing left to decide and close this issue is how
>>>             to refer to the
>>>             > "$shapesGraph" variable. Right now we have variations
>>>             of the following in the spec
>>>             > "$shapesGraph ... a named graph IRI that contains the
>>>             shapes graph"
>>>             >
>>>             > reading the sparql spec, named graphs imply an RDF
>>>             dataset:
>>>             > https://www.w3.org/TR/sparql11-query/#namedGraphs
>>>             >
>>>             > even if we remove the "named" and refer to them as
>>>             just graphs, SPARQL uses
>>>             > the "GRAPH" keyword in the "Querying the Dataset" section
>>>             > https://www.w3.org/TR/sparql11-query/#queryDataset
>>>             >
>>>             > so, do we still have an implicit assumption that
>>>             validation works on RDF datasets?
>>>             >
>>>             > Since we already have a resolution on $shapesGraph I
>>>             see the following options:
>>>             > a) we accept the edits as they are now in the spec and
>>>             close this issue
>>>             > b) we try to weaken further the connection and change
>>>             occurrences of "named
>>>             > graph" to "graph"
>>>             > c) since this is sparql-specific issue, we can add a
>>>             disclaimer in section 1.2
>>>             >
>>>             > regarding how a SHACL validation engine wraps graphs
>>>             to perform validation, it
>>>             > can be an implementation detail
>>>             >
>>>             > Dimitris
>>>             >
>>>             >
>>>             > On Wed, Mar 9, 2016 at 9:35 AM, Holger Knublauch
>>>             <holger@topquadrant.com <mailto:holger@topquadrant.com>
>>>             > <mailto:holger@topquadrant.com <mailto:holger@topquadrant.com>>>
>>>             wrote:
>>>             >
>>>             >     Yes, you may have a point there - in cases like
>>>             the default graph we need
>>>             >     to make sure that the system knows which subject
>>>             to look for the
>>>             >     sh:shapesGraph triples. This is probably just a
>>>             URI parameter.
>>>             >
>>>             >     (There are so many edit suggestions open right now
>>>             that I am looking
>>>             >     forward to sharing the workload with a second
>>>             editor, now that Arthur has
>>>             >     left; yes I have a day job too.)
>>>             >
>>>             >     Holger
>>>             >
>>>             >
>>>             >     On 8/03/2016 11:02, Tom Johnson wrote:
>>>             >>     > An RDF dataset is a purely conceptual entity.
>>>             Many APIs implement
>>>             >>     Dataset. Any Graph can be wrapped into a Dataset
>>>             for execution, even if
>>>             >>     that Dataset is just virtual and only has a
>>>             single graph in it.
>>>             >>
>>>             >>     Reading the quoted text, this doesn't seem to
>>>             hold. The "data graph"
>>>             >>     links to the "shapes graph" via a triple with its
>>>             graph name as the
>>>             >>     subject. Many graphs do not have such a name
>>>             (even those that are within
>>>             >>     Datasets; i.e. default graphs).
>>>             >>
>>>             >>     Does SHACL provide a mechanism for connecting
>>>             such a graph to a shapes
>>>             >>     graph? If not, how does wrapping the graph in a
>>>             dataset within the
>>>             >>     implementation help a SHACL user make this
>>>             connection?
>>>             >>
>>>             >>     - Tom
>>>             >>
>>>             >>     On Mon, Mar 7, 2016 at 3:00 PM, Holger Knublauch
>>>             <holger@topquadrant.com <mailto:holger@topquadrant.com>
>>>             >>  <mailto:holger@topquadrant.com
>>>             <mailto:holger@topquadrant.com>>> wrote:
>>>             >>
>>>             >>         An RDF dataset is a purely conceptual entity.
>>>             Many APIs implement
>>>             >>         Dataset. Any Graph can be wrapped into a
>>>             Dataset for execution, even
>>>             >>         if that Dataset is just virtual and only has
>>>             a single graph in it.
>>>             >>
>>>             >>         Holger
>>>             >>
>>>             >>
>>>             >>
>>>             >>         On 8/03/2016 2:45, RDF Data Shapes Working
>>>             Group Issue Tracker wrote:
>>>             >>
>>>             >>  shapes-ISSUE-130 (rdf dataset assumption): SHACL
>>>             should not
>>>             >>  assume that the data graph is in an RDF dataset
>>>             [SHACL Spec]
>>>             >>
>>>             >> http://www.w3.org/2014/data-shapes/track/issues/130
>>>             >>
>>>             >>  Raised by: Peter Patel-Schneider
>>>             >>             On product: SHACL Spec
>>>             >>
>>>             >>             """4. Declaring the Shapes Graph
>>>             >>
>>>             >>             A data graph MAY link to one or more
>>>             shapes graphs via the
>>>             >>  property sh:shapesGraph. The subject of this
>>>             predicate must be
>>>             >>             the graph resource, i.e. the name of the
>>>             data graph in the
>>>             >>  dataset. The objects of this predicate must be IRI
>>>             nodes,
>>>             >>  pointing at a named graph in the dataset. Tools may
>>>             use this
>>>             >>  information to determine which shapes graph to use for
>>>             >>  validation. If present, tools SHOULD transitively
>>>             follow any
>>>             >>             links from the shapes graph via the
>>>             predicate owl:imports to
>>>             >>             other graphs and use the resulting union
>>>             graph as parameter to
>>>             >>             the validation process."""
>>>             >>
>>>             >>             This assumes that the data graph is in an
>>>             RDF dataset.  SHACL
>>>             >>  validation should work on data graphs that are not
>>>             in an RDF
>>>             >>  dataset.
>>>             >>
>>>             >>
>>>             >>
>>>             >>
>>>             >>
>>>             >>
>>>             >>
>>>             >>
>>>             >>
>>>             >>
>>>             >>
>>>             >>     --
>>>             >>     -Tom Johnson
>>>             >
>>>             >
>>>             >
>>>             >
>>>             > --
>>>             > Dimitris Kontokostas
>>>             > Department of Computer Science, University of Leipzig
>>>             & DBpedia Association
>>>             > Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>>>             > http://http://aligned-project.eu
>>>             <http://aligned-project.eu/>
>>>             > Homepage:http://aksw.org/DimitrisKontokostas
>>>             > Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>>             >
>>>
>>>
>>>
>>>
>>>         -- 
>>>         Dimitris Kontokostas
>>>         Department of Computer Science, University of Leipzig &
>>>         DBpedia Association
>>>         Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>>>         http://http://aligned-project.eu
>>>         Homepage:http://aksw.org/DimitrisKontokostas
>>>         Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>>
>>
>>
>>
>>
>>     -- 
>>     -Tom Johnson
>
>
>
>
> -- 
> -Tom Johnson
Received on Thursday, 17 March 2016 23:42:29 UTC
