Re: Anyone in support of CONSTRUCT constraints? from Dimitris Kontokostas on 2015-03-27 (public-data-shapes-wg@w3.org from March 2015)

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Fri, 27 Mar 2015 17:16:03 +0200
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Holger Knublauch <holger@topquadrant.com>, "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
Message-ID: <CA+u4+a0QO+v8wwt4LVjfdMY8F19VrpV0fhm+nRyCQVWLnPLoOg@mail.gmail.com>
On Fri, Mar 27, 2015 at 12:57 PM, Richard Cyganiak <richard@cyganiak.de>
wrote:

> To be honest, I’d prefer a simpler design where a violation is represented
> as a set of key-value pairs, with some keys such as “root” and “path”
> having special meaning.
>
> The current design uses a Constraint Violation Vocabulary to represent
> them as RDF graphs. That seems unnecessarily complicated to me.
>
> In many use cases, a processor would immediately turn the violation RDF
> graph back into key-value pairs via a SPARQL SELECT query against the
> violation RDF graph.
>
> Further advantages of a design where a violation representation consists
> of key-value pairs:
>

I think the results of an RDF validation should have a canonical RDF
representation. With that as the base, we can transform the resulting RDF
graph into any form we want, e.g. HTML for users or CSV as you suggest, but
I'd prefer not to use CSV as the base.
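
As a rough illustration of that direction (RDF first, flat formats derived
from it), here is a minimal Python sketch. It is not taken from any draft:
the violation graph is written as plain (subject, predicate, object) tuples
so no RDF library is needed, the ex: resources and messages are made up, and
it assumes at most one value per key for each violation.

```python
import csv
import io
from collections import defaultdict

# Hypothetical violation graph as (subject, predicate, object) tuples.
# The sh: names follow this thread; the ex: resources are invented.
TRIPLES = [
    ("_:v1", "sh:root", "ex:Alice"),
    ("_:v1", "sh:path", "ex:age"),
    ("_:v1", "sh:message", "Age must be an integer"),
    ("_:v2", "sh:root", "ex:Bob"),
    ("_:v2", "sh:message", "Missing name"),
]


def to_rows(triples):
    """Group triples by violation node into one key-value dict per violation.

    Assumption: a later value for the same key overwrites an earlier one.
    """
    rows = defaultdict(dict)
    for subject, predicate, obj in triples:
        rows[subject][predicate] = obj
    return [rows[subject] for subject in sorted(rows)]


def to_csv(rows, columns):
    """Render the rows as CSV; keys missing from a row become empty cells."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=columns,
        restval="",
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


ROWS = to_rows(TRIPLES)
CSV_TEXT = to_csv(ROWS, ["sh:root", "sh:path", "sh:message"])
```

The flat form is derived from the graph, so the graph stays the reference
representation and the CSV is just one view of it.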


> - For constraints expressed as SPARQL, each result in the SELECT
> evaluation is already a set of key-value pairs, so the SELECT query would
> directly (or with minimal post-processing along the lines of the rules in
> 13.1.2) return violations in the expected representation.

> - A mechanism like “Injecting Annotation Properties into Constraint
> Violations” would be unnecessary, as the author of the constraint could
> simply return some extra variables in the SELECT query.
>

In this case, what would be the interpretation of these additional
variables if not bound to a property?


> - Test cases might become simpler, as comparing SELECT results is easier
> than comparing RDF graphs.
>

I agree but should this be a reason for limiting the result expressiveness?

I also suggested allowing different types of results, each one for a
different audience:
https://www.w3.org/2014/data-shapes/wiki/Requirements#Constraint_Violations_Reporting_Details

The one we currently define is the richest in terms of metadata, but this
is not always required.
Maybe we can define an additional one where we report only the resource
(sh:root) and a message (sh:message), without the path or any other
annotation metadata.
Would this address your concerns?

One approach would be something like the following:
- sh:ViolationResults, a superclass
- sh:ViolationResources, subClassOf sh:ViolationResults, allowed to
  contain only sh:root and sh:message
- sh:ViolationResourcesEnriched, subClassOf sh:ViolationResources, allowed
  to contain sh:path and everything else

When the user runs the evaluation, he can request either of the two formats
as his result.
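
To make the two levels concrete, a minimal Python sketch, assuming a
violation is held as a dict keyed by property name. The project() helper,
the enriched flag, and the ex:fixSuggestion annotation are hypothetical and
only illustrate the idea; nothing here is defined in the draft.

```python
# Keys allowed in the minimal format (the proposed sh:ViolationResources).
MINIMAL_KEYS = ("sh:root", "sh:message")


def project(violation, enriched=False):
    """Return a copy of the violation in the requested format.

    enriched=True keeps everything (sh:path and any annotations);
    enriched=False keeps only sh:root and sh:message.
    """
    if enriched:
        return dict(violation)
    return {key: violation[key] for key in MINIMAL_KEYS if key in violation}


# Hypothetical violation with one extra annotation property.
violation = {
    "sh:root": "ex:Alice",
    "sh:path": "ex:age",
    "sh:message": "Age must be an integer",
    "ex:fixSuggestion": "Cast the value to xsd:integer",
}

minimal = project(violation)
full = project(violation, enriched=True)
```

Since the minimal format is a projection of the enriched one, a processor
only has to produce the rich result and can cut it down on request.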

Best,
Dimitris


>
> Richard
>
>
>
> > On 27 Mar 2015, at 01:08, Holger Knublauch <holger@topquadrant.com>
> wrote:
> >
> > On 3/26/2015 18:43, Dimitris Kontokostas wrote:
> >> Hi Holger,
> >>
> >> I would like to add an additional way to enrich the results of a SPARQL
> query.
> >> Examples are in [1,2] where below a SPARQL query we can request /
> inject additional data in the results.
> >> In RDFUnit I allow only the variable ?resource in the SELECT query (not
> exactly but sort of) so most cases are already handled by the additional
> variables you introduced in SHACL.
> >> However this approach can compensate for some of the expressiveness we
> lose from CONSTRUCT and can add additional metadata in the results e.g.
> what is missing in [3] or anything the user wants.
> >
> > Thanks, this makes sense to me. It provides flexibility for example to
> have constraint violations that hint at a fix suggestion and provides a
> natural extension point for features that are not yet standardized - who
> can anticipate what people will use SHACL for!
> >
> > I have included a draft for this feature:
> >
> > http://w3c.github.io/data-shapes/shacl/#sparql-constraints-annotations
> >
> > Please let me know what you think. We can obviously marry this with a
> string templating mechanism later, once we have reached that part :)
> >
> >>
> >> Regarding one of the problems in SELECT queries you mentioned
> >> `-multiple result values in the same sh:Error (e.g. multiple sh:value)`
> >> I think SELECT gives us more freedom to do what we want. In RDFUnit I
> group multiple values in the same violation  (along with all other
> requested metadata) by post-processing the results.
> >> I am not sure if we all agree if multiple sh:value should be in the
> same error or not (I think they should be grouped) but we can specify the
> behavior we want later on.
> >
> > My current draft assumes that there is at most one sh:path and at most
> one sh:value per violation. This assumes simplicity, which may not be
> sufficient in the long term, yet simplifies adoption by tools that display
> those violations. For example if someone double-clicks on a constraint
> violation, the system would focus exactly one input field. As you say we
> can change this later, once we are more confident about that trade-off.
> >
> > Thanks,
> > Holger
> >
> > Diff:
> https://github.com/w3c/data-shapes/commit/15623ec9a160e9c7e190ac9416a26fa0a3eb6229
>
>
>
>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage: http://aksw.org/DimitrisKontokostas
Received on Friday, 27 March 2015 15:16:58 UTC