Re: Proposed requirement: 2.10.4 Constraint Violations Reporting Details

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Fri, 27 Feb 2015 10:24:29 +0200
Message-ID: <CA+u4+a0KFA+U0itsp20YkWz=k2=kOu165DkT6c9p=f9iMJQC4g@mail.gmail.com>
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
@Richard, user story S34 motivates this requirement. If you think this
requirement is not covered, I could write an additional one or extend the
existing one.

@Holger, see inline.

On Fri, Feb 27, 2015 at 12:57 AM, Holger Knublauch <holger@topquadrant.com> wrote:

> On 2/27/2015 5:28, Dimitris Kontokostas wrote:
>> Hi Richard,
>> There are some cases with the current draft spec where counting is not
>> easy unless we require some conventions.
>> E.g., for sections 15.1.1 CONSTRUCT-based Constraints [1] and 15.1.2
>> ASK-based Constraints [2], how would you count the number of violations?
>> Or, when a resource has two violation values (sh:value), should the
>> values create separate violations or be grouped under the same resource?
>> I already briefly pointed this out to Holger.
> Yes thanks, and we should continue that discussion here on the public
> list. We need to look at specific example constraint definitions. I believe
> it should already be possible to rewrite any of the current SPARQL queries
> to repurpose their WHERE clause (e.g. you can use SELECT (?x AS ?value) to
> learn that ?x is the variable mapped to sh:value). Another idea is to let
> it run normally and then instantly post-process the newly created
> constraint violations, e.g. by grouping similar violations into one.
> Engines can do all this by themselves, but maybe we need some flag
> properties to make sure that certain aggregation of results is done on the
> fly. (Of course nothing prevents users from already creating aggregations
> in their own WHERE clauses, so maybe what you need is already expressible
> and it becomes a matter of syntactic sugar.)

I think the hint is required. Otherwise, in my case, I would have to
manually retrieve 2M records (with pagination), count them, and then
provide the number in the results. With the hint I could just execute
"select count(distinct ?hint_var)". The difference in execution time can be
huge at this scale.
I sent a separate feedback email that relates to this.
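
To make the contrast concrete, here is a minimal Python sketch of the two
counting strategies. Everything in it is an assumption for illustration:
the helper names build_count_query and count_by_pagination, the fetch_page
callback, and the example WHERE clause are hypothetical and not part of
the draft spec or of any engine.

```python
from typing import Callable, List


def build_count_query(where_clause: str, hint_var: str) -> str:
    """Wrap a constraint's WHERE clause in a COUNT(DISTINCT ...) query.

    hint_var is the (hypothetical) variable the constraint author marks
    as identifying one violation; the engine then asks the store for a
    single number instead of every row.
    """
    var = hint_var if hint_var.startswith("?") else "?" + hint_var
    body = where_clause.strip()
    return f"SELECT (COUNT(DISTINCT {var}) AS ?count) WHERE {{ {body} }}"


def count_by_pagination(
    fetch_page: Callable[[int, int], List[str]], page_size: int
) -> int:
    """Count distinct values without a hint by walking every page.

    fetch_page(offset, limit) stands in for a paginated query against
    the store; every row is transferred and deduplicated client-side.
    """
    seen = set()
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        seen.update(page)
        if len(page) < page_size:  # a short (or empty) page is the last one
            break
        offset += page_size
    return len(seen)
```

With the hint, one aggregate query goes to the store; without it, the
number of round trips grows with the result size (about 2M rows in the
case above), which is where the difference in execution time comes from.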

> Anyway, in my role as an editor I very much welcome such suggestions and
> will put TODO snippets into the spec to make sure that open
> issues/suggestions are communicated properly.
> Holger

Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Received on Friday, 27 February 2015 08:25:23 UTC