Re: Proposed requirement: 2.10.4 Constraint Violations Reporting Details

Thanks Dimitris, that makes sense.

So the proposed requirement could be something like: “The number of violations should be countable when validating an RDF graph”.
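
To make “countable” concrete, here is a rough sketch of the two counting conventions you raise below (one violation per offending value vs. one per resource). It assumes each reported violation carries a shape identifier, a root node and an optional value; the `Violation` record, the function names and the `ex:` identifiers are all made up for illustration and are not taken from the draft spec.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Violation(NamedTuple):
    shape: str            # hypothetical shape / shape facet identifier
    root: str             # node the violation is reported on
    value: Optional[str]  # offending value, if the constraint reports one


def count_per_value(violations):
    """Convention A: every reported row is its own violation."""
    return Counter(v.shape for v in violations)


def count_per_root(violations):
    """Convention B: rows sharing the same (shape, root) are one violation."""
    distinct_pairs = {(v.shape, v.root) for v in violations}
    return Counter(shape for shape, _root in distinct_pairs)


# Illustrative data only: ex:Berlin has two offending values for one shape.
violations = [
    Violation("ex:GeoShape", "ex:Berlin", "91.0"),
    Violation("ex:GeoShape", "ex:Berlin", "-200.0"),
    Violation("ex:GeoShape", "ex:Paris", "95.5"),
    Violation("ex:ImageShape", "ex:Berlin", None),
]

print(count_per_value(violations))
print(count_per_root(violations))
```

The two functions disagree exactly where a resource has several offending values, and neither can be computed at all if a result does not identify a root node, which I take to be the point of requiring one in every query.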

I’m not sure whether you have already contributed a user story to back up this requirement. If not, a brief description of how you validate DBpedia would help.

Best,
Richard


> On 26 Feb 2015, at 19:28, Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de> wrote:
> 
> Hi Richard,
> 
> With the current draft spec there are some cases where counting is not easy unless we require some conventions.
> For example, for sections 15.1.1 CONSTRUCT-based Constraints [1] and 15.1.2 ASK-based Constraints [2], how would you count the number of violations?
> Or, when a resource has two violation values (sh:value), should each value create a separate violation, or should they be grouped under the same resource?
> 
> I already briefly pointed this out to Holger.
> All we need to change is to require the existence of one SPARQL variable (sh:root or sh:this) in every query.
> 
> Best,
> Dimitris
> 
> [1] http://w3c.github.io/data-shapes/data-shapes-core/#sparql-constraints-construct
> [2] http://w3c.github.io/data-shapes/data-shapes-core/#sparql-constraints-ask
>  
> 
> On Thu, Feb 26, 2015 at 8:09 PM, Richard Cyganiak <richard@cyganiak.de> wrote:
> Hi Dimitris,
> 
> I’m not sure I understand what requirement you’re proposing.
> 
> Are you proposing that SHACL should not include detailed violation reporting facilities, because there could be too many reports?
> 
> Counting violations seems like something that implementations can do no matter how SHACL is designed, so it doesn’t appear to give rise to any particular requirement for the language itself.
> 
> Richard
> 
> 
> 
> 
> > On 26 Feb 2015, at 14:30, Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de> wrote:
> >
> > Dear all,
> >
> > I proposed the following requirement, derived from UC34:
> > https://www.w3.org/2014/data-shapes/wiki/Requirements#Constraint_Violations_Reporting_Details
> >
> > In large databases (such as DBpedia) there can be many thousands of violations, and retrieving every node that failed is not practical.
> > In these cases, getting the number of violations per shape / shape facet is more suitable. Most of the time, all the violations of a shape facet can be fixed with a single code/mapping change.
> >
> > In the following example we had ~1M violations related to geo from four constraints and another ~1M violations for images, and both were fixed with a single commit in the code:
> >
> > http://nl.dbpedia.org/downloads/rdfunit/20141210/
> > *.aggregated* groups constraints with error counts & prevalence
> > *.rlog* displays only 10 violation nodes per constraint
> >
> > --
> > Dimitris Kontokostas
> > Department of Computer Science, University of Leipzig
> > Research Group: http://aksw.org
> > Homepage: http://aksw.org/DimitrisKontokostas

Received on Thursday, 26 February 2015 20:07:33 UTC