Re: Comments on Draft #2 from Dimitris Kontokostas on 2015-03-18 (public-data-shapes-wg@w3.org from March 2015)

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Wed, 18 Mar 2015 10:01:46 +0200
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <CA+u4+a1=SimLS95wtAr6icj6WKX6kGBCXpVkDZ9kcsZXa3NZ0w@mail.gmail.com>
On Wed, Mar 18, 2015 at 2:05 AM, Holger Knublauch <holger@topquadrant.com>
wrote:

> On 3/18/2015 3:03, Karen Coyle wrote:
>
>>
>> *** Document ***
>>
>> 2.1
>> has note:
>> "sh:Info has also been suggested, but this would work best if there was a
>> deterministic mechanism to identify constraints that need to be checked, so
>> that Info constraints can be bypassed. Related to this, Dimitris also
>> suggested we introduce sh:ValidationResult as a superclass of
>> sh:ConstraintViolation."
>>
>> *** kc ***
>>
>> I don't understand about by-passing Info constraints. If a constraint is
>> included in a SHACL document, is it not important enough to be checked,
>> regardless of the level of the error? If a condition is not considered
>> important that constraint should not be included in the particular
>> validation application.
>>
>> Dimitris says: "Users can optionally execute a validation requiring the
>> reporting of a minimum security level (i.e. Error). In that case the
>> execution engine will skip the execution of all shapes or shape properties
>> that have a weaker security level than the one requested at the execution
>> time" [2]
>>
>> While this sounds like a good idea, it does require there to be an agreed
>> ranking of errors that are included in SHACL, or a way to customize that
>> ranking, and a way to inter-rank any local sub-classes of
>> sh:ValidationResult. Because of how differently people might see the
>> various errors, I'm skeptical that this ranking would work.
>>
>
> I believe the spec has this covered: it defines the two top-level
> validation result classes sh:Warning and sh:Error, and anyone who wants to
> customize this ranking must extend those two classes if they want to be in
> the corresponding categories. Anyone could add Info too, but those would be
> neither warnings nor errors. The question is whether we want to include
> Info level by default, and I have no strong opinion either way. Most
> logging systems from programming languages have such a capability, yet I am
> unclear about use cases in the context of SHACL.
>

Hi Karen, Holger,

The rationale for having a superclass is to support different types of
validation results. What we have in SHACL right now is more or less the
ExtendedTestCaseResult I produce in RDFUnit (which I rarely use, btw).
A complete result hierarchy of RDFUnit can be seen in the lower part of the
following diagram: http://rdfunit.aksw.org/ns/rdfunit_ontology_diagram.png
and some examples in http://nl.dbpedia.org/downloads/rdfunit/20141210/

I also proposed a requirement for this:
https://www.w3.org/2014/data-shapes/wiki/Requirements#Constraint_Violations_Reporting_Details

Even if we do not approve the requirement, or we do not provide other types
of violation results in the first SHACL rec, I think it is a good idea to
have a superclass and allow other people to create subclasses of their own.
This costs us nothing and makes SHACL more flexible.

Other than that, I think we should allow more severity levels. In my opinion
we should include the standard programming logging levels with a predefined
priority, e.g. fatal, error, warn, info, debug, trace.
As with logging in programming, we can set the minimum level to report at
runtime when we run the validation, maybe with warn as the default.
The simplest approach would be to validate against all shapes but report
only the results at or above the desired level, but I am sure we can optimize
and skip the shapes we are not interested in for better performance.
@Karen: there is no need to inter-rank anything. The severity level will be
marked directly in the shape, and everyone will know in advance what
severity this shape is meant to report.
This approach gives us the flexibility to have a complete set of shapes for
all types/levels of violation, and depending on the context or who performs
the validation, some might be skipped.
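
To make the filtering idea concrete, here is a minimal sketch in Python. It
is not SHACL syntax or any existing API: the Severity enum, the
shapes_to_run function and the shape names are all hypothetical, and the
ordering simply follows the logging levels listed above, with warn assumed
as the default.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity levels, ordered from least to most severe."""
    TRACE = 0
    DEBUG = 1
    INFO = 2
    WARN = 3
    ERROR = 4
    FATAL = 5


def shapes_to_run(shapes, minimum=Severity.WARN):
    """Return the names of shapes declared at or above the requested level.

    Shapes below the minimum are skipped before execution rather than
    validated and then filtered out of the report.
    """
    return [name for name, severity in shapes if severity >= minimum]


# Each shape declares its own severity, so no inter-ranking is needed.
# The shape names are made up for illustration.
shapes = [
    ("labelLanguageShape", Severity.INFO),
    ("birthDateShape", Severity.WARN),
    ("missingTypeShape", Severity.ERROR),
]

default_run = shapes_to_run(shapes)                          # warn and above
strict_run = shapes_to_run(shapes, minimum=Severity.ERROR)   # error and above
```

With the assumed default, the info-level shape is never executed; a caller
that asks for error and above skips the warn-level shape as well.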

Besides that, I think that hardcoding the severity levels as direct classes
is also not a good approach. This might work fine for this particular
result type but does not fit well if we want to allow other types of
results. I'd suggest we make the severity level a property that is attached
to the superclass.
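
As a rough illustration of severity as a property rather than a class, the
following Python sketch models a result superclass that carries the
severity, with result types as subclasses. The class and field names are
assumptions made for this example only; they are not taken from the SHACL
draft or from RDFUnit.

```python
from dataclasses import dataclass


@dataclass
class ValidationResult:
    """Hypothetical superclass: every result type carries a severity."""
    shape: str
    severity: str  # e.g. "error", "warn", "info"


@dataclass
class ConstraintViolation(ValidationResult):
    """A per-node result, roughly what the draft reports today."""
    focus_node: str
    message: str


@dataclass
class AggregatedResult(ValidationResult):
    """A different result type that only reports a count per shape."""
    violation_count: int


results = [
    ConstraintViolation("ex:PersonShape", "error", "ex:alice", "missing name"),
    AggregatedResult("ex:PersonShape", "warn", 12),
]

# Severity is read the same way regardless of the result type.
errors = [r for r in results if r.severity == "error"]
```

Because the severity lives on the superclass, adding a new result type does
not require a parallel set of error and warning classes.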

Best,
Dimitris



> Thanks,
> Holger
>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage: http://aksw.org/DimitrisKontokostas
Received on Wednesday, 18 March 2015 08:02:46 UTC