Re: Proposal to close ISSUE-51 as specified in shacl-ref from Holger Knublauch on 2015-08-03 (public-data-shapes-wg@w3.org from August 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Mon, 3 Aug 2015 11:05:13 +1000
To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <55BEBE49.8000808@topquadrant.com>

Hi Dimitris,

On 7/31/2015 23:38, Dimitris Kontokostas wrote:
> Coming back to this after yesterday's call.
>
> The most common use case that the current approach cannot easily answer is
> "get me all violations results"
> in order to do this we would have to enumerate explicitly all severity 
> levels in the SHACL hierarchy (and possibly the user's extensions), 
> use rdfs reasoning or merge the results in the same dataset with the 
> shacl + user ontology in order to use property paths in the sparql query.

The way that this was designed to work in my draft is that the engine is 
invoked with a minimum severity (see ?minSeverity in 
http://w3c.github.io/data-shapes/shacl/#operation-validateNode). The 
engine always has full access to the shapes graph, and the shapes 
graph may include user-defined extensions, e.g. subclasses of sh:Info. 
RDFS reasoning is not needed here - similar to our handling of 
subclasses elsewhere it is sufficient to just walk the rdfs:subClassOf 
triples. So I cannot follow your line of reasoning above, as the 
question can be answered already. The SPARQL queries do not need access 
to the result classes - they just return SELECT variable bindings and 
the engine turns them into actual results.
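
As an illustration of walking rdfs:subClassOf triples without an RDFS 
reasoner, here is a minimal Python sketch. It is not taken from the draft: 
the tuple representation of the shapes graph, the helper name 
subclasses_of, and the ex: classes are assumptions made for the example.

```python
def subclasses_of(triples, root):
    """Return root plus every class that reaches it via rdfs:subClassOf."""
    found = {root}
    frontier = [root]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            is_subclass_link = predicate == "rdfs:subClassOf" and obj == current
            if is_subclass_link and subject not in found:
                found.add(subject)          # the 'found' set also guards against cycles
                frontier.append(subject)
    return found


# Hypothetical shapes graph with user-defined extensions of sh:Info.
shapes_graph = {
    ("ex:Hint", "rdfs:subClassOf", "sh:Info"),
    ("ex:StyleHint", "rdfs:subClassOf", "ex:Hint"),
    ("ex:Unrelated", "rdfs:subClassOf", "ex:Other"),
}

info_classes = subclasses_of(shapes_graph, "sh:Info")
```

An engine could use such a set to decide whether a result class counts as 
sh:Info or one of its extensions, using only the triples already present 
in the shapes graph.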

Having said this, my draft is too vague right now and implicitly assumes 
that there is a natural ordering of result classes. In response to this, 
I have added a numeric severity index to each severity level, which can 
be used to determine the ordering. Once this number exists, the 
subClassOf hierarchy becomes less relevant for ordering, and is rather a 
mechanism to communicate the shape of supported properties.

I have created a branch where we can hopefully finalize the revised 
design before making it a submission:

https://github.com/w3c/data-shapes/tree/ISSUE-51

(Only the turtle file is updated - I didn't want to go too far ahead on 
speculative grounds)

See the rest of this email; I believe I have captured the spirit of your 
proposal, albeit with some minor syntactic differences (see 
below). Please correct me if I have missed the point.

>
> On the other hand, Holger's use case about attaching different 
> properties based on the severity level could be possibly handled with 
> shapes, using sh:ConstraintViolation as scope and filtering based on 
> the severity level with sh:hasValue

While I would not object to using shapes here, I think classes already 
provide a very natural way of modeling this. However, having now read 
your email on ISSUE-75 and the list of SQL errors enumerated by Ted in 
the previous call, I can see better why having the severity level as a 
separate entity also has its advantages - especially to better 
distinguish error handling from proper results. Furthermore, although 
the subclasses may in theory add new properties, there is no such 
example in the spec, so the issue is probably not as important as I thought.

>
> I re-propose my suggestion from 
> https://lists.w3.org/Archives/Public/public-data-shapes-wg/2015May/0145.html 
> with renaming based on the current draft
>
> ========================
> #remove sh:ResultClass

sh:ResultClass is now replaced by sh:Severity in my branch.

>
> sh:Result a rdfs:Class # the super class of all results (abstract)
>
> sh:severity a owl:ObjectProperty ; rdfs:domain sh:Result;
> rdfs:range sh:SeverityLevel .
>
> #sh:Result also contains sh:source (maybe sh:detail too)
>
> #Severity definitions
> sh:SeverityLevel a rdfs:Class
> sh:Error a sh:SeverityLevel, owl:NamedIndividual .
> sh:Warn a sh:SeverityLevel, owl:NamedIndividual .

I am very much opposed to adding an (unnecessary) dependency on the OWL 
namespace here. If someone wants to treat the error objects as 
owl:NamedIndividual, then they can add this triple themselves in their 
local copies. But nothing in SHACL requires them to be named OWL 
individuals.

Likewise I don't see why we should use rdfs:domain and rdfs:range here. 
Either we believe SHACL is suitable to communicate a data structure or 
we don't. rdfs:domain and rdfs:range open up implications about inferencing 
that are unnecessary red herrings here. All we want to communicate is 
"sh:Results can have a sh:severity". SHACL is perfectly capable of doing 
this via sh:property. Again, if anyone wants to use this with RDFS 
tools, they can add these domain triples themselves. Or maybe the WG 
wants to produce an alternative version of the SHACL namespace 
especially for backward compatibility with pure RDFS/OWL tools - this 
would be quite easy to produce based on the shape definitions.
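
To illustrate why an RDFS-flavoured version could be generated from the 
shape definitions, here is a rough Python sketch. It assumes that property 
constraints are declared as sh:property nodes carrying sh:predicate, and 
that the class itself carries the sh:property declarations; the tuple 
representation, the blank node labels and the function name are 
hypothetical.

```python
def rdfs_domain_triples(triples):
    """Emit (property, rdfs:domain, class) for every sh:property declaration."""
    # Map each constraint node to the predicate it constrains.
    predicate_of = {s: o for s, p, o in triples if p == "sh:predicate"}
    return {
        (predicate_of[constraint], "rdfs:domain", cls)
        for cls, p, constraint in triples
        if p == "sh:property" and constraint in predicate_of
    }


# Hypothetical excerpt of the SHACL vocabulary expressed as shape definitions.
shacl_vocabulary = [
    ("sh:Result", "sh:property", "_:b1"),
    ("_:b1", "sh:predicate", "sh:severity"),
    ("sh:Result", "sh:property", "_:b2"),
    ("_:b2", "sh:predicate", "sh:source"),
]

domain_triples = rdfs_domain_triples(shacl_vocabulary)
```

The generated triples could be published separately for RDFS/OWL tools, 
leaving the SHACL namespace itself free of rdfs:domain statements.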

>
> #We could attach an integer/float property in sh:SeverityLevel, e.g. 
> sh:severityFactor that could be used for ordering severity levels

The term "severity factor" is used for different purposes elsewhere 
(https://en.wikipedia.org/wiki/Severity_factor) and I guess "factor" 
implies some kind of multiplication. In my draft I am using 
sh:severityIndex for now:

sh:Info     0.0
sh:Warning  1.0
sh:Error    2.0

(all xsd:decimal so that people can place new values in between)
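
A small Python sketch of how such an index could drive ordering and the 
?minSeverity filter. Only the three index values above come from the 
branch; the ex:Deprecation level, the dictionary-based results and the 
helper at_least are assumptions made for illustration.

```python
from decimal import Decimal

# Index values as listed above, held as decimals to mirror xsd:decimal.
SEVERITY_INDEX = {
    "sh:Info": Decimal("0.0"),
    "sh:Warning": Decimal("1.0"),
    "sh:Error": Decimal("2.0"),
}

# Hypothetical user-defined level placed between sh:Warning and sh:Error.
SEVERITY_INDEX["ex:Deprecation"] = Decimal("1.5")


def at_least(results, min_severity, index=SEVERITY_INDEX):
    """Keep the results whose severity index is >= that of min_severity."""
    threshold = index[min_severity]
    return [r for r in results if index[r["severity"]] >= threshold]


results = [
    {"message": "label missing", "severity": "sh:Info"},
    {"message": "old property used", "severity": "ex:Deprecation"},
    {"message": "wrong datatype", "severity": "sh:Error"},
]

ordered_levels = sorted(SEVERITY_INDEX, key=SEVERITY_INDEX.get)
selected = at_least(results, "sh:Warning")
```

With a numeric index the comparison needs no knowledge of the class 
hierarchy, and a new level only has to pick a value between two existing 
ones.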

>
> sh:ConstraintViolation a rdfs:Class; # the existing class in the spec
>    rdfs:subClassOf sh:Result .
> # sh:ConstraintViolation contains sh:root, sh:subject, sh:object, ...
> =============================
>
> Notes:
>
> I would also propose a minor renaming that would result in more 
> accurate meaning
> sh:Result -> sh:AbstractResult

Yes, renaming to sh:AbstractResult makes sense here.

> sh:ConstraintViolation -> sh:ViolationInstance

For now I picked sh:ValidationResult rdfs:subClassOf sh:AbstractResult. 
I believe the term violation no longer fits the bill because results may 
also include INFO or DEBUG messages. And everything is an "instance"...

Then I added a class sh:Failure to enumerate the various reasons for 
"unexpected" situations such as timeouts:
- sh:IOFailure
- sh:UnsupportedRecursionFailure
- Are there any other identifiable causes?

They are used via sh:failure by the class sh:FailureResult 
rdfs:subClassOf sh:AbstractResult.

Other people may add other result classes such as your accumulated results.

>
> sh:Result / sh:AbstractResult
> This can be used in case someone wants to provide alternative results 
> for SHACL and means that the minimum information one should have is a 
> severity level and a link to the source (shape/facet/...) this result 
> came from

Another small change I made was to rename sh:source to 
sh:sourceConstraint and to add sh:sourceShape. The reason for this is 
that multiple shapes may share the same sh:Constraint, so we want to 
remember the context (if we can).
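
The following Python sketch illustrates why both links are useful when a 
constraint is shared. The ValidationResult dataclass, its field names and 
all ex: identifiers are hypothetical stand-ins for the RDF properties 
sh:sourceConstraint and sh:sourceShape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationResult:
    focus_node: str
    source_constraint: str   # stands in for sh:sourceConstraint
    source_shape: str        # stands in for sh:sourceShape


# One constraint node reused by two different shapes.
shared_constraint = "ex:NameConstraint"

results = [
    ValidationResult("ex:alice", shared_constraint, "ex:PersonShape"),
    ValidationResult("ex:acme", shared_constraint, "ex:CompanyShape"),
]

# Group the shapes that reported a result through each constraint.
shapes_by_constraint = {}
for result in results:
    shapes_by_constraint.setdefault(result.source_constraint, set()).add(
        result.source_shape
    )
```

With only the constraint recorded, the two results above could not be 
traced back to the shape that triggered them.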

Cheers,
Holger


>
> Best,
> Dimitris
>
> On Thu, Jul 30, 2015 at 4:54 PM, Dimitris Kontokostas 
> <kontokostas@informatik.uni-leipzig.de 
> <mailto:kontokostas@informatik.uni-leipzig.de>> wrote:
>
>     I have the following comments on the result vocabulary
>     - I would still argue that sh:Error, sh:Warn, etc should be
>     separate property (i.e. sh:level) and not mixed with the result type.
>     - I would also like to have a result superclass so that people can
>     extend the results to other formats.
>
>     Besides these comments, I am fine with the current design.
>
>     Best,
>     Dimitris
>
>     On Tue, Jul 28, 2015 at 2:57 AM, Holger Knublauch
>     <holger@topquadrant.com <mailto:holger@topquadrant.com>> wrote:
>
>         The current SHACL Reference draft defines a results vocabulary:
>
>         http://w3c.github.io/data-shapes/shacl-ref/#results-vocabulary
>
>         My proposal is to close ISSUE-51 [1] adopting this design (for
>         now). It's easy to modify at a later stage based on experience.
>
>         Holger
>
>         [1] http://www.w3.org/2014/data-shapes/track/issues/51
>
>
>
>
>     -- 
>     Dimitris Kontokostas
>     Department of Computer Science, University of Leipzig & DBpedia
>     Association
>     Projects: http://dbpedia.org, http://aligned-project.eu,
>     http://rdfunit.aksw.org
>     Homepage:http://aksw.org/DimitrisKontokostas
>     Research Group: http://aksw.org
>
>
>
>
> -- 
> Dimitris Kontokostas
> Department of Computer Science, University of Leipzig & DBpedia 
> Association
> Projects: http://dbpedia.org, http://aligned-project.eu, 
> http://rdfunit.aksw.org
> Homepage:http://aksw.org/DimitrisKontokostas
> Research Group: http://aksw.org
>
Received on Monday, 3 August 2015 01:05:50 UTC