- From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
- Date: Mon, 3 Aug 2015 14:01:35 +0300
- To: Holger Knublauch <holger@topquadrant.com>
- Cc: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
- Message-ID: <CA+u4+a3m8JjR2v5x1YWQd3_S60UTDKLDRqtF2fUKEV-Xbg0ffQ@mail.gmail.com>
On Mon, Aug 3, 2015 at 4:05 AM, Holger Knublauch <holger@topquadrant.com> wrote:

> Hi Dimitris,
>
> On 7/31/2015 23:38, Dimitris Kontokostas wrote:
>
> Coming back at this after yesterday's call.
>
> The most common use case that the current approach cannot easily answer is
> "get me all violation results".
> In order to do this we would have to enumerate explicitly all severity
> levels in the SHACL hierarchy (and possibly the user's extensions), use
> rdfs reasoning or merge the results in the same dataset with the shacl +
> user ontology in order to use property paths in the sparql query.
>
>
> The way that this was designed to work in my draft is that the engine is
> invoked with a minimum severity (see ?minSeverity in
> http://w3c.github.io/data-shapes/shacl/#operation-validateNode). The
> engine does always have full access to the shapes graph, and the shapes
> graph may include user-defined extensions, e.g. subclasses of sh:Info. RDFS
> reasoning is not needed here - similar to our handling of subclasses
> elsewhere it is sufficient to just walk the rdfs:subClassOf triples. So I
> cannot follow your line of reasoning above, as the question can be answered
> already. The SPARQL queries do not need access to the result classes - they
> just return SELECT variable bindings and the engine turns them into actual
> results.
>
> Having said this, my draft is too vague right now and implicitly assumes
> that there is a natural ordering of result classes. In response to this, I
> have added a numeric severity index to each severity level, which can be
> used to determine the ordering. Once this number exists, the subClassOf
> hierarchy becomes less relevant, and rather a mechanism to communicate the
> shape of supported properties.

Note that, independent of whether we allow access to the shapes graph during validation, I don't think we should also require access to the shapes graph / SHACL ontology when one reads the validation results.
Thus, a very simple query such as "get me all results" gets quite complex, especially when results are stored offline and processed later. Note that I agree that access to the shapes graph / SHACL ontology can be handy for more advanced processing, but we shouldn't have this dependency for simple selections.

> I have created a branch where we can hopefully finalize the revised design
> before making it a submission:
>
> https://github.com/w3c/data-shapes/tree/ISSUE-51
>
> (Only the turtle file is updated - I didn't want to go too far ahead on
> speculative grounds)
>
> See the rest of this email, I believe I have captured the spirit of your
> proposal, albeit with some minor and syntactical differences (see below).
> Please correct me if I have missed the point.

Looks good to me. I suggest we move sh:root, sh:subject, sh:predicate and sh:object to the sh:ValidationResult class; see some minor comments inline.

>
> On the other hand, Holger's use case about attaching different properties
> based on the severity level could be possibly handled with shapes,
> using sh:ConstraintViolation as scope and filtering based on the severity
> level with sh:hasValue
>
>
> While I would not object to using shapes here, I think classes already
> provide a very natural way of modeling this. However, now reading about
> your email on ISSUE-75 and the list of SQL errors enumerated by Ted in the
> previous call, I can see better why having severity level as a separate
> entity can also have its advantages - especially to better distinguish
> error handling from proper results. Furthermore, although the subclasses
> may in theory add new properties, there is no such example in the spec, so
> the issue is probably not as important as I thought.
>
>
> I re-propose my suggestion from
> https://lists.w3.org/Archives/Public/public-data-shapes-wg/2015May/0145.html
> with renaming based on the current draft
>
> ========================
> #remove sh:ResultClass
>
>
> sh:ResultClass is now replaced by sh:Severity in my branch.
>
>
> sh:Result a rdfs:Class . # the super class of all results (abstract)
>
> sh:severity a owl:ObjectProperty ; rdfs:domain sh:Result;
> rdfs:range sh:SeverityLevel .
>
> #sh:Result also contains sh:source (maybe sh:detail too)
>
> #Severity definitions
> sh:SeverityLevel a rdfs:Class .
> sh:Error a sh:SeverityLevel, owl:NamedIndividual .
> sh:Warn a sh:SeverityLevel, owl:NamedIndividual .
>
>
> I am very much opposed to adding an (unnecessary) dependency on the OWL
> namespace here. If someone wants to treat the error objects as
> owl:NamedIndividual, then they can add this triple themselves in their
> local copies. But nothing in SHACL requires them to be named OWL
> individuals.

I used OWL as it is a simpler / shorter way of defining the proposed resolution; I am open to using shapes directly, as you do in your draft.

> Likewise I don't see why we should use rdfs:domain and rdfs:range here.
> Either we believe SHACL is suitable to communicate a data structure or we
> don't. rdfs:domain and ranges open up implications about inferencing that
> are unnecessary red herrings here. All we want to communicate is
> "sh:Results can have a sh:severity". SHACL is perfectly capable of doing
> this via sh:property. Again, if anyone wants to use this with RDFS tools,
> they can add these domain triples themselves. Or maybe the WG wants to
> produce an alternative version of the SHACL namespace especially for
> backward compatibility with pure RDFS/OWL tools - this would be quite easy
> to produce based on the shape definitions.
>
>
> #We could attach an integer/float property in sh:SeverityLevel, e.g.
> sh:severityFactor that could be used for ordering severity levels
>
>
> The term "severity factor" is used for different purposes elsewhere (
> https://en.wikipedia.org/wiki/Severity_factor) and I guess "factor"
> implies some kind of multiplication. In my draft I am using
> sh:severityIndex for now:
>
> sh:Info     0.0
> sh:Warning  1.0
> sh:Error    2.0
>
> (all xsd:decimal so that people can place new values in between)

I think we should start with 10.0 for sh:Info, with a step of 10.0 between values, to give others room to define intermediate levels more easily.

> sh:ConstraintViolation a rdfs:Class; # the existing class in the spec
> rdfs:subClassOf sh:Result .
> # sh:ConstraintViolation contains sh:root, sh:subject, sh:object, ...
> =============================
>
> Notes:
>
> I would also propose a minor renaming that would result in more accurate
> meaning
> sh:Result -> sh:AbstractResult
>
>
> Yes renaming to sh:AbstractResult makes sense here.
>
> sh:ConstraintViolation -> sh:ViolationInstance
>
>
> For now I picked sh:ValidationResult rdfs:subClassOf sh:AbstractResult. I
> believe the term violation no longer fits the bill because results may also
> include INFO or DEBUG messages. And everything is an "instance"...
>
> Then I added a class sh:Failure to enumerate the various reasons for
> "unexpected" situations such as timeouts:
> - sh:IOFailure
> - sh:UnsupportedRecursionFailure
> - Are there any other identifiable causes?

What about syntax errors or timeouts?
Q: For syntax errors (esp. for SPARQL), do we report them during loading or during execution?

Thanks,
Dimitris

> They are used via sh:failure by the class sh:FailureResult rdfs:subClassOf
> sh:AbstractResult.
>
> Other people may add other result classes such as your accumulated results.
>
>
> sh:Result / sh:AbstractResult
> This can be used in case someone wants to provide alternative results for
> SHACL and means that the minimum information one should have is a severity
> level and a link to the source (shape/facet/...) this result came from
>
>
> Another small change I did was to rename sh:source to sh:sourceConstraint
> and to add sh:sourceShape. The reason for this is that multiple shapes may
> share the same sh:Constraint, so we want to remember the context (if we
> can).
>
> Cheers,
> Holger
>

--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://aligned-project.eu, http://rdfunit.aksw.org
Homepage: http://aksw.org/DimitrisKontokostas
Research Group: http://aksw.org
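
To illustrate the spacing of severity index values discussed in this message, here is a minimal Python sketch. It is only an illustration under stated assumptions, not part of any SHACL draft: the dictionary `SEVERITY_INDEX`, the helper `at_least`, the user-defined level `ex:Notice` and the sample result records are all hypothetical. Only the names sh:Info, sh:Warning and sh:Error, the idea of an xsd:decimal index, and the suggested 10.0 step come from the thread. The sketch also assumes that each stored result carries its severity level directly and that the consumer already has the index values at hand.

```python
from decimal import Decimal

# Severity index with a step of 10.0 between levels, as suggested in the thread.
# The mapping itself is illustrative; only the three sh: names come from the draft.
SEVERITY_INDEX = {
    "sh:Info": Decimal("10.0"),
    "sh:Warning": Decimal("20.0"),
    "sh:Error": Decimal("30.0"),
}

# Hypothetical user-defined level placed between sh:Info and sh:Warning.
SEVERITY_INDEX["ex:Notice"] = Decimal("15.0")


def at_least(results, min_severity):
    """Return the results whose severity index is >= that of min_severity."""
    threshold = SEVERITY_INDEX[min_severity]
    return [r for r in results if SEVERITY_INDEX[r["severity"]] >= threshold]


# Hypothetical stored results: each one carries its severity level directly.
results = [
    {"id": "r1", "severity": "sh:Info"},
    {"id": "r2", "severity": "ex:Notice"},
    {"id": "r3", "severity": "sh:Warning"},
    {"id": "r4", "severity": "sh:Error"},
]

# Order the levels by index, then select everything at sh:Warning or above.
ordered_levels = sorted(SEVERITY_INDEX, key=SEVERITY_INDEX.get)
warnings_and_up = [r["id"] for r in at_least(results, "sh:Warning")]
```

With a gap between the predefined values, a new level can be slotted in without renumbering the existing ones, and a minimum-severity selection reduces to a numeric comparison rather than a walk over a class hierarchy.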
Received on Monday, 3 August 2015 11:02:33 UTC