Re: shapes-ISSUE-181: SHACL conformance for partial validation reports [SHACL Spec] from Holger Knublauch on 2016-09-30 (public-data-shapes-wg@w3.org from September 2016)

From: Holger Knublauch <holger@topquadrant.com>
Date: Fri, 30 Sep 2016 14:02:04 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <9f17be13-28c5-6820-3287-b52a5d7c11b1@topquadrant.com>

On 30/09/2016 13:56, Karen Coyle wrote:
>
>
> On 9/29/16 8:41 PM, Holger Knublauch wrote:
>> I did further edits to this section in response to ISSUE-181 and
>> ISSUE-182, see
>>
>> https://github.com/w3c/data-shapes/commit/199c39ad59ccc3faf92746102a035cff91ab8305 
>>
>
> Thanks for clarifying the "focus nodes".
>
> I believe that the way it now reads, even if there is an sh:message in 
> the shapes graph, there is no requirement to include it in the 
> validation report?

See my previous email, sh:message is currently not used in the shapes 
graph, except in SPARQL-based constraints and constraint components. See

https://www.w3.org/2014/data-shapes/track/issues/178

>
> Also, in the first sentence:
> "The definitions for validating a data graph against a shapes graph as 
> well as a focus node from the data graph against a shape from the 
> shapes graph are provided below:"
>
> ... is the focus node validated against a shape, or against a 
> constraint? I think we've said that the shape includes targets and 
> constraints, and that the validation of the focus node is against the 
> constraints.

It's validated against a shape because the shape may also define filter 
shapes.
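
To illustrate why the unit of validation is the shape rather than its constraints alone, here is a minimal Python sketch. It is a toy model and not the SHACL vocabulary or any engine's API: the `Shape` class, the `filters` field, and `validate_focus_node` are hypothetical names, nodes are plain dictionaries, and nested filters on a filter shape are ignored for brevity. A focus node that fails a filter shape produces no results at all, which a constraint-only view could not express.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Node = Dict[str, object]  # toy stand-in for a node and its property values


@dataclass
class Shape:
    # Hypothetical toy model of a shape: constraints plus optional filter shapes.
    name: str
    constraints: List[Callable[[Node], bool]] = field(default_factory=list)
    filters: List["Shape"] = field(default_factory=list)


def conforms(node: Node, shape: Shape) -> bool:
    """Return True if the node passes every constraint of the shape."""
    return all(check(node) for check in shape.constraints)


def validate_focus_node(node: Node, shape: Shape) -> List[str]:
    """Validate a focus node against a shape, honouring its filter shapes."""
    # A focus node that fails any filter shape is out of scope: no results.
    if not all(conforms(node, f) for f in shape.filters):
        return []
    # Otherwise report one result per violated constraint.
    return [
        f"{shape.name}: constraint {i} violated"
        for i, check in enumerate(shape.constraints)
        if not check(node)
    ]


is_person = Shape("IsPerson", constraints=[lambda n: n.get("type") == "Person"])
has_name = Shape("HasName", constraints=[lambda n: "name" in n], filters=[is_person])

person_results = validate_focus_node({"type": "Person"}, has_name)
robot_results = validate_focus_node({"type": "Robot"}, has_name)
```

In this sketch the unnamed person yields one result, while the robot yields none because the filter excludes it before any constraint is evaluated.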

Holger


>
> kc
>
>>
>> On 30/09/2016 13:08, Karen Coyle wrote:
>>>
>>>
>>> On 9/29/16 5:14 PM, Holger Knublauch wrote:
>>>>
>>>>
>>>> On 30/09/2016 10:06, Karen Coyle wrote:
>>>>>
>>>>>
>>>>> On 9/29/16 3:54 PM, Holger Knublauch wrote:
>>>>>> Hi Jose
>>>>>>
>>>>>> others may correct me, but my understanding is that all conformant
>>>>>> SHACL validation engines need to produce all the "mandatory" fields
>>>>>> of the results format.
>>>>>
>>>>> which are sh:focusNode and sh:severity - which is a bit awkward since
>>>>> the focus node (isn't that "target node" now?) doesn't tell you what
>>>>> constraints were evaluated.
>>>>
>>>> Yes, we need to clarify the mandatory fields (see your recent ticket).
>>>
>>> I would put them first in the section, followed by the "MAY"
>>> properties, rather than mixing them. Just a bit of readability assist.
>>
>> I thought about this but have not rearranged these sections. The issue
>> is that there is a dependency between the two sections about
>> sh:severity, and I would like to keep the one about constraints at the
>> end, because it's really about the shapes graph. Maybe we should move
>> 3.4.9 somewhere else, e.g. into section 2 but then it would need a
>> forward reference into the list of available severities. I welcome
>> suggestions.
>>
>>>
>>>>
>>>> There is a subtle difference between focus node and target node:
>>>> - the focus node is the currently evaluated node
>>>> - the target node is a node specified as target by a shape
>>>> - target nodes become focus nodes for the duration of the validation
>>>> - but there are other ways for nodes to become focus nodes, e.g. via
>>>> sh:shape
>>>
>>> That makes sense, but it wasn't clear to me which was being referred
>>> to on reading that section. Oddly, the term "focus node" is not
>>> described in the section on validation (3.0-3.3), which however is
>>> where the focus node IS what is being validated. I suspect that at
>>> least some of the references to "node" there should instead be "focus
>>> node". E.g. in the first sentence:
>>>
>>> "The definitions for validating a data graph against a shapes graph as
>>> well as a *node* from the data graph against a shape from the shapes
>>> graph are provided below"
>>>
>>> Is that *node* a focus node? If so, it should say focus node there and
>>> in the remainder of that section. Then, 3.4.1 Focus node will make
>>> more sense.
>>
>> Ok, done.
>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>>> They may decide to return less, but that should only be
>>>>>> an option.
>>>>>>
>>>>>> Our test cases should also include the full info, because engines
>>>>>> that only produce true or false can still use these test cases,
>>>>>> while the inverse is not the case.
>>>>>
>>>>> Since severity is mandatory, how will T/F work?
>>>>
>>>> Assuming that true means "no violations were found", then a test case
>>>> would pass if no results are produced, or at least no results with
>>>> severity violation.
>>>
>>> 3.4 says "The validation report is the result of the validation
>>> process and includes a set of zero or more validation results." Can
>>> you give an example of a validation report without validation results?
>>> If it is the absence of a validation result, I have trouble with it
>>> being called a "set", which in my mind has an identity, even when
>>> empty.
>>
>> If the results graph contains no instances of sh:ValidationResult then
>> the set of results is empty. The term "set" is also used in the RDF 1.1
>> spec with the same meaning - a graph is a set of triples, and there may
>> not be any triples. In other words, if the results graph is empty, then
>> the validation has succeeded. Does this cover your T/F question?
>>
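
The true/false reading discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: validation results are modelled as plain dictionaries keyed by the property names used in this thread (`sh:focusNode`, `sh:severity`), and `is_valid` with its `strict` flag is a hypothetical helper, not part of any SHACL engine. With `strict=True` only an empty result set counts as success; otherwise results below severity `sh:Violation` are tolerated.

```python
from typing import Dict, List

VIOLATION = "sh:Violation"


def is_valid(results: List[Dict[str, str]], strict: bool = False) -> bool:
    """Collapse a list of validation results into a single true/false answer."""
    if strict:
        # Strict reading: any result at all means the data graph is invalid.
        return len(results) == 0
    # Lenient reading: only results with severity sh:Violation count.
    return all(r["sh:severity"] != VIOLATION for r in results)


empty_report: List[Dict[str, str]] = []
warning_only = [{"sh:focusNode": "ex:Alice", "sh:severity": "sh:Warning"}]
with_violation = warning_only + [
    {"sh:focusNode": "ex:Bob", "sh:severity": "sh:Violation"}
]
```

An empty list plays the role of a results graph with no instances of sh:ValidationResult, so it is valid under both readings.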
>>
>> @Jose, I have also added a sentence to clarify
>>
>> Only SHACL implementations that can return all of the mandatory
>> properties of the <a>Validation Results Vocabulary</a> are
>> standards-compliant.
>>
>> which may address ISSUE-181. I noticed that it was unclear whether
>> sh:sourceConstraintComponent was required or not. I have clarified that
>> it is required, assuming that this is the intention of the WG. This
>> property is basically the key to interoperability as well as making sure
>> that the correct violations have been produced by an engine.
>>
>> @Karen, I will respond on the original ISSUE-182 in a separate email.
>>
>> Thanks,
>> Holger
>>
>>
>>>
>>> Thanks,
>>> kc
>>>
>>>>
>>>> Holger
>>>>
>>>>
>>>>>
>>>>> kc
>>>>>
>>>>>>
>>>>>> Holger
>>>>>>
>>>>>>
>>>>>> On 29/09/2016 19:59, RDF Data Shapes Working Group Issue Tracker
>>>>>> wrote:
>>>>>>> shapes-ISSUE-181: SHACL conformance for partial validation reports
>>>>>>> [SHACL Spec]
>>>>>>>
>>>>>>> http://www.w3.org/2014/data-shapes/track/issues/181
>>>>>>>
>>>>>>> Raised by: Jose Emilio Labra Gayo
>>>>>>> On product: SHACL Spec
>>>>>>>
>>>>>>> When preparing the test-suite, it is not clear to me if we have to
>>>>>>> declare/check all the validation reports that must be returned by a
>>>>>>> SHACL processor or just a true/false.
>>>>>>>
>>>>>>> The spec contains the following phrase:
>>>>>>>
>>>>>>> "The validation process returns a validation report containing all
>>>>>>> validation results. For simpler validation scenarios, SHACL
>>>>>>> processors SHOULD provide an additional validation interface that
>>>>>>> returns only true for valid or false for invalid."
>>>>>>>
>>>>>>> A SHACL processor that wants to handle use case 3.31
>>>>>>> (https://www.w3.org/TR/shacl-ucr/#uc34-large-scale-dataset-validation)
>>>>>>> about validating very large datasets may decide to return just the
>>>>>>> first violation it finds, instead of continuing to process and
>>>>>>> generate all the possible violations.
>>>>>>>
>>>>>>> Is that SHACL processor conformant with the spec? In that case, when
>>>>>>> defining the test-suite, is it enough if we just declare true/false
>>>>>>> as the possible result of SHACL validation? Or if a SHACL processor
>>>>>>> returns just the first violation report that it finds?
>>>>>>>
>>>>>>> In any case, I think the spec should be clearer about whether a SHACL
>>>>>>> processor is conformant if it doesn't return all the violation
>>>>>>> reports and just returns the first one or signals that there was an
>>>>>>> error.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>
Received on Friday, 30 September 2016 04:02:39 UTC