Re: shapes-ISSUE-182 (Validation report): [Editorial] Clarifications need to section 3.0 from Holger Knublauch on 2016-10-06 (public-data-shapes-wg@w3.org from October 2016)

From: Holger Knublauch <holger@topquadrant.com>
Date: Thu, 6 Oct 2016 13:58:03 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <850b0655-d2b5-f759-2b64-8f2744e92b15@topquadrant.com>
On 1/10/2016 11:23, Karen Coyle wrote:
>
>
> On 9/29/16 8:57 PM, Holger Knublauch wrote:
>>
>>
>> On 30/09/2016 1:40, RDF Data Shapes Working Group Issue Tracker wrote:
>>> shapes-ISSUE-182 (Validation report): [Editorial] Clarifications need
>>> to section 3.0
>>>
>>> http://www.w3.org/2014/data-shapes/track/issues/182
>>>
>>> Raised by: Karen Coyle
>>> On product:
>>>
>>> Section 3.0 on validation talks about the validation results, but
>>> doesn't explain clearly which properties are required and which are
>>> optional. It also should refer to the shapes graph as the source of
>>> the properties, not just to their appearance in the report. Some
>>> examples:
>>>
>>> "3.4.1.3 Value (sh:value)
>>>
>>> Validation results may have a value for the property sh:value pointing
>>> at a specific node that has caused the result."
>>>
>>> - it isn't clear if sh:value MUST be returned if sh:value is coded in
>>> the constraint, or if echoing back sh:value when it exists is itself
>>> optional.
>>
>> I have added some prose into 3.4.3 to clarify how this property is
>> populated. I hope this clarifies that sh:value is not coded in the
>> constraint but is dynamically populated from the data graph.
>
> Thanks, Holger. However, I'd change the wording from:
>
> "Validation results may have a value for the property sh:value 
> pointing at a specific node that has caused the result. "
>
> to:
>
> "Validation results MAY include a property sh:value. The property 
> takes as its object the specific node in the data graph that caused 
> the result. This object can be any RDF term (IRI, literal, or blank 
> node)."
>
> Then I'd leave off the "for example" part, but it doesn't hurt anything.

A problem here is that the term "property" is overloaded.
1) "property" referring to the rdf:Property itself (which is my default 
understanding)
2) "property" referring to a specific object.
But for the latter we already have the term "property value" 
(abbreviated as "value"). In any case, a subject cannot include a property.

Furthermore, enumerating the three node kinds is redundant, because this 
is implied by the term "node".

Also, in the past when I had used MAY in all-caps, I received backlash 
because this is apparently not according to the predefined meaning of 
MAY in W3C specs. I have to confess I never understood when it's valid 
and when not.


>
> "pointing to" has the same problem as the "linking to" comment that 
> Peter first brought up. For this reason it may be best to use the 
> terminology of triples when speaking of relationships between the 
> components of triples, which are only defined positionally, not as 
> links or pointers.

Ok, using the triple-centric notation, I have tried to reformulate this to:

    Validation results may have zero or one values for the property
    <code>sh:value</code>.
    The <a>object</a> of a <a>triple</a> that has <code>sh:value</code> as
    its <a>predicate</a> and a validation result as its <a>subject</a> is
    the specific <a>node</a> that has caused the result.
    For example, validation results produced as a result of a
    <code>sh:nodeKind</code> constraint use <code>sh:value</code> with the
    <a>value node</a> that does not have the correct node kind as its
    <a>object</a>, while results produced due to a <code>sh:minCount</code>
    violation do not use <code>sh:value</code> because there is no
    individual node that could be mentioned.

(While this is more precise, I doubt that this contributes to readability).
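
To make the triple-centric wording concrete, here is a minimal sketch in Python (not SHACL itself). It assumes a deliberately simplified model in which a triple is a tuple of three strings; the helper names, the ex: nodes, and the result identifiers are made up for illustration, and sh:severity on the result follows the draft wording used in this thread.

```python
# Simplified model (an assumption for illustration): a triple is a
# (subject, predicate, object) tuple of plain strings.

def node_kind_result(result_id, focus_node, value_node):
    """Triples for a result caused by one specific value node."""
    return [
        (result_id, "sh:focusNode", focus_node),
        (result_id, "sh:severity", "sh:Violation"),
        (result_id, "sh:sourceConstraintComponent", "sh:NodeKindConstraintComponent"),
        # The object of the sh:value triple is the offending node itself.
        (result_id, "sh:value", value_node),
    ]


def min_count_result(result_id, focus_node):
    """Triples for a sh:minCount result: no individual node to mention."""
    return [
        (result_id, "sh:focusNode", focus_node),
        (result_id, "sh:severity", "sh:Violation"),
        (result_id, "sh:sourceConstraintComponent", "sh:MinCountConstraintComponent"),
    ]


def values_of(triples, subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


report = (
    node_kind_result("_:r1", "ex:Alice", '"not an IRI"')
    + min_count_result("_:r2", "ex:Bob")
)
```

With this model, values_of(report, "_:r1", "sh:value") yields exactly one object, while the same lookup for "_:r2" yields none, which is the "zero or one values" case described above.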

>
> I assume that an sh:value with a blank node as object may not always 
> be informative, but it is a possibility.

sh:value as blank node is fine assuming that the graph maintains bnode 
ids (which almost all implementations do).

>
>
>>
>>>
>>> 3.4.1.8 Declaring the Severity of a Constraint uses "can" not "MAY",
>>> and gives the default as sh:Violation (Does that mean T/F cannot have
>>> a default?). Better wording would be:
>>>
>>> "The severity level of a constraint violation MAY be coded in the
>>> constraint of a shapes graph using the property sh:severity, which
>>> takes as its value one of the SHACL pre-defined severities, or a
>>> locally defined severity." (followed by remaining sentences)
>>
>> I have applied similar wording to 3.4.8.
>
> I don't see changes in those sections - did the changes actually go in?

I did not use your exact wording, but please verify whether you can live 
with 3.4.8 and 3.4.9 now. I didn't use the term "locally defined 
severity" because it will open more questions such as "in which graph". 
So I went with stating that it can be any IRI.
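
As a rough illustration of that reading, the following Python sketch treats a constraint as a plain dictionary (a hypothetical representation, not how any SHACL engine stores it). It falls back to sh:Violation when no sh:severity is given and accepts any IRI otherwise; ex:Critical is a made-up example value.

```python
# The default mentioned in this thread when sh:severity is absent.
DEFAULT_SEVERITY = "sh:Violation"


def severity_of(constraint):
    """Return the declared severity of a constraint, or the default.

    `constraint` is a dict from property names to values; this shape is
    an assumption made only for this sketch.
    """
    return constraint.get("sh:severity", DEFAULT_SEVERITY)


no_severity = {"sh:minCount": 1}
warning = {"sh:minCount": 1, "sh:severity": "sh:Warning"}
custom = {"sh:minCount": 1, "sh:severity": "ex:Critical"}  # any IRI is allowed
```

Nothing in the sketch checks which graph defines ex:Critical, which is the question the "any IRI" wording is meant to avoid.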

>
>>
>>>
>>> Also, the example given shows the shapes graph, but would be more
>>> informative if it also included the validation report that results.
>>
>> For this to happen, I would also need to create a data graph, then the
>> results graph. This would easily fill two pages. While I agree this
>> would be "informative", I am honestly not convinced whether this is
>> worth the effort. With every paragraph that we add, more stuff will need
>> to be reviewed (and no doubt someone will not like something about
>> them). The current example includes # comments that are IMHO clear
>> enough about what will happen. But if you feel strongly about this, I
>> can expand on the example.
>
> I do think that such an example would be good. It may be possible to 
> show snippets rather than whole SHACL documents. Another option is to 
> put examples at the end of the section. (The turtle syntax document 
> does this.)

I have extended the example as you have requested.

>
>>
>>>
>>> Note that examples throughout do not include sh:severity or sh:message
>>> in constraints, which requires some explanation, perhaps in the
>>> introductory area where examples are described. (I presume that it is
>>> expected that most or many constraints will include a severity, so it
>>> would be a normally occurring property, and that sh:message will also
>>> be common.)
>>
>> I have added two sentences enumerating the mandatory properties:
>>
>>                     The properties <code>sh:focusNode</code> and
>> <code>sh:severity</code> are the only mandatory properties of all
>> validation results.
>>                     The property
>> <code>sh:sourceConstraintComponent</code> is mandatory for validation
>> results produced by violations of <a>constraint components</a>.
>>
>> I hope this addresses the role of mandatory vs optional properties?
>
> Yes, I believe it does. Thanks.
>
>>
>>>
>>> The Example validation report in section 2.2 (Filter shapes) has
>>> sh:severity and sh:message although those are not shown in the shapes
>>> graph.
>>
>> sh:severity is optional and therefore not shown (the default is
>> sh:Violation).
>> sh:message is automatically produced by the engine, although I have
>> recently opened a ticket to also allow it at individual constraints.
>>
>> So technically that's all OK. I could add an explanation about where
>> these properties are coming from, but that's kinda repetitive and would
>> require forward-references to later sections.
>
> You could comment in the example with something like "# See Violations 
> section" - that way, people know that it's something that will be 
> explained later, but it doesn't take up much space.

Added a forward reference as a sentence right before the example.

My latest round of edits is

https://github.com/w3c/data-shapes/commit/71073d04b640b3d9fd646f0a977e4fb9d86cc00d

>
>>
>>
>> As always, it is possible to have different opinions about such
>> editorial changes. If you can live with the current state, let me know
>> so we can close this ticket. Otherwise, please respond with what else
>> needs to be changed.
>
> I discovered where it is said (although not quite directly) that a 
> validation report is only created when the focus node is NOT valid, 
> i.e. fails to meet the criteria of the constraints: it's in the 
> terminology section in the box that begins "Data Graph, Shapes 
> Graph...". It says there:
>
> " A node in a data graph is said to validate against a shape if 
> validation of that node against the shape neither produces any 
> validation results nor results in a failure."
>
> That's a bit subtle, and if nothing else it also needs to be said in 
> the actual validation section. But I think it should be clearer and 
> say that a validation report is only produced for focus nodes that 
> *fail* to validate against the constraints. It needs to be very clear 
> that the validation report is not a report of all of the results of 
> the validation process, but only a report on the failures. Its name 
> does not imply that; actually, it implies that it is reporting on the 
> results of the act of validation, which has at least two possible 
> outcomes: pass/fail (or T/F). So it is important to make this point.

The validation report *is* a report of all results. Failures are 
"exceptions" that basically stop the process and do not produce a report 
at all. Failures are reported by different channels. With this, are you 
able to point at specific changes that need to be made to resolve the issue?
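
The distinction between results and failures could be sketched as follows, in Python rather than SHACL. Everything here is an assumption made for illustration: the ValidationFailure class, the validate and check_positive functions, and the idea of passing the data as a dictionary from focus node to value.

```python
class ValidationFailure(Exception):
    """Hypothetical signal that validation could not be carried out."""


def check_positive(value):
    """Return None if the value conforms, else a message for the result."""
    if not isinstance(value, int):
        # Not a violation: the check itself cannot be evaluated.
        raise ValidationFailure(f"cannot evaluate {value!r}")
    return None if value > 0 else "value must be positive"


def validate(data, check):
    """Return the list of all validation results for `data`.

    A ValidationFailure raised by `check` propagates to the caller,
    so in that case no report is produced at all.
    """
    results = []
    for focus_node, value in data.items():
        message = check(value)
        if message is not None:
            results.append({
                "sh:focusNode": focus_node,
                "sh:severity": "sh:Violation",
                "sh:message": message,
            })
    return results
```

In this sketch a node validates when the run produces no result for it and raises no failure; an empty list is still a complete report, whereas a failure reaches the caller through a different channel (the exception).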

Thanks,
Holger


>
> kc
>
> p.s. Note that I am always willing to actually make the changes myself 
> in my fork if that is easier than doing this through email.
>
>
>>
>> Thanks,
>> Holger
>>
>>
>>
>
Received on Thursday, 6 October 2016 03:58:37 UTC