Re: Question on Handling of Ill-formed Shapes Graphs

I have raised https://www.w3.org/2014/data-shapes/track/issues/233 to 
discuss this in the WG.

Holger


On 20/02/2017 20:32, Svensson, Lars wrote:
> Holger,
>
> On Sunday, February 19, 2017 11:58 PM, Holger Knublauch [mailto:holger@topquadrant.com] wrote:
>> On 18/02/2017 0:58, Svensson, Lars wrote:
>>> Hello Holger,
>>>
>>> Apologies if I stirred up a hornet's nest... I thought it would be fairly uncontentious.
>>>
>>> On Thursday, February 16, 2017 10:55 PM, Holger Knublauch [mailto:holger@topquadrant.com] wrote:
>>>> there are two major reasons for the current wording, basically due to
>>>> the complexity of the many syntax rules:
>>>>
>>>> 1) If we were to make it a MUST then each SHACL implementation would
>>>> have to implement all the syntax rules, and we as the WG would need to
>>>> define test cases for all kinds of invalid structures. The SHOULD lowers
>>>> the barrier of entry and the formal process issues significantly.
>>> I don't quite grasp this. The way I read your explanation, it would mean that a
>>> SHACL implementation is not required to implement all the syntax rules. That is a
>>> contradiction to the conformance criteria [1] for "SHACL Core processors as
>>> processors that support validation with the SHACL Core Language" which in my reading
>>> means that a conformant SHACL Core processor MUST support all syntactic rules in the
>>> SHACL Core Language (and similarly a conformant SHACL-SPARQL processor MUST
>>> support all syntactic rules in the SHACL-SPARQL Language).
>>
>> That's not how I would read the conformance section. SHACL Core is
>> explicitly not required (via MUST) to do syntax checking - so compliant
>> processors merely MUST support validation following these rules.
> Right, but implementations are required to _support_ all syntax rules. My remark was not targeted at the spec but at your comment, where you say "If we were to make it a MUST then each SHACL implementation would have to implement all the syntax rules" and I think we all agree that a conforming implementation MUST implement and understand all syntax rules even if its behaviour in case of an ill-formed shape is not specified.
>   
>>> To me Peter's suggestion to split between SHACL Core and SHACL-SPARQL syntax
>>> checking sounds sound at the _conceptual_ level. I do understand the point though,
>>> that it lowers the barrier of entry at the _technical_ level and particularly the formal
>>> process.
>>>> 2) It would require validation (for well-formedness) of the shapes graph
>>>> and this is a very expensive operation. In many scenarios such as
>>>> interactive data entry tools, the shapes graph is identical to the data
>>>> graph (or at least is part of the imports closure). If you make an edit,
>>>> then the shapes may become invalid. This means that a validator would
>>>> have to perform checking of the shapes before each validation, and this
>>>> is prohibitively expensive in cases like form validation in real time,
>>>> for each instance.
>>> Thank you, this is the kind of case I was looking for.
>>>
>>> What worries me with the current text is that a user could be led to believe that, if
>>> the result of a SHACL validation process is undefined when the shapes graph is
>>> ill-formed, the processor could in fact return sh:conforms true. One solution could be to
>>> add the requirement that if a SHACL processor does not produce a failure in the case
>>> of an ill-formed graph, it MUST NOT produce a result with the value sh:conforms true
>>> (i.e. the default result of such a processor must be sh:conforms false).
>>
>> The Shapes WG has not defined an API for SHACL. This is significant,
>> because it does not prescribe a programmatic interface for applications.
>> Each implementation will offer its own interfaces and different
>> parameters. I expect that implementations will offer flags to indicate
>> what levels of features should be activated. In TopBraid's engine, we
>> have a "flag" (via a SHShape filter) that activates or deactivates
>> checking of the shapes themselves.
> OK, that makes sense.
>
>> As a result of this, the caller of the API already knows whether it will
>> perform shapes validation or not. Therefore I don't see why we would
>> need to explicitly report this back.
> OK.
>
>> Overall I believe the usual workflow will be that people develop their
>> shapes and use a meta validator until the shapes graph is valid. Only
>> then do they put it into practice, with a set of shapes that are already
>> tested for correctness. There is no need to test this over and over again.
> The question is what happens if a processor downloads an ill-formed shape from a third party website as part of the processing. I'm talking about the following data flow:
>
> 1) User Agent requests an RDF graph from a server
> 2) Server returns the requested graph and claims that the graph conforms to a specific shape (e.g. by referencing the shape in the RDF graph or through an HTTP header or whatever). The shapes graph can reside on a third-party website.
> 3) The UA requests the shapes graph from the (third-party) server
> 4) The server returns the shapes graph
> 5) The UA validates (or delegates the validation of) the data graph against the shapes graph.
>
> In this case, neither the UA nor the user it proxies for can know beforehand whether the shapes graph is well-formed. I don't think that this is a corner case; I think it will be very common.
>
> But yes, in those cases I guess that the UA will _always_ perform validation. Or (probably more performant) validate the shapes graph the first time and then only when it detects that the shapes graph has changed.
>
> That said, I still propose to add the requirement that if a SHACL processor does not produce a failure in the case of an ill-formed graph, it MUST NOT produce a result with the value sh:conforms true (i.e. the default result of such a processor must be sh:conforms false). That would make the whole system more robust.
>
> Best,
>
> Lars
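
The two ideas in the quoted message (check the shapes graph only when it has changed, and never report sh:conforms true for an ill-formed shapes graph) could be sketched roughly as follows. This is only an illustration under assumptions: the names CachingShapesChecker, validate, is_well_formed and run_validation are hypothetical and not part of SHACL or of any implementation, and hashing the serialized bytes only detects a changed serialization, not a changed RDF graph.

```python
import hashlib
from typing import Callable, Dict


class CachingShapesChecker:
    """Caches well-formedness results, keyed by a digest of the serialized shapes graph."""

    def __init__(self, is_well_formed: Callable[[bytes], bool]):
        # is_well_formed is a hypothetical, caller-supplied (and expensive) syntax check.
        self._is_well_formed = is_well_formed
        self._cache: Dict[str, bool] = {}
        self.checks_run = 0  # counts how often the expensive check actually ran

    def check(self, shapes_bytes: bytes) -> bool:
        digest = hashlib.sha256(shapes_bytes).hexdigest()
        if digest not in self._cache:
            # First time this serialization is seen: run the expensive check once.
            self.checks_run += 1
            self._cache[digest] = self._is_well_formed(shapes_bytes)
        return self._cache[digest]


def validate(data, shapes_bytes: bytes, checker: CachingShapesChecker,
             run_validation: Callable[[object, bytes], bool]) -> dict:
    """Conservative default: an ill-formed shapes graph never yields conforms=True."""
    if not checker.check(shapes_bytes):
        return {"conforms": False, "failure": "ill-formed shapes graph"}
    # run_validation is a hypothetical stand-in for the actual validation engine.
    return {"conforms": run_validation(data, shapes_bytes)}
```

A user agent fetching shapes from a third-party server could keep one such checker per session, so that repeated validations against an unchanged shapes graph skip the syntax check.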

Received on Tuesday, 21 February 2017 01:24:26 UTC