RE: Question on Handling of Ill-formed Shapes Graphs

Holger,

On Sunday, February 19, 2017 11:58 PM, Holger Knublauch [mailto:holger@topquadrant.com] wrote:
> 
> On 18/02/2017 0:58, Svensson, Lars wrote:
> > Hello Holger,
> >
> > Apologies if I stirred up a hornet's nest... I thought it would be fairly uncontentious.
> >
> > On Thursday, February 16, 2017 10:55 PM, Holger Knublauch [mailto:holger@topquadrant.com] wrote:
> >
> >> there are two major reasons for the current wording, basically due to
> >> the complexity of the many syntax rules:
> >>
> >> 1) If we were to make it a MUST then each SHACL implementation would
> >> have to implement all the syntax rules, and we as the WG would need to
> >> define test cases for all kinds of invalid structures. The SHOULD lowers
> >> the barrier of entry and the formal process issues significantly.
> > I don't quite grasp this. The way I read your explanation, it would mean that a SHACL implementation is not required to implement all the syntax rules. That is a contradiction to the conformance criteria [1] for "SHACL Core processors as processors that support validation with the SHACL Core Language" which in my reading means that a conformant SHACL Core processor MUST support all syntactic rules in the SHACL Core Language (and similarly a conformant SHACL-SPARQL processor MUST support all syntactic rules in the SHACL-SPARQL Language).
> 
> That's not how I would read the conformance section. SHACL Core is
> explicitly not required (via MUST) to do syntax checking - so compliant
> processors merely MUST support validation following these rules.

Right, but implementations are required to _support_ all syntax rules. My remark was not targeted at the spec but at your comment, where you say "If we were to make it a MUST then each SHACL implementation would have to implement all the syntax rules". I think we all agree that a conforming implementation MUST implement and understand all syntax rules, even if its behaviour in the case of an ill-formed shape is not specified.
 
> >
> > To me Peter's suggestion to split between SHACL Core and SHACL-SPARQL syntax checking sounds sound at the _conceptual_ level. I do understand the point though, that it lowers the barrier of entry at the _technical_ level and particularly the formal process.
> >
> >> 2) It would require validation (for well-formedness) of the shapes graph
> >> and this is a very expensive operation. In many scenarios such as
> >> interactive data entry tools, the shapes graph is identical to the data
> >> graph (or at least is part of the imports closure). If you make an edit,
> >> then the shapes may become invalid. This means that a validator would
> >> have to perform checking of the shapes before each validation, and this
> >> is prohibitively expensive in cases like form validation in real time,
> >> for each instance.
> > Thank you, this is the kind of case I was looking for.
> >
> > What worries me with the current text is that a user could be made believe that if the result of a SHACL validation process is undefined when the shapes graph is ill-formed the processor could in fact return sh:conforms true. One solution could be to add the requirement that if a SHACL processor does not produce a failure in the case of an ill-formed graph, it MUST NOT produce a result with the value sh:conforms true. (I. e. the default result of such an processor must be sh:conforms false).
> 
> The Shapes WG has not defined an API for SHACL. This is significant,
> because it does not prescribe a programmatic interface for applications.
> Each implementation will offer its own interfaces and different
> parameters. I expect that implementations will offer flags to indicate
> what levels of features should be activated. In TopBraid's engine, we
> have a "flag" (via a SHShape filter) that activates or deactivates
> checking of the shapes themselves.

OK, that makes sense.

> As a result of this, the caller of the API already knows whether it will
> perform shapes validation or not. Therefore I don't see why we would
> need to explicitly report this back.

OK.

> Overall I believe the usual workflow will be that people develop their
> shapes and use a meta validator until the shapes graph is valid. Only
> then they put it into practice, with a set of shapes that are already
> tested for correctness. There is no need to test this over and over again.

The question is what happens if a processor downloads an ill-formed shape from a third party website as part of the processing. I'm talking about the following data flow:

1) User Agent requests an RDF graph from a server
2) Server returns the requested graph and claims that the graph conforms to a specific shape (e.g. by referencing the shape in the RDF graph or through an HTTP header or whatever). The shapes graph can reside on a third-party website.
3) The UA requests the shapes graph from the (third-party) server
4) The server returns the shapes graph
5) The UA validates (or delegates the validation of) the data graph against the shapes graph.

In this case, neither the UA nor the user it proxies for can know beforehand whether the shapes graph is well-formed. I don't think that this is a corner case; I think it will be very common.

But yes, in those cases I guess that the UA will _always_ check the shapes graph for well-formedness, or (probably more performant) check it the first time and then again only when it detects that the shapes graph has changed.
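
To make the "check once, then only on change" idea concrete, here is a minimal sketch. It assumes the UA has the downloaded shapes graph as bytes and has some expensive well-formedness check available; the class ShapesCheckCache and the check_well_formed callable are hypothetical names for illustration, not part of SHACL or of any particular implementation.

```python
import hashlib
from typing import Callable, Dict


class ShapesCheckCache:
    """Remember the well-formedness verdict per shapes-graph content digest."""

    def __init__(self, check_well_formed: Callable[[bytes], bool]) -> None:
        # check_well_formed is a hypothetical, expensive syntax check
        # supplied by the caller (e.g. a meta validator).
        self._check = check_well_formed
        self._verdicts: Dict[str, bool] = {}
        self.checks_run = 0  # counts how often the expensive check ran

    def is_well_formed(self, shapes_bytes: bytes) -> bool:
        # A changed serialization yields a new digest and thus a new check.
        digest = hashlib.sha256(shapes_bytes).hexdigest()
        if digest not in self._verdicts:
            self.checks_run += 1
            self._verdicts[digest] = self._check(shapes_bytes)
        return self._verdicts[digest]
```

Note that a byte-level digest also changes when only the serialization changes (e.g. reordered triples), so this sketch may re-check more often than strictly necessary, but it never skips a check for changed content.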

That said, I still propose to add the requirement that if a SHACL processor does not produce a failure in the case of an ill-formed shapes graph, it MUST NOT produce a result with the value sh:conforms true (i.e. the default result of such a processor must be sh:conforms false). That would make the whole system more robust.
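
A rough sketch of what such a requirement could look like from the caller's side, assuming a processor that does detect the ill-formed shapes graph. All names here (Report, IllFormedShapesGraph, validate_guarded and its parameters) are hypothetical; the WG has not defined an API, so this is only one possible reading.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Report:
    conforms: bool   # stands in for the sh:conforms value
    note: str = ""


class IllFormedShapesGraph(Exception):
    """Stands in for the processor reporting a failure."""


def validate_guarded(
    data_graph: Any,
    shapes_graph: Any,
    validate: Callable[[Any, Any], Report],   # hypothetical validation function
    is_well_formed: Callable[[Any], bool],    # hypothetical syntax check
    raise_failure: bool = False,
) -> Report:
    # With an ill-formed shapes graph the processor either reports a
    # failure or returns conforms=False, but never conforms=True.
    if not is_well_formed(shapes_graph):
        if raise_failure:
            raise IllFormedShapesGraph("shapes graph is ill-formed")
        return Report(conforms=False, note="shapes graph is ill-formed")
    return validate(data_graph, shapes_graph)
```

With this guard, even a validation function that would otherwise answer "conforms" cannot do so for a shapes graph that failed the check.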

Best,

Lars

Received on Monday, 20 February 2017 10:49:32 UTC