Re: Human-readable error messages in an R2RML validator from Dimitris Kontokostas on 2015-03-23 (public-data-shapes-wg@w3.org from March 2015)

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Mon, 23 Mar 2015 16:10:53 +0200
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Holger Knublauch <holger@topquadrant.com>, "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
Message-ID: <CA+u4+a0ArOrKiO1=B81QdGwmu5qd7gjststhuhxOH4fWX_vn7g@mail.gmail.com>
On Mon, Mar 23, 2015 at 2:05 PM, Richard Cyganiak <richard@cyganiak.de>
wrote:

> Hi Dimitris,
>
> > On 23 Mar 2015, at 07:45, Dimitris Kontokostas <
> kontokostas@informatik.uni-leipzig.de> wrote:
> >
> > What we further need to investigate is the different types of messages a
> SHACL engine would produce. I see the following possible types of messages:
> >
> > 1) violations of the high level vocabulary facets. e.g. minCount
> > In this case we can either let each SHACL engine produce the messages
> they want i.e. property {x} has less than {y} occurrences in shape {z}
> > or we can define a message vocabulary where each engine must|should pick
> up the template messages to use.
> > The advantage in the latter approach is that all engines will generate
> the same messages to the end user and by accepting outside contributions
> the vocabulary will be easily translated to many languages (as Jose also
> noted).
> > I don't think there is any reason for an end-user to redefine these
> messages since their interpretation is very well defined.
> >
> > 2) messages for a sh:property / sh:Shape
> > I am not sure what type of messages these should produce if we provide
> messages for the facets, maybe some high level comments only?
> > Also by wanting to display this level of messages we would need to
> somehow aggregate errors at the property / shape level and we'd miss the
> error details facets provide.
>
> I think for some facets, in particular regular expression facets, having
> custom messages is necessary. I want to tell users that “this isn’t a valid
> username” instead of “this doesn’t match the regex [a-zA-Z0-9_-]+”.
>
> Here’s an example of a message that might be associated with a shape:
>
>     “The property {property} is only allowed on term maps that generate
> literals, but the rr:termType of {resource} is not rr:Literal.”
>
> Technically speaking this only says that under certain conditions there is
> a cardinality constraint, and that constraint has been violated, but that’s
> much easier to explain to a user with a custom message.
>

This was my idea of a high-level comment, but the problem arises when a
sh:property defines two facets and only one fails. What happens if the
message string has bindings to both facets? Or if we have two SPARQL queries
in the shape and both contain a variable that is requested in the message?

You suggested defining multiple messages and producing only the ones that
have a complete binding (all the facet bindings in the message failed).
Although this could work well, it can be very confusing to end users (who
define & read shapes). RDF does not guarantee ordering, and in a sh:property
with a few facets & messages the relation between a message and a (set of)
facets will not be obvious. It will get even worse if people try to provide
translations on top of that.

On the other hand it would work pretty well if we limit this approach to
properties with a single facet.
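
To make the "complete binding" rule concrete, here is a minimal sketch in
Python. It is only an illustration under my own assumptions, not what any
engine does: the function name render_complete, the placeholder names and the
example bindings are all made up, and the templates sit in a list although in
RDF they would be an unordered set.

```python
import re
from typing import Dict, List

# Matches placeholders such as {property} or {minCount}.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_complete(templates: List[str], bindings: Dict[str, str]) -> List[str]:
    """Render only the templates whose placeholders are all bound."""
    messages = []
    for template in templates:
        names = PLACEHOLDER.findall(template)
        if all(name in bindings for name in names):
            filled = PLACEHOLDER.sub(lambda m: bindings[m.group(1)], template)
            messages.append(filled)
    return messages


# Hypothetical messages attached to one sh:property with two facets.
templates = [
    "{resource} has fewer than {minCount} values for {property}.",
    "The value {value} of {property} does not match the pattern {pattern}.",
]

# Only the minCount facet failed, so {value} and {pattern} stay unbound
# and the second template is skipped.
bindings = {"resource": "ex:Alice", "property": "ex:name", "minCount": "1"}
messages = render_complete(templates, bindings)
```

Note that nothing in the templates themselves says which facet each one
belongs to; a reader has to infer it from the placeholder names, which is
the confusion I am worried about.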



> > 3) messages for sparql queries
> > if the query does not specify the message in the sparql body how would
> it differ from the shape message or a message from another sparql query in
> the same shape? If we want this functionality we'd need to put SPARQL
> queries in intermediate/blank nodes
>
> SPARQL queries might bind extra variables that could become available for
> use in message templates.
>
> > 4) messages for property templates.
> > If we define a message vocabulary we can re-use it for defining these
> types of messages
> >
> > Finally, I also don't see any reason to formalize an intermediate node
> structure for the creation of messages.
> > Since a SHACL engine can return sh:root, sh:predicate & sh:value as
> results, we don't need anything else for an agent to post-process the
> results.
> > Messages are for human consumption only and each engine can create its
> internal intermediate structure for the message generation.
>
> Formalising a data structure for validation reports may be a very simple
> affair. It may just be a matter of defining some names like ?root and
> ?value that have special meaning. A validation report is then simply some
> key-value pairs with keys like ?root and ?value. The output of a
> SELECT-style SPARQL query would directly produce a validation report, but
> other non-SPARQL methods could of course be used to produce an equivalent
> data structure.
>

Looks like we both refer to the same thing. My comment was more on the
message generation process, not the key-value pairs report.
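
As a rough sketch of what I mean by an agent post-processing the key-value
report without any messages, in Python: the keys root, predicate and value
follow the names used in this thread, while the rows themselves and the
helper group_by_root are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical report rows, one dict per violation, as a SELECT-style
# query (or any non-SPARQL method) might return them.
report: List[Dict[str, str]] = [
    {"root": "ex:map1", "predicate": "rr:language", "value": '"en"'},
    {"root": "ex:map1", "predicate": "rr:datatype", "value": "xsd:string"},
    {"root": "ex:map2", "predicate": "rr:language", "value": '"de"'},
]


def group_by_root(rows: List[Dict[str, str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Collect the (predicate, value) pairs reported for each root node."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["root"]].append((row["predicate"], row["value"]))
    return dict(grouped)


violations = group_by_root(report)
```

How an engine turns such rows into human-readable text can then stay an
internal detail of that engine.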

Best,
Dimitris

>
> Best,
> Richard
>
>
> >
> > Best,
> > Dimitris
> >
> >
> > On Mon, Mar 23, 2015 at 3:36 AM, Holger Knublauch <
> holger@topquadrant.com> wrote:
> > On 3/20/2015 22:01, Richard Cyganiak wrote:
> >
> > An advantage of using templates for the validation messages, rather than
> producing a string message through a SPARQL expression, is that we can
> format the nodes nicely or make things interactive. For example, the
> {object} placeholder in the R2RML validator will intelligently pretty-print
> the node in Turtle style as a prefixed name, full URI, literal, or blank
> node, using the prefix mapping of the file under validation. And in a
> hypothetical graphical environment, URI nodes in the rendered message could
> be rendered using their rdfs:label, and still be made clickable.
> >
> > Absolutely. Indeed we use SPIN label templates for the same purpose in
> various TopBraid user interfaces.
> >
> > Not having thought about it too much, my intuition is that instead of
> conditional insertion, I’d prefer the option of having multiple template
> strings on a single constraint. Only those where all placeholders have
> bound values in the validation data structure would produce a message.
> >
> > This sounds like a good idea!
> >
> > I will add a TODO item to sh:message to make sure we revisit this
> topic once there is time for such details.
> >
> > Thanks,
> > Holger
> >
> >
> >
> >
> >
> > --
> > Dimitris Kontokostas
> > Department of Computer Science, University of Leipzig
> > Research Group: http://aksw.org
> > Homepage:http://aksw.org/DimitrisKontokostas
>
>
>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage:http://aksw.org/DimitrisKontokostas
Received on Monday, 23 March 2015 14:11:57 UTC