[SHAQL Feedback] SPARQL & definition/result coupling

Hi Holger, all

First of all, I think the spec is heading in a good direction. It needs some
work to be coherent, but that was expected given the limited time.

As you suggested, I am summarizing and continuing our offline discussion on
the mailing list for public review.

I also want SPARQL expressiveness in the spec, but IMHO if we leave it
uncontrolled it can get messy.
My suggestion (revised based on your comments) is to allow *only* SELECT
queries whose select variables are limited to the ones defined at the
end of section 15.1.3 [1] (?root/?this, ?level, ?message, ?predicate,
?value) or to any other variable that is explicitly defined in the spec. An
additional requirement is that the variable ?root (or ?this in focus node
queries) must be present; a query without it is marked as invalid.
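
To illustrate the check described above, here is a minimal sketch in Python.
It is only an assumption of how an engine might do it: the function names
are hypothetical, and the regex-based extraction is naive (it would also
pick up variables inside projection expressions), so a real engine would
use a proper SPARQL parser.

```python
import re

# Variables the proposal allows in the SELECT clause (taken from the list above).
ALLOWED_VARS = {"root", "this", "level", "message", "predicate", "value"}


def projected_variables(query: str) -> list[str]:
    """Return the variable names found between SELECT and WHERE (or the first '{')."""
    match = re.search(
        r"\bSELECT\b(.*?)(?:\bWHERE\b|\{)", query, re.IGNORECASE | re.DOTALL
    )
    if match is None:
        raise ValueError("not a SELECT query")
    return re.findall(r"[?$](\w+)", match.group(1))


def check_query(query: str, extra_allowed: frozenset[str] = frozenset()) -> list[str]:
    """Return a list of problems; an empty list means the query is acceptable."""
    try:
        variables = projected_variables(query)
    except ValueError as err:
        return [str(err)]
    problems = []
    # ?root (or ?this for focus node queries) is mandatory.
    if "root" not in variables and "this" not in variables:
        problems.append("missing required variable ?root (or ?this)")
    # Every other projected variable must be one the spec defines.
    for name in variables:
        if name not in ALLOWED_VARS | extra_allowed:
            problems.append(f"variable ?{name} is not defined in the spec")
    return problems
```

The extra_allowed parameter is a hypothetical hook for "any other variable
that is explicitly defined in the spec".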

Depending on what results the user wants to get, the SHAQL engine can
convert this query to:
- ASK -> true / false
- SELECT COUNT(DISTINCT ?root) -> violation counts
- SELECT DISTINCT ?root -> get erroneous nodes
- SELECT DISTINCT ?root ?label [...] -> nodes with additional metadata
- CONSTRUCT { ... } -> custom
and then programmatically create the results.
If people want additional metadata, it can easily be added with shape
annotations. This already works well in RDFUnit [2]
and could be adapted for SHAQL.
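
The conversions listed above could be sketched roughly as follows. This is
a naive textual rewrite under stated assumptions, not how RDFUnit or any
SHAQL engine actually does it: the function name, the form keys and the
?violations alias are hypothetical, and it assumes the prologue contains no
"SELECT" or "{" of its own.

```python
import re

# Splits a query into: prologue, the SELECT projection (discarded), and the
# rest starting at WHERE or the first '{'.
_SELECT = re.compile(
    r"^(.*?)\bSELECT\b.*?(?=\bWHERE\b|\{)(.*)$", re.IGNORECASE | re.DOTALL
)

# Replacement query heads for each result form the user may ask for.
_HEADS = {
    "ask": "ASK ",
    "count": "SELECT (COUNT(DISTINCT ?root) AS ?violations) ",
    "nodes": "SELECT DISTINCT ?root ",
}


def rewrite(query: str, form: str) -> str:
    """Swap the SELECT head of a constraint query for another query form."""
    match = _SELECT.match(query)
    if match is None:
        raise ValueError("not a SELECT query")
    prologue, body = match.group(1), match.group(2)
    return prologue + _HEADS[form] + body
```

For example, rewrite(q, "ask") turns
"SELECT ?root ?value WHERE { ... }" into "ASK WHERE { ... }", so the shape
author only ever writes the SELECT form.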

The reason why we *should not* allow CONSTRUCT queries is the very strong
coupling of constraint definition with validation results.
First, CONSTRUCT queries cannot easily be transformed into all of the above
forms, so we limit SHAQL's reporting expressiveness.
More importantly, we hardcode the results in the shape, which leaves very
little flexibility in the result representation and makes it hard to
change when/if the spec changes.
In addition, CONSTRUCT queries limit the results to blank nodes, which some
people don't like.
Finally, when there are multiple violation values (sh:value), CONSTRUCT
creates a separate violation for each value, whereas in RDFUnit I use a
- SELECT ?root ?value {...} ORDER BY ?root
and manually place all values (or any other defined annotations) in the
same violation.
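
The grouping step can be sketched like this. It is an illustration only,
not RDFUnit's actual code: the function name and the dictionary layout are
hypothetical, and the rows are assumed to be (root, value) pairs already
sorted by ?root, as the ORDER BY guarantees.

```python
from itertools import groupby
from operator import itemgetter


def group_violations(rows):
    """Merge (root, value) rows, sorted by root, into one violation per root."""
    return [
        # groupby yields consecutive rows sharing the same root, which is
        # why the input must be ordered by ?root.
        {"root": root, "values": [value for _, value in group]}
        for root, group in groupby(rows, key=itemgetter(0))
    ]
```

With rows for ex:a (two values) and ex:b (one value), this yields two
violations instead of the three that a per-row CONSTRUCT template would
produce.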

I would also discourage the use of ASK queries, to make the spec shorter and
more consistent. People can easily write a SELECT ?root/?this instead, with
all the expressive benefits this brings.

Any feedback is welcome

Best,
Dimitris


[1]
http://w3c.github.io/data-shapes/data-shapes-core/#sparql-constraints-select
[2]
https://github.com/AKSW/RDFUnit/blob/master/rdfunit-core/src/main/resources/org/aksw/rdfunit/patterns.ttl#L33-L37

-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage:http://aksw.org/DimitrisKontokostas

Received on Friday, 27 February 2015 08:14:50 UTC