ISSUE-71 (was: ISSUE-176: Rules will not modify the data graph) from Holger Knublauch on 2016-09-06 (public-data-shapes-wg@w3.org from September 2016)

From: Holger Knublauch <holger@topquadrant.com>
Date: Tue, 6 Sep 2016 11:16:10 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <b79402c9-aa9d-82f2-2f3e-dec95e45f12f@topquadrant.com>

On 6/09/2016 1:13, Karen Coyle wrote:
> ISSUE-71 is about validation that takes place within the SHACL 
> validation workflow. If it were not, then it wouldn't be an issue and 
> a requirement.

I believe what you are referring to (also in your edits to ISSUE-71 on 
the Proposals page [1]) had previously been discussed under ISSUE-80 [2], 
which was closed by introducing sh:stem. Although we had discussed the 
issue of de-referencing resources at runtime a couple of times, I 
believe the consensus was that this opens up a whole lot of complexity 
and that such a feature is too big and too unwieldy for the Core language.

What originally motivated ISSUE-71 is the case where large bodies 
of data are behind a SPARQL endpoint, and it would be very slow if the 
engine had to live outside of the endpoint and pass thousands of 
SPARQL queries back and forth during validation. Instead, the idea would 
be to send the whole shapes graph over in a single transaction and then 
just wait for the results to come back. Like rules, this would happen 
*before* validation, as a completely separate process.
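
To make the round-trip argument concrete, here is a minimal Python sketch. 
Everything in it is an assumption made for illustration: graphs are plain 
sets of string triples, a "shape" is reduced to a (target class, required 
predicate) pair, and CountingEndpoint with its subjects/objects/validate 
methods is a hypothetical stand-in, not any real SPARQL or SHACL API.

```python
RDF_TYPE = "rdf:type"


def check(triples, shapes):
    """Return sorted (node, predicate) pairs where a required value is missing."""
    violations = []
    for target_class, predicate in shapes:
        nodes = {s for (s, p, o) in triples if p == RDF_TYPE and o == target_class}
        for node in nodes:
            if not any(s == node and p == predicate for (s, p, o) in triples):
                violations.append((node, predicate))
    return sorted(violations)


class CountingEndpoint:
    """Hypothetical remote endpoint that counts every request it receives."""

    def __init__(self, triples):
        self._triples = set(triples)
        self.round_trips = 0

    def subjects(self, predicate, obj):
        # One remote query: all subjects with the given predicate and object.
        self.round_trips += 1
        return {s for (s, p, o) in self._triples if p == predicate and o == obj}

    def objects(self, subject, predicate):
        # One remote query: all objects for the given subject and predicate.
        self.round_trips += 1
        return {o for (s, p, o) in self._triples if s == subject and p == predicate}

    def validate(self, shapes):
        # Assumed protocol call: ship the shapes once, validate next to the data.
        self.round_trips += 1
        return check(self._triples, shapes)


def validate_from_outside(endpoint, shapes):
    """Engine living outside the endpoint: one query per shape plus one per node."""
    violations = []
    for target_class, predicate in shapes:
        for node in endpoint.subjects(RDF_TYPE, target_class):
            if not endpoint.objects(node, predicate):
                violations.append((node, predicate))
    return sorted(violations)


DATA = [
    ("ex:a1", RDF_TYPE, "ex:Address"),
    ("ex:a1", "ex:zipCode", "zip:2000"),
    ("ex:a2", RDF_TYPE, "ex:Address"),
]
SHAPES = [("ex:Address", "ex:zipCode")]

outside = CountingEndpoint(DATA)
inside = CountingEndpoint(DATA)
print(validate_from_outside(outside, SHAPES), outside.round_trips)
print(inside.validate(SHAPES), inside.round_trips)
```

Both calls report the same violation, but the outside engine needs one 
request per shape plus one per target node, while the endpoint-side call 
needs a single request no matter how large the data is.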

Where your use case of de-referencing and ISSUE-71 overlap is that once 
the SHACL Protocol is defined, it would arguably be easier 
to define a mechanism in which shapes are validated remotely. Off the 
top of my head, a solution could be to have a marker property, say 
sh:external, to indicate that a given shape must be validated in a 
different place. For example, assuming that the address of zip codes is 
remote and not part of the current data graph:

ex:AddressShape
     a sh:Shape ;
     sh:property [
         sh:predicate ex:zipCode ;
         sh:class zip:Code ;
         sh:external [
             a sh:SHACLProtocol ;
             sh:server <http://zipcodes.org/shacl> ;
         ]
     ] .

Above, the property sh:external would signal that the surrounding 
property constraint must be executed against a given remote server that 
supports the yet-to-be-defined SHACL protocol. There will be other 
"external" sources such as other graphs in the same dataset or vanilla 
SPARQL endpoints (that would translate to a SERVICE keyword in SPARQL). 
And yet another, much simpler, strategy would be to have the engine 
download the missing resources beforehand, and add them to the data 
graph. Standard Linked Data practices for that already exist. I believe 
this work-around was the preferred solution of the WG for the time being.
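
The simpler work-around can be sketched in a few lines of Python. This is 
only an illustration under stated assumptions: graphs are sets of string 
triples, the helper names (prefetch, class_violations, fake_fetch) are 
hypothetical, the sh:class check ignores subclass reasoning, and fake_fetch 
is a dictionary lookup standing in for whatever Linked Data de-referencing 
a real engine would perform.

```python
RDF_TYPE = "rdf:type"


def prefetch(data_graph, predicate, fetch):
    """Return a new graph: the data plus fetched descriptions of values of
    `predicate` that have no triples of their own in the data graph."""
    original = set(data_graph)
    enriched = set(original)
    values = {o for (s, p, o) in original if p == predicate}
    for node in sorted(values):
        described = any(s == node for (s, p, o) in original)
        if not described:
            enriched |= set(fetch(node))
    return enriched


def class_violations(graph, predicate, required_class):
    """Return sorted (subject, value) pairs whose value lacks the required type."""
    graph = set(graph)
    return sorted(
        (s, o)
        for (s, p, o) in graph
        if p == predicate and (o, RDF_TYPE, required_class) not in graph
    )


# Assumed remote content; a real engine would de-reference the IRI instead.
REMOTE = {"zip:2000": [("zip:2000", RDF_TYPE, "zip:Code")]}


def fake_fetch(node):
    """Stand-in for downloading the description of a resource."""
    return REMOTE.get(node, [])


DATA = [
    ("ex:a1", "ex:zipCode", "zip:2000"),
    ("ex:a2", "ex:zipCode", "zip:9999"),
]

enriched = prefetch(DATA, "ex:zipCode", fake_fetch)
print(class_violations(DATA, "ex:zipCode", "zip:Code"))
print(class_violations(enriched, "ex:zipCode", "zip:Code"))
```

Before the download step both zip codes fail the class check because their 
types are not in the data graph; afterwards only the value that the remote 
source does not describe is still reported. The validation step itself 
never touches the network, which is what keeps this strategy simple.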

All this is certainly an interesting area but it requires someone who 
doesn't just have rough ideas (like myself) but who is willing to spend 
considerable time on this topic to write it all down and work out the 
details. It would likely become a separate deliverable. So if anyone is 
interested in having their name on such a W3C publication, please step up. 
Without commitment, it's just another unaddressed requirement due to 
lack of resources in the group.

Back to the topic of rules, having them as a separate deliverable is of 
course also an option. If we do this, then I would hope that we can at 
least mint some reserved URIs in the sh: namespace, to make the syntax 
easier to use.

Karen, I acknowledge your use cases; please also acknowledge mine.

Thanks,
Holger

[1] https://www.w3.org/2014/data-shapes/wiki/Proposals#Issue_71:_SHACL_Endpoint_Protocol
[2] https://www.w3.org/2014/data-shapes/track/issues/80


>
> kc
>
> On 9/4/16 3:13 PM, Holger Knublauch wrote:
>> I cannot follow this train of thought. According to that logic, the
>> SHACL network protocol ISSUE-71 (that you seem to want) cannot be part
>> of SHACL either. We should standardize what is *useful*, not because of
>> some artificial boundaries. Rules are the most popular feature in SPIN,
>> and here is an opportunity to make SHACL more useful at low cost. Rules
>> are in the same category as other forms of entailment, which are
>> officially part of SHACL, see sh:entailment.
>>
>> Holger
>>
>>
>> On 5/09/2016 3:42, Karen Coyle wrote:
>>> If it happens BEFORE the invocation of a SHACL graph/data graph
>>> comparison, then it cannot be part of the SHACL standard. After all,
>>> we haven't included the creation of explicit rdf:type statements
>>> within SHACL.
>>>
>>> kc
>>>
>>> On 8/31/16 11:59 PM, Holger Knublauch wrote:
>>>> From the recent meeting minutes I can see that Ted remarked [1]
>>>>
>>>> long ago we decided that SHACL engines would be fed a graph which it
>>>> would validate, and that SHACL engines would not change that graph
>>>> before validation ... but this reverses that and re-opens many past
>>>> decisions
>>>>
>>>> I agree with the previous decision and notice that the wording in the
>>>> proposed section was not clear. I have changed the wiki page to clarify
>>>> that the execution of rules happens *before* the data graph is produced,
>>>> i.e. the data graph is the result of applying rules on some other
>>>> "input" graph. Rules will not modify the data graph, but operate in the
>>>> same way that other entailments are implemented.
>>>>
>>>> Holger
>>>>
>>>> [1] https://www.w3.org/2016/08/25-shapes-minutes.html#item04
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>
Received on Tuesday, 6 September 2016 01:16:46 UTC