Re: ISSUE-68: Updated definition

On 10/03/2016 1:17, Peter F. Patel-Schneider wrote:
> On 03/09/2016 12:46 AM, Holger Knublauch wrote:
>> On 9/03/2016 18:17, Peter F. Patel-Schneider wrote:
>>> I'm pretty sure that this fails in a number of places.
>>> It can break the shared variable connection for MINUS.   (I think that FILTER
>>> is OK, but I'm not sure.)
>> Do you have an example for this?
> SELECT ?this
> WHERE { ?this ex:a ex:b MINUS { ?this ex:a ex:b } }

Ok, this requires further thought. Thanks for bringing this up. In the 
worst case, we exclude the MINUS keyword - it is rarely used and 
work-arounds (FILTER) exist. Queries that use MINUS would be reported 
as an error. There are precedents for this, e.g. we also don't support 
CONSTRUCT queries. Would this be an OK minimal solution for now? We 
could revisit this if there is spare time at the end of the WG.
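
To make the MINUS problem concrete, here is a minimal Python sketch of the MINUS semantics over solution mappings (plain dicts). The function names and the one-triple data set are assumptions for illustration only. SPARQL MINUS removes a left-hand solution only if a compatible right-hand solution shares at least one variable with it; once ?this is textually substituted, both sides are ground patterns with no variables left, so nothing is removed.

```python
def compatible(m1, m2):
    """Two solution mappings are compatible if they agree on shared variables."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def minus(left, right):
    """SPARQL MINUS: drop a left solution only if some right solution is
    compatible with it AND shares at least one variable with it."""
    return [
        m for m in left
        if not any(compatible(m, r) and (m.keys() & r.keys()) for r in right)
    ]


# Assumed data: the single triple  ex:x ex:a ex:b , with ?this pre-bound to ex:x.

# Unsubstituted query: both sides of MINUS bind ?this, so the solution is removed.
original = minus([{"this": "ex:x"}], [{"this": "ex:x"}])

# After substituting ex:x for ?this, each side matches the data once but binds
# no variables, so each yields one empty solution mapping and MINUS is a no-op.
substituted = minus([{}], [{}])

print(original)     # no solutions
print(substituted)  # one (empty) solution
```

Under these assumptions the two readings disagree: binding ?this as a variable gives no result, while textual substitution gives one.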

>>> The substitution can modify variables from different scopes, which will change
>>> results.
>> Do you have an example for this?
> SELECT ?this ?that
> WHERE { ?this ex:a ex:b
>     SELECT ?that WHERE { ?this ex:a ?that } }

The definition states that substitution also happens in nested SELECTs. 
I believe this meets user expectations, and would be needed for cases 
like sh:minCount that use a nested SELECT. I don't quite see a problem 
with the above. Do you have data to illustrate why this would cause 

>>> Skolemization in the SPARQL code means that the blank node will not match back
>>> to itself in the graph it came from.
>> Conceptually, the bnodes will also need to be skolemized in the data graph.
>> All this is an entirely conceptual definition. Actual implementations are
>> unlikely to ever use this mechanism, but instead operate on Algebra and API
>> level.
> You can't provide something that may not be possible in practice unless there
> is something that is possible in practice, and you haven't shown that there
> is that something.  If the only way to do this is to skolemize the data
> graph as well, which wasn't even in the description, then you haven't shown
> that there is an acceptable way to handle bnodes.

I have received this input from a colleague. Here is how he would define 
the process:

    For query Q and data D, let there be a skolemization function SK
    that maps blank nodes to URIs

    SK : RDF Term -> RDF Term
        SK: blank node => URI distinct from any URI in the data and any
    other skolemization URI.
        SK: term => term    otherwise

    SK is 1-1.
    Write SKinv for the inverse function of SK which undoes the blank
    node mapping.

    Write SK(X) for SK applied to all the RDF terms in X (X can be data,
    a query or query results).

    Write Q evaluated on D as Q(D).

    Write Q evaluated with ?v=T as Q[?v=T].

    For pre-binding of variable ?v to RDF term T, the result of
    evaluating Q on D is defined as

    Q[?v=T](D) = SKinv ( Q[?v=SK(T)](SK(D)) )

    i.e. write SK(T) into Q to give Q1, evaluate Q1 on SK(D), then undo
    the skolemization.

This is not complete enough - it is a sketch of what we could do and needs 
clearing up to be suitable for the SHACL doc. As it may take a long time 
to come up with a robust definition, we don't want to spend time on it 
unless it is an agreed way forward.
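
The definition quoted above can be illustrated with a small, self-contained Python sketch. Everything here is an assumption made for illustration: terms are plain strings, blank nodes are strings starting with "_:", the skolem prefix "urn:skolem:" is assumed not to occur in the data, and a "query" is a single triple pattern. As in SPARQL, a blank node written inside a pattern behaves like a variable, which is why naive substitution of a blank node does not match that node back.

```python
import itertools


def make_sk(prefix="urn:skolem:"):
    """Return (sk, sk_inv): a 1-1 blank-node-to-URI mapping and its inverse."""
    forward, backward = {}, {}
    counter = itertools.count()

    def sk(term):
        if not term.startswith("_:"):
            return term                      # SK: term => term otherwise
        if term not in forward:
            uri = f"{prefix}{next(counter)}"
            forward[term], backward[uri] = uri, term
        return forward[term]

    def sk_inv(term):
        return backward.get(term, term)

    return sk, sk_inv


def evaluate(pattern, data):
    """Evaluate one triple pattern; '?x' and '_:x' in the pattern are variables,
    but only '?x' variables appear in the returned solutions."""
    results = []
    for triple in data:
        binding = {}
        for p, t in zip(pattern, triple):
            if p.startswith(("?", "_:")):
                if binding.setdefault(p, t) != t:
                    break
            elif p != t:
                break
        else:
            results.append({k: v for k, v in binding.items() if k.startswith("?")})
    return results


def prebound(pattern, var, term, data):
    """Q[?v=T](D) = SKinv( Q[?v=SK(T)](SK(D)) )."""
    sk, sk_inv = make_sk()
    q1 = tuple(sk(term) if p == var else p for p in pattern)
    sk_data = {tuple(sk(t) for t in triple) for triple in data}
    return [{v: sk_inv(t) for v, t in row.items()} for row in evaluate(q1, sk_data)]


data = {("_:b1", "ex:p", "ex:o1"), ("_:b2", "ex:p", "ex:o2")}
query = ("?this", "ex:p", "?o")

naive = evaluate(("_:b1", "ex:p", "?o"), data)   # bnode acts as a wildcard
exact = prebound(query, "?this", "_:b1", data)   # bnode matches only itself
```

In this toy setting, the naive substitution matches both triples, while the skolemized evaluation returns only the object attached to _:b1.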

Let me add that from an implementation point of view, creating a "View" 
graph that replaces certain nodes is not a big issue. One would 
basically create a new logical dataset in which the named graph is 
substituted with a graph that has additional "added" triples and filters 
out certain "deleted" triples.
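
As a rough sketch of such a "View" graph, assuming triples are plain Python tuples and graphs are sets (the class and function names below are hypothetical and do not refer to any particular API):

```python
class ViewGraph:
    """Read-only view over a base graph with extra and hidden triples."""

    def __init__(self, base, added=(), deleted=()):
        self.base = base
        self.added = set(added)
        self.deleted = set(deleted)

    def __contains__(self, triple):
        if triple in self.added:
            return True
        return triple in self.base and triple not in self.deleted

    def __iter__(self):
        yield from self.added
        for triple in self.base:
            if triple not in self.deleted and triple not in self.added:
                yield triple


def replace_node(base, old, new):
    """Build a view in which every occurrence of node `old` appears as `new`."""
    deleted = {t for t in base if old in t}
    added = {tuple(new if x == old else x for x in t) for t in deleted}
    return ViewGraph(base, added, deleted)


base = {("_:b1", "ex:p", "ex:o"), ("ex:s", "ex:q", "ex:o")}
view = replace_node(base, "_:b1", "urn:skolem:0")
```

The base graph is left untouched; only the view reports the substituted node.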

As I said in the call, we can spend any number of resources on this 
topic, depending on how detailed we want it to be. But we seem to be in 
the middle of more important discussions right now, and I personally 
don't have the time to look into all these details. Meanwhile, we'll 
have to leave the ticket open.


>> Holger
> peter

Received on Friday, 11 March 2016 02:05:19 UTC