Re: General confusion on lists and implication

Thanks Doerthe, that helps. ...
> On Aug 12, 2020, at 3:29 AM, Doerthe Arndt <doerthe.arndt@ugent.be> wrote:
> 
> Dear Gregg, 
> 
> I will try to answer some questions here.
> 
>> I’m hoping for some insight into the defined semantics of log:implies and various list tests I’m working through. (Sorry, long email).
>> 
>> Lists in N3 are a bit challenging (to me, anyway), as they have a dual nature of first-class resources and first/rest ladders. In the first case, when given a list in the antecedent, it can be considered to be a constant of that formula; in the second, a pattern of existential variables connecting to resources in the knowledge-base.
>> 
>> The working definition of log:implies I’m working off is the following:
>> 
>> Logical implication is the relation between the antecedent (subject) and conclusion (object) of a rule. The application of a rule to a knowledge-base is as follows. For every substitution which, applied to the antecedent, gives a formula which is a subset of the knowledge-base, then the result of applying that same substitution to the conclusion may be added to the knowledge-base.
>> 
>> (The definition in Notation 3 Logic is somewhat different).
> Depending on how "substitution" is meant, the above is not that different from N3's semantics.
> 
That statement essentially comes from the vocabulary definition of log:implies.
> If you assume universals to be quantified at the top level, you can do what you say here. It is not clear to me what you do with blank nodes.
> 
Universals are either quantified in the formula containing @forAll, or in the parent of the formula containing a quickvar (e.g. ?x). Existentials in the form of a blank node are quantified in the formula in which they appear, distinct from synonyms in other formulae, which I believe is our semantics. I calculate the solutions of a formula using the top-level formula as the knowledge-base, and construct a query using any statements including variables (existential or universal) as the patterns in a Basic Graph Pattern. When run against the knowledge-base, this results in zero or more solutions where the variables contained in those patterns are bound to values from the knowledge-base. In the case where existential variables are not bound, they revert to blank nodes. At least, this is my current implementation.

The solutions of that BGP are those in which all universal variables appearing in the patterns are bound; the solution set is empty if any remain unbound. The query may also yield a single solution with no bound variables at all, which is distinct from no solutions. If there are any solutions in the result, the consequent is evaluated using the universal bindings from each solution.

Extracting the statements not including variables, I believe, is equivalent to considering those statements as being part of the knowledge-base, which is why I get the results I do.
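
To make that concrete, here is a rough sketch of how I evaluate a rule (names such as :alice and :bob are made up purely for illustration):

    # knowledge-base (top-level formula)
    :alice :knows :bob .
    :bob a :Person .

    # rule being evaluated
    { ?x :knows ?y . ?y a :Person } => { ?x a :SocialAgent } .

    # the antecedent's variable-containing statements become the BGP
    # { ?x :knows ?y . ?y a :Person }; run against the knowledge-base it
    # yields one solution (?x = :alice, ?y = :bob), so the instantiated
    # consequent :alice a :SocialAgent may be added.
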
> If your substitution is only for non-nested blank nodes, you are again in line with our definition. For built-in functions, you can now imagine that there is an infinite number of additional triples implicitly stated in the knowledge base, for example
> 
> (1 1) math:sum 2.
> (1 2) math:sum 3.
> (1 3) math:sum 4.
> etc.
> 
> and also
> 
> 1 list:in (1 2 3 4 5).
> 2 list:in (1 2 3 4 5).
> etc.
> 
Yes, but for an implementation, considering the knowledge-base to contain an infinite number of triples isn’t too useful :P. I think including the non-variable statements from the formula being evaluated as part of the knowledge-base has the same effect; this may extend to statements including only existential variables, even if those variables are not bound during evaluation (_:someone a :Person would seem to be true even if there is no specific person identified in the knowledge-base).

>> So, this would imply that, for an antecedent containing a list, that list must also be in the knowledge-base; but this may be a misunderstanding of exactly what constitutes the knowledge-base, and is inconsistent with tests.
> OK, yes, here is the difference. As stated above, I would consider built-ins in a certain sense as some extra true triples in the knowledge base which you cannot all explicitly state since there is an infinite number of true triples (as you easily see with my sum example).

I don’t quite understand. I think you’re saying that, in the examples above, because the lists, values, and operands are all constants used within the list:in built-in, the triples are true. What if the triple :Superman :is :ClarkKent appears in a formula? Is it not also part of the knowledge-base, at least for the purpose of evaluating that formula?

>> 
>> One test, taken from swap/test/list/last.n3, includes the following implication:
>> 
>> { 1 list:in  (  1 2 3 4 5 ) } => { test4a a SUCCESS }.
>> 
>> However, the containing (base) formula does not, itself, contain the list (  1 2 3 4 5 ). This could be interpreted as 1 list:in  (  1 2 3 4 5 ) being a constant, whose truth can be established without examining the knowledge-base, but that fails the requirement that (after substitution) the antecedent formula be a subset of the knowledge-base. Alternatively, constants in the antecedent (or any patterns which become constant after replacing variables with those already bound) might inherently be part of the knowledge-base, but this would seem to imply some merge process that I haven’t seen referenced.
> So, I would say: everything which is explicitly stated in the knowledge base is true (as you also state) and for built-in functions everything which can be evaluated as true should also be considered true. 
> 
> Your remark reminds me that I actually should also start to write down the semantics of built-ins; currently they are not in our semantics.
> 
Okay, that makes sense.
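
So, for example, I would expect a rule such as the following (purely illustrative) to fire even though no math:sum triple is explicitly stated in the knowledge-base, because (2 3) math:sum 5 counts among those implicitly true triples:

    { (2 3) math:sum ?x } => { :answer :is ?x } .

    # expected result:
    #     :answer :is 5 .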

>> Alternatively, the following is somewhat equivalent:
>> 
>> { 1 list:in  [rdf:first 1; rdf:rest ( 2 3 4 5 )] } => { test4a a SUCCESS }.
>> 
>> Except that here, the implicit blank node is considered as an existential variable, and so must match something in the knowledge-base, and is not itself a fact.
> This is not true. If you have a blank node here, it indeed needs to be understood as an existentially quantified variable. If you were to add explicit quantification, you would get:
> 
> { ∃ :x. 1 list:in  :x. :x rdf:first 1; rdf:rest ( 2 3 4 5 ) } => { test4a a SUCCESS }.
> 
> This is actually the same as saying
> 
> ∀ :x. { 1 list:in  :x. :x rdf:first 1; rdf:rest ( 2 3 4 5 ) } => { test4a a SUCCESS }.
> 
> And this can be translated back into 
> 
> { 1 list:in  ?x. ?x rdf:first 1; rdf:rest ( 2 3 4 5 ) } => { test4a a SUCCESS }.
> 
> which gives you the exact same problem as before.
> 
Okay, that invalidates my thesis from above, that unbound existential variables can be treated as blank nodes when evaluating the formula. My implementation doesn’t create separate resources for lists outside of the evaluation context, so I need to infer the semantics of the first/rest chains from their content, but I am aided by my interpretation of unbound existential variables.

I’ve worked with always maintaining lists as first-class resources but it creates a number of problems that would need to be resolved.
> Your question now makes me wonder how your substitution works (as mentioned above, it could be in line with our semantics definition). So, an example:
> 
> How would you deal with
> 
> {_:x a :Unicorn}=> {:Unicorn a :existingSpecies.}.
> 
> Does your rule trigger for
> 
It triggers with nothing in the knowledge-base, due to my treatment of existential variables as stated above.
> 1.  :tom a :Unicorn.
> 
Same thing: :Unicorn a :existingSpecies .
> 2. _:x a :Unicorn.
> 
It would bind to the _:x, but that’s not obvious from the result: [a :Unicorn] .
> 3. _:y a :Unicorn.
> 
Same result.
> Your answer would help me to understand your intuition here. Maybe I can then also better explain in the spec how to deal with blank nodes (in addition to the definition).
> 
Probably the area for me to focus on is how to deal with blank nodes in the antecedent which do not bind to entries in the knowledge-base.
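
To summarize how my implementation currently behaves on your example (this describes my current behaviour, not a claim about the intended semantics):

    # rule
    { _:x a :Unicorn } => { :Unicorn a :existingSpecies } .

    # empty knowledge-base   ->  :Unicorn a :existingSpecies .
    # :tom a :Unicorn .      ->  :Unicorn a :existingSpecies .
    # _:x a :Unicorn .       ->  :Unicorn a :existingSpecies .
    # _:y a :Unicorn .       ->  :Unicorn a :existingSpecies .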

>> Another way to test this would be with the following:
>> 
>> (  1 2 3 4 5 ).
>> { 1 list:in  (  1 2 ?three 4 5 ) } => { :result :contains ?three }.
>> 
>> Should this fire :result :contains 3. ? Is the list a constant including a variable, or is it a pattern to match a sequence of statements in the knowledge-base?
> The problem with your example case is that  (  1 2 3 4 5 ). is not a triple so it has no meaning (according to our semantics but also following Cwm's and EYE's interpretation).
> 
Using the first/rest ladder, it is actually ten statements, but as you point out, that’s not really relevant.
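
For reference, the ladder I had in mind expands to something like the following (the blank node labels are arbitrary):

    _:l1 rdf:first 1; rdf:rest _:l2 .
    _:l2 rdf:first 2; rdf:rest _:l3 .
    _:l3 rdf:first 3; rdf:rest _:l4 .
    _:l4 rdf:first 4; rdf:rest _:l5 .
    _:l5 rdf:first 5; rdf:rest rdf:nil .
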
> But what would indeed work with these two reasoners is:
> 
> :greggsList :is (  1 2 3 4 5 ).
> {:greggsList :is ?list. ?list log:equalTo (  1 2 ?three 4 5 ). 1 list:in  (  1 2 ?three 4 5 ) } => { :result :contains ?three }.
> Here, both reasoners give you
> 
>     :result     :contains 3 .
> 
> This would also be in line with our semantics (insofar as it is defined). But I agree that this example is not that interesting any more since it would now also work without the list:in.
> 
> Another example could be 
> 
>     { 1 list:in  (  1 2 ?three 4 5 ) } => { :test :is :successful}.
> 
> Here, EYE indeed derives
> 
>     :test :is :successful.
> 
> This is also what I would expect according to the semantics; Cwm does not give a result, and I suspect that this is because the ?three is not specified.
> 
>> 
>> I have yet to get to a list:append implementation, which would certainly be something that is intended to construct information not already contained in the knowledge-base.
> I would again say that all the true triples are implicitly in the knowledge-base, but for the case of append I would also recommend having a look at Prolog. At least from my own experience, I would say that append is one of the most important predicates there, so you will find a lot of explanations of how implementations deal with that function.
> 
Yes, I should look more at Prolog, but I think our data and processing model should probably stand on its own. For me, the issue is where the triples constituting the appended list actually go. With the list-as-resource interpretation it's not such an issue, as you can just bind the entire list to the variable, which is pretty much the way I expect to handle it; but the resource vs. first/rest ladder views complicate things, and complicate our relationship with RDF in general. Certainly an area for consideration in the semantics.
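
For example, I would expect something like the following (purely illustrative) to work, and the question is where the rdf:first/rdf:rest triples for the resulting list actually live:

    { ( (1 2 3) (4 5) ) list:append ?l } => { :appended :is ?l } .

    # expected result (the subject of list:append being a list of lists):
    #     :appended :is (1 2 3 4 5) .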

>> Another hypothesis I’ve had some success with is not to treat the resulting antecedent formula as something which must be a subset of the knowledge-base, but to instead see if all of the universal variables contained in the antecedent are bound after evaluation. This (mostly) has the same effect, but allows the antecedent to contain constant statements that aren’t, themselves, in the parent formula.
> I am not entirely sure whether I understand you here. Do you think that corresponds to how I explained it above? From the logical point of view, the rule should fire for all formulas which are known to be true. We know that everything which is explicitly stated in the knowledge base is true, and from the definitions of the built-in functions we know which built-in triples are true.
> 
See my elaboration above. Right now, it works, but as I get deeper into my implementation, I may run into roadblocks.
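
As a concrete illustration of that heuristic (again with made-up names):

    :a :b :c .

    { :a :b ?x . 1 math:lessThan 2 } => { :result :is ?x } .

    # "1 math:lessThan 2" is a constant statement that is not in the
    # knowledge-base but is true; every universal (?x) is bound after
    # evaluation, so under this reading the rule fires and
    #     :result :is :c .
    # may be added.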

>> I think there’s a similar line of reasoning for formulae themselves.
>> 
>> For my part, it would be great if the semantics document clarified this in a way which is helpful to developers such as myself. Much of my implementation leverages the BGP matching and built-in functions of my SPARQL implementation, but it’s been a bit of trial and error for me thus far.
> If you help me with the clarifications, I am willing to write it down. I just really need your help to understand possible problems, since I come from the Prolog world, which also means that I have an intuition for most of the functions and no "fresh" view any more.

Of course, I’m happy to help. I think we need to strike a balance between the theoretical and practical descriptions, along the lines of the SPARQL Algebra, which is already pretty abstract.

Thanks for your considered responses. This has all been quite a learning experience for me, and I have a long way to go.

Gregg
> 
> I hope that at least some of my answers help you.
> 
> Kind regards,
> Doerthe
> 
> -- 
> Dörthe Arndt
> Researcher Semantic Web
> imec - Ghent University - IDLab | Faculty of Engineering and Architecture | Department of Electronics and Information Systems
> Technologiepark-Zwijnaarde 122, 9052 Ghent, Belgium
> t: +32 9 331 49 59 | e: doerthe.arndt@ugent.be

Received on Wednesday, 12 August 2020 19:04:31 UTC