New proposal for the semantics

Hi all,

as discussed during yesterday's call, here is the new proposal for N3's 
semantics that I have been working on. Rather than delaying the 
discussion until I find the time to put everything in a proper document, 
here is a link to my scribbling, hoping that it will be legible enough.

https://h.champin.net/nextcloud/index.php/s/QR9pxs96daamFoo/download


Below is a high level summary of the proposal, followed by some 
discussions and examples not present in the PDF. Apologies in advance 
for a long email.


      Executive summary

N3 graphs are now made of IRIs, Literals and Variables. Variables are 
not intrinsically quantified -- their quantification (and the associated 
scope) is determined by their use.

* by default, they are considered as existentially quantified, just like 
blank nodes in RDF (simple interpretations, p2 in the PDF) -- actually, 
variables occurring /only/ in quoted graphs are more like free variables 
than existentials.

* log:implies "interprets" the variables in its subject graphs as 
universally quantified. Variables that occur only in the object graph 
are considered as local existentials. This is captured by a semantic 
extension ("rule interpretation", p7 of the PDF)

* other builtins (e.g. e:forAll) may interpret variables in their 
subject and object in yet other ways, captured by additional semantic 
extensions (I have not worked on that yet)

This "schizophenia" of variables can have strange effects in a few 
corner cases (illustrated below), but my intuition is that those are 
exotic enough to not be a problem in practice.

Variables in concrete syntaxes can be represented by quick vars (?x), 
blank node labels (_:x) or anons ([...]). I would suggest that _:x and 
?x are considered as different variables. This becomes important in the 
last section of this email.


      Variables in facts / variables in rules

Consider the following graph, where the variable _:b1 occurs both at the 
top level and in a quoted graph.

     :alice :mother _:b1 .
     :alice :says { _:b1 a :NicePerson }.
     :bob  :mother _:b2.
     :bob  :says { _:b3 a :NicePerson }.
     { _:x :mother _:y; :says { _:y a :NicePerson } } => { :answer :is 
_:x }.

This would give answer :alice → a variable in quoted graphs "corefers" 
with the same variable outside any quoted graph.
However, variables in rules are treated differently:

     :alice a :Woman; :mother _:x.
     :alice :says { _:x a :NicePerson }.
     { _:x a :Woman } => { _:x a :Person }.

|=

     :alice a :Woman.

When the rule is fired, _:x is no longer considered as referring to 
Alice's mother, but as a universal variable, which can be bound to 
anything matching the condition (in that case, :alice).

Note that the same happens with my previous proposal (adopted by Dörthe 
yesterday, and implemented in EYE) → _:x in a rule is not the same as 
_:x outside a rule (although there are also subtle differences).

A consequence is that only named resources (i.e. IRIs and Literals) can 
be directly referred to by rules. One can look at it as if rules 
operated from "outside" the graph on which they are applied. Similarly, 
in the example above, another file can refer to :alice, but not to her 
mother.


      Dörthe's example 1

   {_:x a :Penguin}=>{_:x a :Bird}.
   {_:y =>_:z}=>{_:y a :Premise. _:z a :Conclusion }.

|=

    {_:x a :Penguin} a :Premise.
    {_:x a :Bird} a :Conclusion.

Here, when the second rule is fired, _:y and _:z are interpreted as 
universals (and matched against the two quoted graph), but _:x in this 
context is just a free variable inside a quoted graph.


      Dörthe's example 2

     _:x a :Penguin.
     :bob :thinks {_:x a :Bird}.
     {:bob :thinks _:y}=>{{:bob :is :wise}=>_:y}.

|=

     {:bob :is :wise} => {_:x a :Bird}.

My understanding of this example is that, if we further know that :bob 
:is :wise, then we would like to conclude that Bob's thought is true, 
i.e. that /the penguin _:x/ is indeed a :Bird. Unfortunately, this is 
not what happens in this new proposal. Just as in the example of Alice's 
mother above, the _:x in the newly generated rule will not be related to 
the penguin _:x when the rule is fired. Instead, a new blank node would 
be generated.


      Pathological example

     _:x a :Unicorn.
     { _:x a :Cat } => { _:x a :Animal }.
     { { _:s _:p _:o }=>_:c } => { _:s a :RuleSubject }.

|=

     _:x a  :RuleSubject, # derived from the rule above
              :Unicorn. # was alteady present in the graph

Here, it might be surprising that the _:x at the top level (referring to 
some unicorn) gets mixed with the _;x in the rule body (matching any 
cat). Didn't we say above that variables in rules are independent from 
their use outside the rule?

In fact, they are different only when the rule is fired. Here, the 2nd 
rule is fired, and its premise matches the 2nd triple of the graph, 
which has predicate log:implies but is not being used as a rule right 
now. So its subject and object are simple quoted graphs, with their 
variables "able to corefer" with variables at the top level.

I would argue that this example is pathological because there is not 
good reason to use the same variable in the rule and in some facts. It 
is allowed, it "works": the 2nd rule would fire correctly if we added  
":felix a :Cat." in the graph, for example. But it has no added value 
compared to using a distinct variable.

Encouraging people to use blank node labels only in facts, and quick 
vars only in rules, should keep them away from this kind of pathological 
examples. And since currently people do it because they /have/ to, this 
looks like a reasonable guideline.

     _:x a :Unicorn.
     { ?x a :Cat } => { ?x a :Animal }.
     { { ?s ?p ?o }=> ?c } => { ?s a :RuleSubject }.

|=

     ?x a :RuleSubject.  # is derived
     _:x a :Unicorn.  # _:x the unicorn is a *different* variable


That's all follks. Thanks for reading all this! Hope it makes some sense.

   pa

Received on Tuesday, 9 November 2021 21:13:51 UTC