In defence of mathematical notations (Was: Re: blank node scope - ISSUE-107 - resolve as in Semantics - hopefully on 20 March) from Antoine Zimmermann on 2013-03-14 (public-rdf-wg@w3.org from March 2013)

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Thu, 14 Mar 2013 18:45:56 +0100
To: public-rdf-wg@w3.org
Message-ID: <51420CD4.6090809@emse.fr>

To paraphrase Pat Hayes's "In defence of logic" (IJCAI 1977), I'd say 
that modern mathematical notation is the most successful language ever 
developed to express formal concepts.

I have asked for a precise, mathematical account of what is being 
defined in the "scope" section of RDF Semantics, and what I get is the 
long English prose below. Let me examine the formalisation given in the 
middle of the text.


So, we have a set SCOPE of scopes and the infinite set B of bnodes. 
There is a mapping sc from bnodes to SCOPE.

Let us write bn(G) for the set of bnodes in a graph G.

An RDF graph G is a set of RDF triples such that there exists a scope s 
such that for every bnode b in bn(G), sc(b) = s. Whenever bn(G) is not 
empty, this scope is necessarily unique (from your axiom 1), so we can 
call it the scope of the graph, and denote it sc(G).
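
The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: bnodes are modelled as (label, scope) pairs, IRIs and literals as plain strings, and the names BNode, bn, sc and graph_scope are hypothetical helpers, not anything defined in Concepts or Semantics.

```python
from typing import NamedTuple, Optional, Set, Tuple, Union


class BNode(NamedTuple):
    """A blank node modelled as a (label, scope) pair (an assumption of this sketch)."""
    label: str
    scope: str


Term = Union[str, BNode]          # IRIs and literals are plain strings here
Triple = Tuple[Term, Term, Term]


def sc(b: BNode) -> str:
    """The mapping sc from bnodes to SCOPE."""
    return b.scope


def bn(graph: Set[Triple]) -> Set[BNode]:
    """bn(G): the set of bnodes occurring in the graph G."""
    return {term for triple in graph for term in triple if isinstance(term, BNode)}


def graph_scope(graph: Set[Triple]) -> Optional[str]:
    """Return sc(G), or None when G contains no bnodes.

    Raises ValueError when the bnodes of G lie in more than one scope,
    that is, when G fails the single-scope condition.
    """
    scopes = {sc(b) for b in bn(graph)}
    if len(scopes) > 1:
        raise ValueError(f"bnodes span several scopes: {sorted(scopes)}")
    return next(iter(scopes), None)
```

For a set of triples with no bnodes the single-scope condition holds vacuously for every scope, which is why the sketch returns None in that case rather than picking one.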

Then there is this new concept "copy", which appears neither in Concepts 
nor in Semantics. How are the readers supposed to understand anything if 
Concepts says one thing, Semantics says another thing, and the emails 
exchanged by the WG say yet another thing?

In any case, this notion of copying then taking the union is exactly 
what is defined in RDF Semantics 2004, simply formulated with other 
words. It is the operational translation of "standardising apart".

But this is not the union of RDF graphs, so either you should provide a 
different formalisation or you should revise the definition of merge.
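
One possible reading of "copy, then take the union" can be sketched as follows. It is a minimal illustration, not the WG's definition: bnodes are again modelled as (label, scope) pairs, the target scope is assumed to be fresh, and BNode, bn and merge are hypothetical names.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple, Union


class BNode(NamedTuple):
    """A blank node modelled as a (label, scope) pair (an assumption of this sketch)."""
    label: str
    scope: str


Term = Union[str, BNode]
Triple = Tuple[Term, Term, Term]


def bn(graph: Set[Triple]) -> Set[BNode]:
    """The set of bnodes occurring in a graph."""
    return {term for triple in graph for term in triple if isinstance(term, BNode)}


def merge(graphs: Iterable[Set[Triple]], target_scope: str) -> Set[Triple]:
    """Copy every input graph into target_scope, then take the set union.

    Each distinct source bnode is mapped to exactly one fresh bnode in
    target_scope, so the renaming is 1:1 and a bnode shared by two input
    graphs stays shared in the result.
    """
    graphs = list(graphs)
    source_bnodes = sorted({b for g in graphs for b in bn(g)})
    if any(b.scope == target_scope for b in source_bnodes):
        raise ValueError("target_scope must differ from every source scope")
    fresh: Dict[BNode, BNode] = {
        b: BNode(f"b{i}", target_scope) for i, b in enumerate(source_bnodes)
    }
    return {
        tuple(fresh[term] if isinstance(term, BNode) else term for term in triple)
        for g in graphs
        for triple in g
    }
```

With g1 in scope "s1" and g2 in scope "s2", the plain set union g1 | g2 contains bnodes from two scopes and so fails the single-scope condition, whereas merge([g1, g2], "s3") has all of its bnodes in "s3". The two operations therefore return different sets of triples.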

In the end, I still do not understand what the proposal is. Peter has 
proposed that the current text in Semantics is the proposal that we 
should accept next week. If you yourself cannot formalise it correctly, 
how will the WG be able to assess it?


Still, could you say what problem this proposal is addressing? What 
should the impact on conformant implementations be? Are there any 
concrete consequences for how implementations deal with the semantics?

What I think scope is supposed to address is giving guidelines for 
applications to decide what to do with a bunch of triples they are 
confronted with: should they treat the bunch as a single set, or 
should they separate the triples into several sets (for instance, because 
the triples come from different sources)? When the triples are partitioned, 
there is no problem with the treatment of bnodes as in RDF 2004.



AZ.



On 14/03/2013 04:29, Pat Hayes wrote:
>
> On Mar 13, 2013, at 11:32 AM, Antoine Zimmermann wrote:
>
>> Please, could you show the mathematical definitions of all this. I
>> do not understand what is a scope with the text of Semantics.
>
> Make sure you have the latest version, as the text was tweaked last
> night to improve the clarity. In the form given there, it uses the
> idea of a syntactic scope for bnodeIDs. The notion of syntactic scope
> (the scope of a local variable or a local identifier, or a bound
> variable, in logic) is surely a common idea for any logician or
> computer scientist. (Which is why I thought it might be easier for
> most readers to define it this way.)
>
> I will try to review the current proposal as succinctly as I can, but
> it does require some care to say it correctly, keeping the two levels
> distinct.
>
> 1. An RDF graph is a set of triples. (2004)
>
> Implicit in this is that *any* set of triples can be viewed as being
> a graph. This includes 'silly' sets, such as a set of triples
> containing just one triple chosen at random from every RDF/XML
> document ever published, or the set of triples containing a URI which
> rhymes with "bong" when spoken in Icelandic. As this illustrates, not
> all *sets* of triples are RDF graphs that correspond to any actual
> RDF source or RDF document.
>
> 2. RDF graphs can be expressed using an RDF surface syntax. (2004)
>
> 3. In such a surface syntax, blank nodes may be represented by blank
> node identifiers (bnodeIDs) (2004)
>
> 4. Any RDF surface syntax MUST define the scope of bnodeIDs in that
> syntax. (New, but in fact almost universally assumed in practice
> since 2004.)
>
> 5. We require that (all the bnodeIDs used in defining the triples in)
> any graph described by such a surface syntax MUST be contained within
> a single scope. (New, but in fact capturing how RDF graphs are
> treated since 2004.) (But scopes may extend beyond a single graph, as
> they do in datasets.)
>
> 6. Two graphs described by documents with different scopes, or from
> sources defining different scopes, CANNOT share a blank node. (New,
> but often assumed since 2004, even if using a different
> terminology.)
>
> 7. The set of all triples in a given scope is called a scoped graph.
> (New definition)
>
> 8. Observation. In actual usage, what people mean when they say "RDF
> graph" is almost always a scoped graph, that is, a graph whose
> triples are described fully in a document or datastructure or source
> which defines its own bnodeID scope, so that bnodeIDs are local names
> in that document or datastructure or source. Any graph described by
> an RDF/XML or NTriples document, for example, is a scoped graph. In
> some cases, people refer to graphs which are subgraphs of a scoped
> graph. I do not know of any examples of anyone needing to consider a
> graph that is not a subset of a scoped graph.
>
> Now, you asked for a "mathematical" account of bnode scoping, and
> what I have given you above is an account which refers to syntactic
> matters in a surface syntax. Perhaps you don't feel this is
> sufficiently mathematical. OK, I can do it purely mathematically,
> entirely at the abstract level, if you prefer. (This is taken from
> http://www.slideshare.net/PatHayes/blogic-iswc-2009-invited-talk,
> starting around slide 16.)
>
> We introduce a set of things called bscopes, and a relation called
> "in" between bnodes and bscopes. (This is a different notion of scope
> than the one I have been using until now, though they are very
> closely related. In the ISWC talk, I called them 'surfaces'.) Every
> bnode is in exactly one bscope (this is the first axiom). An RDF
> graph is a set of triples **such that every bnode in the set is in a
> single bscope** (that is the second axiom), and we can then say that
> the graph is in the bscope. (This sounds like it is an extra
> condition on the 2004 graph model, but it's not, since the 2004
> version simply does not mention bscopes.) We allow more than one
> graph to be in a bscope, but not for one graph to be split across
> bscopes. Two graphs in the same bscope might share a bnode, of
> course. The truth conditions refer to mappings on the bnodes in a
> bscope, as you would expect.
>
> The definition of merge in this model is, we make copies. A *copy* of
> a graph G is an equivalent graph G' in a different bscope. The merge
> of a set S of graphs is a graph comprising copies of all the graphs
> in S, all in a single bscope, with a 1:1 mapping between bnodes in S
> and bnodes in the merge. That's it. We can define scoped graph and
> complete graphs just as before, in the obvious ways.
>
> The connection between syntactic scopes and bscopes is that every
> syntactic scope for bnodeIDs is required to define a single bscope
> for the blank nodes identified by the bnodeIDs in the scope. In fact,
> you could (Richard's idea) *define* bnodes to be pairs of a bnodeID
> and a bscope, and then show that this satisfies the axioms; but
> that's not actually necessary, and it might be confusing. (Though it
> does show that the axioms can be satisfied, if that needs showing.)
>
> (I actually like the bscope idea better, but it would require us to
> slightly tweak the definition of RDF graph, which I suspect will be
> too large a pill for the WG to swallow, which is why I haven't tried
> to get them to swallow it.)
>
> Detailed responses to your email below, in-line.
>
>
>> I can see several interpretations:
>>
>>
>> 1) there is a mapping s from the set of all blank nodes to the set
>> of scopes (and what's a scope is not specified beyond that there is
>> a set of them). So, given a bnode b, I can say what's its scope by
>> s(b).
>
> Yes. That is another way to express the bscope idea, above. (b in c)
> iff s(b)=c
>
>> I am very much against this design
>
> Can you say why? As it (1) requires no changes to the 2004 semantics
> (extensions, but no changes) (2) completely solves the issue we have
> with merging vs. unioning (3) does not change any entailments or
> truth-conditions (4) is very easy to describe and (5) apparently
> conforms better to the way RDF is actually used in practice, I would
> not be willing to give it up without seeing a very convincing
> argument against it. Your being against it does not, in itself,
> comprise such an argument.
>
>> , but it's not clear that the ED of RDF 1.1 Semantics is rejecting
>> this one (especially given the remarks that Pat made during
>> previous discussions on the topic). I would object formally to such
>> a design.
>
> Do you have any technical objections? Can you say why you would
> object formally to this design?
>
>> 2) scopes form a partition of the RDF triples, so a triple belongs
>> to a single scope. A set in the partition is a complete graph. The
>> problem is that the union of two different complete graphs is not a
>> complete graph.
>
>> I don't like this design at all, although it is already much better
>> than the first one.
>
> Again, can you say why?
>
>> 3) a scope corresponds to an RDF graph, and scopes can overlap
>
> No, that completely throws away the entire point of having a scope.
> In any case, scopes *don't* overlap. If they did, there would be no
> way to know how to interpret a local variable.
>
>> (mathematically, there is a mapping M from the set of scopes to the
>> set of graphs). A graph in a scope is a scoped graph (or
>> mathematically, there exists a scope s such that M(s) contains the
>> graph). The set of triples in the graph of a scope form a complete
>> graph (M(s) is a complete graph). Possibly, the set of complete
>> graphs is closed under set union (so that the union of two complete
>> graphs is still a complete graph). This would be much better, yet
>> not completely up to my expectations
>
> Can you say what it is that you expect here?
>
>> , but there are indications that the chosen design in the current
>> ED is not this one.
>
> Indeed not.
>
>> There are probably other ways to interpret the current text.
>
> Have you got the newest version? I find it hard to see how this text
> can be understood in any other than the intended way.
>
>> I would be curious to know what would be your respective
>> formalisation, Peter and Pat, if you had to write it independently
>> of one another. I had the impression, reading some of your emails,
>> that your understanding of scope was different.
>>
>>
>> In any case, I fail to understand why scope should have any
>> consequences on the truth of a set of triples.
>
> It doesn't. But it does provide a natural extent to define the
> existential bnode mapping on. A bnodeID is now exactly like an
> existential variable bound by a quantifier which extends over the
> scope (or, if you prefer, the bnode is the quantified variable,
> extending over the bscope; although this is a bit problematic, and
> bnodes don't have any lexical form to bind. The best way to map
> abstract bnode syntax to logic is by using Peircean graphical
> syntax.) Which is exactly the intent of the original RDF design, in
> fact, but we couldn't state it with this degree of precision at the
> time.
>
>> Thus my plea to revert to the semantics of bnodes as in Semantics
>> 2004.
>
> There is no change to the truth-conditions of a set of triples. But
> we do require that the set is (described by a document all of whose
> bnodeIDs are inside a single scope) (In a single bscope), in order to
> apply the bnode semantic rules.
>
> Pat
>
>
>>
>> If scope impacts the semantics at all, then there should be a
>> separate definition of the truth of a scoped graph, as opposed to
>> the truth of a set of triples. Something like:
>>
>> "A scoped graph G in scope s is true in interpretation I iff there
>> exists a mapping A from the bnodes in s to resources of I such that
>> [I+A](M(s)) is true, otherwise it's false."
>>
>> Note that A is independent of the graph G, it only depends on the
>> complete graph M(s).
>>
>>
>>
>> AZ
>>
>>
>> PS: this is a bit redundant with my complete review that will
>> follow (tomorrow I hope).
>>
>>
>>
>> On 13/03/2013 16:57, Peter Patel-Schneider wrote:
>>> ISSUE-107 concerns what to do with blank nodes.  This includes
>>> cross-graph blank node scopes.
>>>
>>> The current draft of Semantics
>>> https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-mt/index.html
>>> includes a solution to blank node scoping.  I propose that this
>>> solution be adopted by the WG as the result of the issue.
>>>
>>> The basic idea is to introduce the notion of a blank node scope.
>>> RDF graphs within a single scope can share blank nodes, graphs
>>> not in the same scope cannot!  This makes blank-node-renaming
>>> unnecessary during graph merging.  (Of course, in a surface
>>> syntax, different blank nodes may have the same b-node name, so
>>> these names may have to be changed when merging in a particular
>>> syntax.)
>>>
>>> For graphs not in the same scope, nothing changes.   For graphs
>>> in the same scope not sharing blank nodes, nothing changes. For
>>> graphs in the same scope sharing blank nodes, these blank nodes
>>> are interpreted uniformly.
>>>
>>> This last breaks a feature of RDF, that a set of graphs entails
>>> their merge.  There is a new definition in Semantics (complete
>>> graphs) that shows when this feature is retained.
>>>
>>>
>>>
>>> This solution needs changes in Concepts, minimally introducing
>>> the notion of a blank node scope, but maybe also talking about
>>> how blank node scope can be determined by different surface
>>> syntaxes.
>>>
>>> I suppose that there is also the issue of whether all the RDF
>>> graphs in a dataset are always in the same blank node scope.  It
>>> may be that it is not reasonable to say that this is the case,
>>> because datasets are already sometimes used as if they do not
>>> share blank nodes.
>>>
>>>
>>>
>>> peter
>>>
>>
>> -- 
>> Antoine Zimmermann
>> ISCOD / LSTI - Institut Henri Fayol
>> École Nationale Supérieure des Mines de Saint-Étienne
>> 158 cours Fauriel
>> 42023 Saint-Étienne Cedex 2
>> France
>> Tél:+33(0)4 77 42 66 03
>> Fax:+33(0)4 77 42 66 66
>> http://zimmer.aprilfoolsreview.com/
>>
>>
>
> ------------------------------------------------------------
> IHMC                     (850)434 8903 or (650)494 3973
> 40 South Alcaniz St.     (850)202 4416   office
> Pensacola                (850)202 4440   fax
> FL 32502                 (850)291 0667   mobile
> phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes
>
>
>
>
>
>
>

-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Thursday, 14 March 2013 17:46:35 UTC