Re: Three alternative approaches for fixing the blank node scope problem.

Ivan,


On 16/03/2013 13:35, Ivan Herman wrote:
> I am not claiming I understand all the details, I will have to think
> about it. However, as a first reaction, I do not like #2. As you say
> for option (3), a Dataset (and a TriG) file defines a, ehem, single
> scope for bnodes; we have indeed agreed that graphs in a dataset may
> share bnodes (and I remember when this first came up I could not
> really match this with the intuition I had based on the 2004
> Semantics).

The definition of RDF Dataset has no relationship with RDF Semantics. An 
RDF dataset is just a structure. We can define whatever structure we 
want, it does not have any relevance to RDF semantics.
For instance, I can define a concept of "programmed RDF graphs", which 
are pairs (R,J) where R is an RDF graph and J is a Java programme that 
can parse R into a Jena model. By this definition, nothing constrains 
two programmed RDF graphs to share or not share bnodes.

The RDF dataset in the current RDF 1.1 Concepts draft is just the RDF 
dataset of normative SPARQL. As in SPARQL, there is no constraint on the 
separation of bnodes between "named" graphs.

What we agreed on is that we will not add such a constraint, therefore 
"named" graphs of the same RDF dataset may or may not share bnodes. We 
also said that, if they do share bnodes, then the nodes are identified 
in concrete syntaxes with the same identifier. This again has no 
relevance to RDF semantics. It may have some relevance to certain 
*Dataset semantics*, but that's a different issue that we do not need 
to consider for REC now.
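
To make the structural point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the BNode class, the graph names <g1>, <g2>, <g3> and the triples are hypothetical, and a dataset is modelled as a plain mapping from graph names to sets of triples. It only shows that nothing in the structure forces "named" graphs to share, or not to share, bnodes.

```python
class BNode:
    """Hypothetical blank node: identity is object identity, not the label."""

    def __init__(self, label):
        # The label plays the role of the bnodeID used in a concrete syntax.
        self.label = label


def bnodes(graph):
    """Return the set of blank nodes occurring in a set of triples."""
    return {term for triple in graph for term in triple if isinstance(term, BNode)}


b = BNode("x")

# A dataset as a pure structure: graph name -> set of triples (illustrative data).
dataset = {
    "<g1>": {(b, "<p>", "<o1>")},
    "<g2>": {(b, "<q>", "<o2>")},           # shares the bnode b with <g1>
    "<g3>": {(BNode("y"), "<p>", "<o3>")},  # shares no bnode with the others
}

shared_g1_g2 = bnodes(dataset["<g1>"]) & bnodes(dataset["<g2>"])
shared_g1_g3 = bnodes(dataset["<g1>"]) & bnodes(dataset["<g3>"])

# The union of all the triples is itself just a set of triples.
union = set().union(*dataset.values())
```

In a concrete syntax, the shared node b would be written with the same identifier in <g1> and <g2>; that is a statement about the structure and the syntax only.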


> But if we accepted your option (2), this would mean
> something like having a containing graph containing all graphs in the
> dataset;

Well, such a graph exists by definition, unless we agree that some sets 
of RDF triples are not graphs. The idea behind (2) is simply to be able 
to group triples that we consider to be tied together by the way they 
concretely materialise their bnodes.


> I believe that would become very very confusing. So I would
> prefer not to go down that route.
>
> At first glance, my preference is (3) but I should go back and (try
> to) understand Antoine's objections...

Pat has indicated in one of his emails 
(http://lists.w3.org/Archives/Public/public-rdf-wg/2013Mar/0125.html) 
that there is just one way of interpreting his (now gone) proposal in 
RDF Semantics:

"""
On Mar 13, 2013, at 11:32 AM, Antoine Zimmermann wrote:
[...]
 > 1) there is a mapping s from the set of all blank nodes to the set of 
scopes (and what's a scope is not specified beyond that there is a set 
of them). So, given a bnode b, I can say what's its scope by s(b).

Yes. That is another way to express the bscope idea, above. (b in c) iff 
s(b)=c
[...]
 > There are probably other ways to interpret the current text.

Have you got the newest version? I find it hard to see how this text can 
be understood in any other than the intended way.
"""

So, it seems that Pat's intention, at least at the time he wrote this 
email, was to propose something like (1) below.
However, (1) is fundamentally flawed because all complete graphs are 
{xsd:integer}-inconsistent:

PROOF:
Let us consider a blank node b in a scope s (i.e., sc(b) = s). The 
triple (b, <p>, "a"^^xsd:integer) is in the scope s. This triple must be 
in the complete graph that contains the bnode b, by definition of 
complete graph. Since "a" is not in the lexical space of xsd:integer, 
the literal is ill-typed, so the complete graph having scope s is 
{xsd:integer}-inconsistent. As the choice of bnode is arbitrary, this is 
the case for all complete graphs.
QED.
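
The two steps of the proof can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the string "_:b" standing for a blank node, and the predicate <p> are hypothetical, and in_complete_graph encodes the reading of (1) used in the proof, where the complete graph for b contains every triple that mentions b.

```python
import re

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Lexical space of xsd:integer: an optional sign followed by decimal digits.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def is_ill_typed(literal):
    """True if an xsd:integer literal has a lexical form outside the lexical space."""
    lexical, datatype = literal
    return datatype == XSD_INTEGER and INTEGER_LEXICAL.fullmatch(lexical) is None


def in_complete_graph(triple, bnode):
    """Reading of (1) used in the proof: every triple mentioning the bnode belongs
    to the complete graph of that bnode (bnodes occur as subject or object)."""
    subject, _predicate, obj = triple
    return bnode in (subject, obj)


b = "_:b"  # stands for an arbitrary blank node
witness = (b, "<p>", ("a", XSD_INTEGER))

forced_in = in_complete_graph(witness, b)  # the triple cannot be left out
ill_typed = is_ill_typed(witness[2])       # and its literal is ill-typed
```

Both checks succeed for any choice of b, which is the content of the proof.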

Now, I have written a proposal that tries to reconcile what I think 
is right for bscope with much of what Pat has proposed. I hope it will help.

I will send it by email, but also put it on a wiki page so that it is 
easier to check for people who do not follow the thread.



AZ

>
> Thanks
>
> Ivan
>
>
> On Mar 15, 2013, at 23:39 , Pat Hayes<phayes@ihmc.us>  wrote:
>
>> As this debate has gotten very intense and active, and it is just
>> barely possible that some folk might not be following every little
>> detail, and there have now been several proposals, let me summarize
>> the proposals here. These are *alternatives*, note. We don't need
>> to do more than one of them. They are all formally equivalent.
>> They all involve defining at least one new notion, indicated
>> **thus** which probably should be in Concepts. Please keep in mind
>> the distinction between blank nodes and bnodeIDs when reading
>> this.
>>
>> 1. bscopes. (Idea developed in my ISWC 2009 talk, the simplest and
>> most 'abstract' of the three.)
>>
>> We introduce the idea of a **bscope**, and a relation **in**
>> between blank nodes and bscopes. Every blank node is in exactly one
>> bscope. (Or, we define a **function** sc( ) from blank nodes to
>> bscopes.) A subgraph is *complete* when, if it contains a blank
>> node, it contains all the triples which contain that blank node.
>>
>> An RDF graph is a set of triples **such that every blank node in
>> the set is in a single bscope**. Every set of RDF triples is
>> graph-equivalent to an RDF graph as defined here, so this is no
>> limitation upon how RDF graphs can be formed.
>>
>> Surface RDF syntaxes are required to define how their bnodeIDs map
>> into bscopes, ie exactly when two occurrences of a bnodeID identify
>> a single blank node, and exactly when two identified blank nodes
>> are in the same bscope.
>>
>> The merge of a set of graphs is a graph comprising the union of all
>> the triples in a set of graph-equivalent graphs, with blank nodes
>> from a new bscope. The merge lemma holds for sets of complete
>> graphs.
>>
>> -----------
>>
>> 2. containing graphs. (Idea in recent email.) This does not mention
>> scopes explicitly.
>>
>> An RDF graph is a set of triples.  Some RDF graphs are designated
>> to be **containing graphs**. Two different containing graphs cannot
>> contain (triples which contain) the same blank node. (But a
>> subgraph of a containing graph may share a blank node with its
>> containing graph. So if two graphs share a blank node, they must be
>> both subgraphs of the same containing graph.)  Intuitively, a
>> containing graph is all the triples which contain a given
>> collection of blank nodes. *Complete* graphs are defined as above.
>>
>> **Every RDF graph is a subgraph of a unique containing graph**
>> (which might be the graph itself, of course.) Any set of triples,
>> even ones which cross containing-graph boundaries, is
>> graph-equivalent to a containing graph (just provide brand new
>> blank nodes, not used anywhere else) so, again, this is not a
>> limitation on how RDF graphs can be formed.
>>
>> Surface RDF syntaxes are required to define how they specify
>> containing graphs. For example, with our current decisions, the
>> containing graph of a dataset is the union of the graphs in the
>> dataset.
>>
>> The merge of a set of graphs is a containing graph comprising the
>> union of all the triples in a set of graph-equivalent graphs, using
>> new blank nodes. (Basically the union, but we allow blank nodes to
>> be re-mapped in a 1:1 fashion in order to fit into a new containing
>> graph.) The merge lemma holds for sets of complete graphs.
>>
>> ------------
>>
>> 3. bnodeID syntactic scopes.  (This is in the current draft of
>> Semantics, text reproduced here.  Scoped graphs as described here
>> are the same as containing graphs as described in (2), and bscopes
>> in (1) are what all the blank nodes identified by bnodeIDs in a
>> single bnodeID scope are in. They all say the same thing, in
>> different ways.)
>>
>> Blank nodes may be identified in a surface (document) syntax for
>> RDF using blank node identifiers. Each surface syntax must specify
>> an unambiguous notion of the **scope** of such identifiers, such
>> that any graph defined by this syntax will be inside a single
>> scope. Two graphs not in the same scope do not share any blank
>> nodes. Each combination of a blank node identifier and a
>> surrounding scope is understood to define a unique blank node,
>> local to the graphs described by the surface syntax. The same blank
>> node identifier used in different scopes identifies a different
>> blank node in each scope in which it occurs.
>>
>> Scope boundaries are defined by the surface syntax used to encode
>> RDF. For example, in RDF/XML [RDF-SYNTAX-GRAMMAR] and NTriples
>> [RDF-TESTCASES], the scope is defined by the document. In TriG, a
>> syntax for RDF datastores, the scope is the entire datastore.
>>
>> The set of all triples in a given scope is called a **scoped
>> graph**. **Every RDF graph described by a surface syntax for RDF
>> must be a subgraph of a scoped graph**.
>>
>> An RDF graph is *complete* when, for every blank node in the graph,
>> the graph contains all triples in the scoped graph which contain
>> that blank node.
>>
>> Merging is taking the union.
>>
>> ------------
>>
>> I hope this helps the WG in its deliberations. I suggest that the
>> second version might be the least painful one for people to
>> swallow, as it introduces the least extra formal machinery, and it
>> allows neat phrasings such as "a subgraph considered as a
>> containing graph" to indicate how graph boundaries are being
>> treated in examples.
>>
>>
>> ------------------------------------------------------------
>> IHMC                    (850)434 8903 or (650)494 3973
>> 40 South Alcaniz St.    (850)202 4416   office
>> Pensacola               (850)202 4440   fax
>> FL 32502                (850)291 0667   mobile
>> phayesAT-SIGNihmc.us
>> http://www.ihmc.us/users/phayes
>>
>>
>>
>>
>>
>>
>
>
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> FOAF: http://www.ivan-herman.net/foaf.rdf
>
>
>
>
>
>
>

-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/

Received on Tuesday, 19 March 2013 16:58:44 UTC