Re: A different take on b-scopes (ISSUE-107) from Antoine Zimmermann on 2012-11-23 (public-rdf-wg@w3.org from November 2012)

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Fri, 23 Nov 2012 08:19:33 +0100
To: Richard Cyganiak <richard@cyganiak.de>
CC: RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <50AF2385.2010901@emse.fr>
Richard,


There are flaws in your proposal and you do not want to admit it.
Let us go through it carefully and see what the issues are. I have put a 
proposal at the end that tries to keep as much of your text as I could.

"""
A scope has an associated 1:1 mapping (bijection) between the set of all 
blank node identifiers and a set of blank nodes.
"""

So, given a scope S, there is a mapping m(S) from all bnode IDs (UNICODE 
strings or a subset of it) to a set b(S). Since it is 1 to 1, we can 
also talk about the inverse mapping M(S) from b(S) to UNICODE.
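
In symbols, this reading can be restated as follows. This is only a notational summary of the paragraph above; the name ID for the set of all blank node identifiers is an assumption introduced here, not part of the quoted proposal.

```latex
m(S) : \mathrm{ID} \to b(S) \ \text{is a bijection},
\qquad
M(S) = m(S)^{-1} : b(S) \to \mathrm{ID}
```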

"""
Scopes are subject to the following rules:
  - The sets of blank nodes in any two scopes are disjoint.
"""

Is "a bnode in a scope" an element of b(S)? Let us assume it is, 
otherwise it is ISSUE-b1.

"""
A fresh blank node is any blank node that is not yet used within its scope.
"""

So, there are bnodes that are used in a scope. Let us call the set of 
used bnodes in S, u(S). Is u(S) a subset of b(S)? Probably, otherwise it 
is ISSUE-b2. A fresh bnode is not in u(S), ok. Must it be in b(S)? Can 
it be in, say b(S')? This is ISSUE-b3.
Moreover, "is not yet" suggests that the set is subject to change. So 
it's mutable, probably.

"""
An RDF graph is copied into a scope by replacing each blank node in the 
graph with a fresh blank node in the target scope.
"""

Let us consider graph G and let us say it is copied into S. What does it 
mean? It seems that you are somehow defining a new graph G', isomorphic 
to G but with fresh bnodes of S, so bnodes that are not in u(S). It 
looks like you are talking about "a copy of G to S", which would yield 
an RDF graph, but since you say "is copied into" it sounds more like a 
modification of a state. I don't know. This is ISSUE-b4.

"""
Occurrences of one blank node in multiple triples are all replaced with 
the same fresh blank node.
"""

Well, I don't know why the notion of occurrence suddenly appears. We were 
talking about RDF graphs, not concrete representations. This is ISSUE-b5.

"""
If none of the source's blank node identifiers are used in the target 
scope, copying into a scope can be achieved by simply re-using the same 
blank node identifiers in the new scope.
"""

So now, there are used bnode IDs as well. Let us call them uid(S). So I 
guess this means that uid(S) is the set of bnode IDs that are mapped to 
u(S) via m(S). Which supports the idea that u(S) is a subset of b(S).

There is also a notion of the source's bnode IDs. But bnode IDs only exist 
in scopes, not in RDF graphs. So I guess this means that a graph is 
"copied" *from* a scope into another scope. But what the hell does 
"re-using a bnode ID" mean in this context?
I'll try to figure it out: let us write b(G) for the set of bnodes in an 
RDF graph G. Let us assume b(G) is a subset of u(S). Then the quoted 
sentence could be reformulated as (take a breath):

<< If uid(S) is disjoint with uid(S'), then the copy of G from S into S' 
is a graph G' such that the isomorphism from G to G' maps each bnode n 
of b(G) to a bnode n' of b(G') with M(S)(n) = M(S')(n'). >>

Phew! But this is a wild guess, so I'm not at all sure I got it right. So 
let us call this ISSUE-b6.
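
One possible reading of the reformulation above, as a minimal Python sketch. Everything in it is an assumption made for illustration: a blank node is modelled as a (scope name, identifier) pair, so that M(S)(n) is simply n.ident, and the names BNode and copy_reusing_ids are hypothetical. It is not the definition from the quoted proposal.

```python
from typing import FrozenSet, NamedTuple, Tuple, Union


class BNode(NamedTuple):
    """Assumed model: a blank node is a (scope name, identifier) pair."""
    scope: str  # name of the scope S the node lives in
    ident: str  # M(S)(n): the identifier of the node in S


Term = Union[str, BNode]          # IRIs and literals are plain strings here
Triple = Tuple[Term, Term, Term]


def copy_reusing_ids(
    graph: FrozenSet[Triple],
    source: str,
    target: str,
    uid_source: FrozenSet[str],
    uid_target: FrozenSet[str],
) -> FrozenSet[Triple]:
    """Copy `graph` from scope `source` into scope `target`, keeping identifiers.

    Only allowed when uid(S) and uid(S') are disjoint, as in the reformulation.
    """
    if uid_source & uid_target:
        raise ValueError("identifier clash; fresh identifiers are needed")

    def move(term: Term) -> Term:
        # Map each bnode n of the source to n' with M(S)(n) = M(S')(n').
        if isinstance(term, BNode) and term.scope == source:
            return BNode(target, term.ident)
        return term

    return frozenset((move(s), p, move(o)) for s, p, o in graph)


g = frozenset({(BNode("S", "x"), "ex:knows", BNode("S", "y"))})
g_copy = copy_reusing_ids(g, "S", "S2", frozenset({"x", "y"}), frozenset({"z"}))
```

Under this model the copy is isomorphic to the original by construction, because the renaming only swaps the scope name and leaves every identifier in place.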

"""
The merge of two RDF graphs is the result of copying both graphs into a 
target scope.
"""

So, this looks like we are kind of generalising the notion of copy to a 
pair of RDF graphs (and therefrom, to an arbitrary set of graphs).

"""
The result is a single graph
"""

This suggests that the notion of copying is in fact defining what *a 
copy* is.

"""
The result is a single graph where all blank nodes are in the same 
scope, and where any blank node identifiers that occurred in both input 
graphs have been replaced in order to avoid clashes.
"""

I really don't understand what this is saying at all. What is the result 
exactly? This is ISSUE-b7.

But anyway, "merge" shouldn't belong to this section. It is completely 
independent of the notion of scope and bnode ID.

Here is a concise definition:

<< A merge of two RDF graphs G and G' is the union of two RDF graphs H 
and H' such that H and H' do not share blank nodes, H is isomorphic to 
G, and H' is isomorphic to G'. >>

...and then specify that all merges of G and G' are isomorphic so that 
we can usually talk about "the" merge.
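
The concise definition above can be sketched in Python without any reference to scopes or identifiers. This is an illustration under stated assumptions: blank nodes are modelled as BNode objects carrying an arbitrary label, and the helper names relabel and merge are hypothetical. The sketch builds one particular merge by tagging the blank nodes of each input so that the two isomorphic copies H and H' cannot share any.

```python
from dataclasses import dataclass
from typing import Hashable


@dataclass(frozen=True)
class BNode:
    """Assumed model: a blank node is an opaque object with some label."""
    label: Hashable


def relabel(graph, tag):
    """Return a graph isomorphic to `graph` whose bnodes carry `tag`."""
    def rename(term):
        # Injective on blank nodes, identity on everything else.
        return BNode((tag, term.label)) if isinstance(term, BNode) else term

    return frozenset((rename(s), p, rename(o)) for s, p, o in graph)


def merge(g, g_prime):
    """Union of H and H', isomorphic to g and g_prime, sharing no bnodes."""
    return relabel(g, 0) | relabel(g_prime, 1)


g = frozenset({(BNode("x"), "ex:p", "ex:a")})
g_prime = frozenset({(BNode("x"), "ex:p", "ex:b")})

merged = merge(g, g_prime)   # two distinct blank nodes
union = g | g_prime          # one shared blank node
```

The contrast between merged and union shows why the definition requires H and H' not to share blank nodes: the plain union identifies the two nodes labelled "x", while the merge keeps them apart.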

===========

Here is yet another proposal, with less formalism:

New assumption: I make everything immutable, just talk about sets and 
mappings.

"""
A /blank node identifier/ is a Unicode string that identifies a blank 
node within some local context, called a scope. A scope has:
  - an associated 1:1 mapping (bijection) between the set of all blank 
node identifiers and a set of blank nodes;
  - and a finite set of /used blank nodes/, associated with their used 
blank node identifiers.

Scopes are subject to the following rules:
  - the sets of blank nodes in the mappings of any two scopes are disjoint;
  - every RDF document forms its own scope;
  - scope boundaries outside of RDF documents (for example, in RDF 
stores) are implementation-dependent;
  - other specifications MAY impose additional rules, including 
constraints on the syntax of a scope's blank node identifiers.

A /fresh blank node/ is any blank node that is not used within its scope.

An RDF graph is said to /belong to a scope/ if its bnodes are in the set 
that the scope maps to.

A /copy/ of an RDF graph into a (target) scope is an RDF graph that can 
be obtained by replacing the blank nodes of the source graph by fresh 
blank nodes in the target scope.
"""
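
The definitions in this proposal can be sketched with immutable values only. The following Python is a minimal illustration, not a normative reading: modelling a blank node as a (scope name, identifier) pair, the class and function names, and the "b0", "b1", ... naming of fresh identifiers are all assumptions. Disjointness between scopes holds here only as long as scope names are distinct.

```python
from dataclasses import dataclass
from itertools import count
from typing import FrozenSet, NamedTuple


class BNode(NamedTuple):
    scope: str  # name of the owning scope
    ident: str  # blank node identifier within that scope


@dataclass(frozen=True)
class Scope:
    """Immutable scope: a bijection plus a finite set of used identifiers."""
    name: str
    used_ids: FrozenSet[str] = frozenset()

    def node(self, ident: str) -> BNode:
        # The 1:1 mapping from identifiers to blank nodes of this scope.
        return BNode(self.name, ident)

    def is_fresh(self, n: BNode) -> bool:
        # Fresh: a blank node of this scope that is not used in it.
        return n.scope == self.name and n.ident not in self.used_ids


def bnodes(graph):
    """All blank nodes occurring in the triples of `graph`."""
    return {t for triple in graph for t in triple if isinstance(t, BNode)}


def belongs_to(graph, scope: Scope) -> bool:
    return all(n.scope == scope.name for n in bnodes(graph))


def copy_into(graph, target: Scope):
    """Return a copy of `graph` whose blank nodes are fresh in `target`."""
    fresh_ids = (f"b{i}" for i in count() if f"b{i}" not in target.used_ids)
    mapping = {n: target.node(next(fresh_ids)) for n in sorted(bnodes(graph))}
    return frozenset(
        tuple(mapping.get(t, t) for t in triple) for triple in graph
    )


s = Scope("S", frozenset({"b0"}))
g = frozenset({
    (BNode("doc", "x"), "ex:p", BNode("doc", "y")),
    (BNode("doc", "x"), "ex:q", "ex:o"),
})
g_copy = copy_into(g, s)
```

Nothing is modified in this sketch: copy_into returns a new graph and leaves both the scope and the source graph untouched, which matches the "sets and mappings" reading of the proposal.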

And we may add:

"""
A /concrete RDF graph/ is an RDF graph having its blank nodes identified 
by blank node identifiers in a known scope.

[[Note: copying a concrete RDF graph from its scope to another scope 
amounts to making a concrete RDF graph which contains unused 
identifiers. If the identifiers in the original concrete RDF graph are 
not used in the target scope, then the same identifiers can be used in 
the copy.]]

[[Note: <a href="definition-of-merge">Merging</a> can be understood as a 
copy operation, even though it is abstractly defined independently of 
scopes and blank node identifiers.]]
"""



Best,
AZ.


On 22/11/2012 13:12, Richard Cyganiak wrote:
> Hi Antoine,
>
> On 22 Nov 2012, at 09:28, Antoine Zimmermann wrote:
>> Yes, it's going in the right direction and I like it much better than
>> before. But there are still some issues: the proposal has some unstated assumptions that make it a bit sloppy.
>>
>> 1. A scope is mutable. Bnode IDs can be added to it, thus the notion of fresh bnodes;
>
> No, a scope is not mutable. It's a bijection between *all* blank node identifiers and some set of blank nodes. “Using” a blank node stops it being fresh, but doesn't modify the scope.
>
> Making scopes mutable means that now you will have people who ask how to delete a blank node from a scope, and you need to put constraints to stop people from re-assigning a blank node identifier to a different blank node. Let's *please* not go there.
>
> (The reason you want mutability is because you don't like how “freshness” is defined. I know that the definition of “fresh” is mathematically sloppy, but it is perfectly comprehensible, therefore I object to making it more complicated just to please mathematical aesthetics. I like precision, but this is becoming formalism for the sake of formalism. I'm happy to change the definitions of “copy” and “merge” to something that is declarative and doesn't rely on “freshness” if anyone can propose wording that works.)
>
>> 2. A scope is associated to an RDF graph, thus the notion of copying a graph into a scope, and merging towards a scope.
>
> No, a scope is not associated to an RDF graph. The notions of copying and merging are really operations on sets of blank nodes, and not on graphs. It just so happens that the only sets of blank nodes that are ever interesting are those contained in a particular graph, hence we define the copy and merge of graphs, not the copy and merge of blank node sets. Scopes are associated with blank nodes, not graphs.
>
> Most crucially, any number of graphs can be formed from the blank nodes in any given scope. For example, given a graph G whose blank nodes are all in scope S, the blank nodes of any subgraph of G are supposed to be still in S, but they can't in your proposal because it's now a different graph, hence different scope, hence disjoint set of blank nodes. Another example is RDF datasets: A TriG document, being an RDF document, is a scope and may contain many graphs.
>
>> I had a hard time making sense of the two paragraphs before the note but here is a proposal. At some places it may be a bit too heavy in trying to be precise, so we can consider removing parts if accepted.
>
> Well, all the detail is there because others complained that it wasn't precise enough.
>
>> """
>> A /blank node identifier/ is a Unicode string that identifies a blank node within some local context, called a /scope/. A /scope/ is a mutable entity that comprises:
>> - a finite set of /blank node identifiers/;
>> - an RDF graph;
>> - a 1 to 1 mapping (bijection) between the set of identifiers and the set of blank nodes in the RDF graph.
>>
>> Scopes are subject to the following constraints:
>> - in any state of affairs, different scopes map their identifiers to disjoint sets of blank nodes;
>> - every RDF document forms its own scope, where the RDF graph of the scope is the one serialised in the document;
>> - scope boundaries outside of RDF documents (for example, in RDF stores) are implementation-dependent;
>> - other specifications MAY impose additional rules, including constraints on the syntax of a scope's blank node identifiers.
>>
>> If a scope maps a blank node identifier to a given blank node, the identifier is said to /identify/ the blank node. A blank node that is identified by a blank node identifier in a scope is said to /belong/ to the scope.
>>
>> A /fresh blank node/ is a blank node that does not belong to any scope.
>>
>> A /copy/ of a given RDF graph is an isomorphic RDF graph that only contains fresh blank nodes. An RDF graph is /copied into a scope/ by adding all the triples of a copy of the graph to the target scope's graph, and extending the mapping by introducing new identifiers mapped to the fresh nodes. If the given RDF graph belongs to a scope (its source), and none of the source's blank node identifiers are used in the target scope, copying into a scope can be achieved by simply re-using the same blank node identifiers in the new scope.
>>
>> The merge of two RDF graphs can be obtained by copying both graphs into a target empty scope. In this case, the merge will be the target scope's RDF graph after the copies.
>> """
>
> Thanks for taking the time to write this up. But I think it doesn't work, for the two reasons stated above: If you want mutability then you need to place constraints on it (and I doubt that you want mutability); and a blank node must be allowed to occur in any number of graphs (but only in one scope).
>
>> Remark: in RDF 2004, merge is a math operation, so it does not involve changes of state, copy, etc. It's also a "semantic" operation, in the sense that the merge of a set of graphs is the only RDF graph (up to isomorphism) that is simple-equivalent to the set of graphs.
>>
>> If we keep it this way in RDF 1.1, and I hope we do, then what RDF Concepts says about merge should not be presented as a definition but rather as a way to *do* a merge. Thus, my words say "the merge can be obtained by etc."
>
> RDF 2004 actually *defines* merge by saying “it is obtained by”.
>
> The distinction you draw between “semantic” and “non-semantic” operations is spurious. If you want to draw such a distinction, it should be between operations that are defined with respect to an entailment regime, like entailment and equivalence and consistency. There is nothing particularly “semantic” about an operation or relationship that only holds in simple entailment.
>
> (If the B-Scopes proposal is adopted, then merge and union, if used appropriately as described in RDF 2004, are equivalent anyway. Per RDF Semantics, the use of the merge is appropriate only when graphs come from different sources, and per the B-Scopes proposal, they have disjoint sets of blank nodes in that case. Hence the merge *is* the union. So we might just as well define the merge as *being* the union of two graphs, with a note saying that if you want a single set of blank node identifiers to uniquely refer to them, which you usually want in practice, then you need to copy that union into some scope; and another note pointing out that this was all a bit more complicated back in 2004.)
>
> Best,
> Richard
>
>
>
>>
>>
>> AZ
>>
>> On 22/11/2012 00:48, Richard Cyganiak wrote:
>>> So here's a modified proposal. (The old one is still further down on
>>> the same page.)
>>> http://www.w3.org/2011/rdf-wg/wiki/User:Rcygania2/B-Scopes
>>>
>>> What this does:
>>>
>>> * Takes an old 2004-style definition of blank nodes
>>> * Adds a new subsection on “blank node identifiers and scopes”
>>> * Defines scopes more formally by saying that they have an associated
>>>   “1:1 mapping (bijection) between blank node identifiers and blank nodes”
>>>
>>> The goal was to make scopes an add-on to the definition of blank
>>> nodes, rather than baking them right into the definition. I may be
>>> wrong but that seemed to be at the heart of both Antoine's and Andy's
>>> concerns.
>>>
>>> If this changes anyone's view of the whole thing (in a good or bad
>>> direction), then please comment.
>>>
>>> The new proposal keeps the following bit, which Antoine and Andy may
>>> also have objected to, but which for me is the key sentence to the
>>> whole endeavour:
>>>
>>> “The sets of blank nodes in any two scopes are disjoint.”
>>>
>>> If you think that this sentence shouldn't be there, then I'd really
>>> like to hear the case argued, because I don't understand the reason
>>> for this objection.
>>>
>>> Best, Richard
>>>
>>
>>
>> --
>> Antoine Zimmermann
>> ISCOD / LSTI - Institut Henri Fayol
>> École Nationale Supérieure des Mines de Saint-Étienne
>> 158 cours Fauriel
>> 42023 Saint-Étienne Cedex 2
>> France
>> Tél:+33(0)4 77 42 66 03
>> Fax:+33(0)4 77 42 66 66
>> http://zimmer.aprilfoolsreview.com/
>>
>
>
>


-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Friday, 23 November 2012 07:20:04 UTC