Re: datatypes and MT (#rdfms-graph)

>Pat Hayes wrote:
>>
>>  [...]
>>  >Perhaps with a small change
>>  >of emphasis/style it will look more like a graph:
>>  >
>>  >       Node = symbol U string U bnode
>>  >       Edge = Node x Node x Node
>>  >       Graph = 2^Edge
>>  >
>>  >where symbol is the set of URIs w/fragids; string
>>  >is the set of unicode character sequences; bnode is
>>  >a set disjoint from those two sets, and 2^X
>>  >is notation for "sets of X's".
>>
>>  OK, but you ought to distinguish the nodes from their labels.
>
>Why? I don't have any labels.

Ah, I see, the labels ARE the nodes. That certainly keeps the 
tidiness issues tidy.
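
For readers following along, the quoted definition (Node = symbol U string U bnode, Edge = Node x Node x Node, Graph = 2^Edge) can be sketched roughly as below. This is only an illustration: the names Symbol, BNode and make_graph and the example.org URIs are assumptions made for the example, not anything from the model theory drafts.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Symbol:
    """A URI (possibly with a fragment id); equal URIs are the same node."""
    uri: str


class BNode:
    """A member of a set disjoint from symbols and strings.

    It carries no label at all; two BNodes are equal only if they are
    the very same object.
    """


# Node = symbol U string U bnode (plain Python str stands in for 'string')
Node = Union[Symbol, str, BNode]
# Edge = Node x Node x Node
Edge = Tuple[Node, Node, Node]
# Graph = 2^Edge, i.e. a graph is a set of edges
Graph = FrozenSet[Edge]


def make_graph(*edges: Edge) -> Graph:
    """Build a graph as an immutable set of edges (hypothetical helper)."""
    return frozenset(edges)


b = BNode()
g = make_graph(
    (Symbol("http://example.org/a"), Symbol("http://example.org/p"), b),
    (b, Symbol("http://example.org/q"), "hello"),
)
```

In this reading a node is not something that carries a label; the symbol or string simply is the node, which is the point of "the labels ARE the nodes".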

>>  Blank nodes don't have labels, after all.
>
>*None* of the nodes have labels, by this reckoning.

Well, they do in a sense; they are their own labels, as it were.

>
>[...]
>
>>  >Er... well, I wasn't there when the decision was made, and
>>  >I don't really see why it was made. It seems like a case
>>  >of making something sufficiently fuzzy that nobody can
>>  >disagree with it.
>>
>>  It's not fuzzy; the graph syntax is perfectly well-defined. The great
>>  utility of the graph syntax is that it eliminates the need for
>>  scoping existential variables, because the thing corresponding to an
>>  existential variable is a blank node, and every blank node is unique;
>
>How do you mean? Do you mean something like "we assume, without
>loss of generality, that no two graphs share a blank node"?

No, I mean that the syntactic purpose of existential variables is to 
mark all the 'places' in an expression that are bound by the 
quantifier; and that this role is unnecessary in the graph syntax, 
since there is only one 'place' in the graph, and it is the blank 
node.

>
>If so, that can apply equally well to the bnodes I'm talking about, no?

It could, yes. The particular syntactic labels on those nodes now 
play NO role at all, however, either syntactic or semantic. The only 
possible utility they have in the graph is to confuse the issue when 
two graphs are merged (since one must merge nodes with the same 
uriref, but one must NOT merge bNodes from different graphs, so the 
bNode names will need to be standardized apart in a merge.)
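
A minimal sketch of what standardizing apart would involve, assuming triples are tuples of strings and that bNode labels are the terms starting with "_:" as in N-Triples. The function names, the "g1"/"g2" tags and the ex: terms are hypothetical, chosen only for this illustration.

```python
def is_bnode(term: str) -> bool:
    """Treat any term starting with '_:' as a labelled bNode (assumption)."""
    return term.startswith("_:")


def standardize_apart(graph, tag):
    """Rename every bNode label in `graph` by prefixing it with `tag`.

    Urirefs and literals are left untouched, so they still merge.
    """
    def rename(term):
        return f"_:{tag}_{term[2:]}" if is_bnode(term) else term

    return {tuple(rename(t) for t in triple) for triple in graph}


def merge_labelled(g1, g2):
    """Merge two labelled graphs without letting their bNode names clash."""
    return standardize_apart(g1, "g1") | standardize_apart(g2, "g2")


# Both graphs happen to use the same local name '_:x'.
g1 = {("_:x", "ex:knows", "ex:alice")}
g2 = {("_:x", "ex:knows", "ex:bob")}

naive = g1 | g2                 # plain union wrongly identifies the two '_:x'
merged = merge_labelled(g1, g2)  # renaming keeps them distinct
```

With the plain union, both triples end up with the same subject, which asserts more than either graph said; after renaming, the merge has two distinct bNodes while the uriref ex:knows is still shared.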

>
>>  the question of whether two blank nodes are the 'same' or 'different'
>>  is settled in the syntax itself. Thus, in the graph syntax there are
>>  no bound variables, or local names, or anything at all with a local
>>  scope.
>
>phpht. How are blank nodes not bound variables?

They don't have any lexical form; they are (literally) blank. The 
question of whether they are 'local' or 'global' names does not even 
arise; they are not names at all.

>
>>  This makes the graph syntax much easier for many folk to
>>  understand, apparently, folk for whom the notion of a local name
>>  causes a lot of mental grief (see the mailing archives for evidence,
>>  or recall the best part of a day used up at the F2F arguing over
>>  this);
>
>I don't recall. Again: I wasn't there for the 2nd day.

I was referring to the first day. We got it licked in the morning of 
the second day.

>
>>  and it certainly makes the definitions of things like graph
>>  merging much easier to state, since one doesn't need to get into
>>  issues like renaming, standardizing names apart, etc., etc.
>
>No? Why not? How is it that you conclude that bnodes
>in different graphs are different? I don't see it stated
>in the model theory.

It's in the section on inference. Well, first of all, nodes in 
different graphs just ARE different. The question is whether it is 
kosher to merge nodes when graphs are merged. Simple rule: merge 
urirefs, don't merge anything else.

>
>>  In fact
>>  you don't need to do *anything* to blank nodes in a merge.
>
>Your blank nodes and my bnodes are the same, no?
>Any magic that applies to your blank nodes applies to
>my bnodes equally well, no?

No, because yours are identified by their syntactic form, but must be 
understood to only have a local scope; so you have to worry about 
name clashes when graphs are merged. I don't have to even consider 
that; there are no local names. The things that would be local names 
are BLANK. They DON'T HAVE a name.

>My bnodes are not (necessarily) character sequences or names
>or anything like that; bnode is just a set that's disjoint
>from symbol and string.

??? Oh, then by 'bnode' you don't mean bNode as in nTriples??

I must not be following you, then. What does it mean to allow the 
same thing as an edge and as a node, other than to say that they have 
the same label? For example, consider the following thingie, how 
would you draw it without using some kind of lexical form to indicate 
what was the same as what?

aaa bbb ccc .
bbb bbb ddd .
aaa eee bbb .


>The only difference between the 25 Sep graph syntax
>and what I'm proposing is that I'm cutting out the
>indirection from URI-symbols to graph nodes:
>it's one thing for folks to get used to the way
>URIs denote things; it's another thing altogether
>to say that URIs label nodes and nodes denote things.

Oh, I SEE. Well, why didn't you say so? That is trivial; if you 
prefer to say it that way, we can. (I thought that is how it was 
phrased in the 25 Sept version, in fact.) I put the nodes into the 
current editorial draft in order to handle the range-datatyping 
schemes, but I've put that on hold, for obvious reasons.

>A useless level of indirection, I suggest.
>
>
>>  As soon as
>>  we have any kind of locally scoped names in the graph,
>
>There's only one scope per graph/document in what I'm proposing,
>just like the 25 Sep model theory.

But we have to be careful when we talk of merging graphs, if there 
are things with a local scope identified by their lexical form.

>
>>  this critical
>>  advantage of the graph syntax is lost; and then I think there would
>>  be no advantage to be gained from using the graph syntax, and we
>>  would be better advised to revert to a lexical syntax of some kind,
>>  like Ntriples.
>>
>>  >
>>  >>  It certainly is a rejection, in effect, of the
>>  >>  *reasons* why that decision was made, viz. to get rid of bound
>>  >>  variables (local names, anonymous things that had names anyway,
>>  >>  skolems, whatever you want to call them) from the primary syntax.
>>  >
>>  >Get rid of bound variables? What version of the model
>>  >theory was rid of bound variables?
>>
>>  Every version. There are no bound variables in the graph syntax.
>
>Again: in what way are blank nodes not bound variables?

Again: by not having a syntax.

>
>>  >certainly
>>  >not the version published 25 September 2001:
>>  >
>>  >"This effectively treats all unlabeled nodes as existentially
>>  >         quantified in the RDF graph in which they occur."
>>  >
>>  >       -- http://www.w3.org/TR/2001/WD-rdf-mt-20010925/
>>
>>  It *treats* them as existentially quantified, in the sense that a
>>  faithful translation of the graph syntax back into logic would map
>>  blank nodes (or bNode IDs) into existentially bound variables; but
>>  there aren't any such variables in the graph itself,
>
>again: how are blank nodes not variables?
>
>>  so questions
>>  about what it means for a name to be 'local' do not arise;
>
>sure they do; they're answered ala "each RDF formula/document
>has exactly one scope."
>
>>  issues of
>>  accidental clashes of local names don't arise;
>
>I don't see why not. Again: what magic guarantees that
>blank nodes don't collide when two graphs are merged?

This is like asking why they don't fall in love. What definition or 
process would constitute 'colliding' ? Each graph is a set of nodes 
and arcs, and the (simple) merge is the union of those sets. Now, of 
course, if you merge a graph with *itself*, say, or more generally if 
the two graphs being merged already overlap in some way, then the 
same blank node might arise twice; but it would be the SAME node, 
literally. Two distinct blank nodes never become not distinct during 
a graph merge.
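
As a rough illustration of this argument, assume a blank node is modelled as a bare object with identity and nothing else, and a graph as a set of triples. The class Blank, the function merge and the ex: terms are hypothetical names for this sketch only.

```python
class Blank:
    """A blank node: it has no label, only its identity as an object."""


def merge(g1, g2):
    """The (simple) merge is just the union of the two sets of triples."""
    return g1 | g2


# Two different graphs, each with its own blank node.
b1, b2 = Blank(), Blank()
g1 = {(b1, "ex:knows", "ex:alice")}
g2 = {(b2, "ex:knows", "ex:bob")}

# Distinct blank nodes stay distinct; no renaming step is needed.
m = merge(g1, g2)

# Merging a graph with itself gives back the SAME node, literally.
self_merge = merge(g1, g1)
```

There is nothing here for two blank nodes to collide on: equality is object identity, so the union can neither confuse b1 with b2 nor duplicate b1 when a graph overlaps with itself.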

>
>>  and issues of
>>  determining and recording scopes don't arise.
>
>They do arise, and they are answered; again: there's
>one scope per RDF graph/document/formula.

But in the case of a merging operation, there are two documents 
before the merge and one, or maybe three, afterwards. Somehow we have 
to specify how to construct that larger scope.

>
>>  All labels in every
>>  graph are global in scope (urirefs and literals) and what would be
>>  the 'local' names simply aren't there. Unlabeled nodes don't have
>>  labels, so they don't need a scope.
>
>Somehow the fact that these blank node variables don't take
>the form of symbols/character-sequences means they don't need
>scope?

Right. The only point of scope is to indicate the part of a piece of 
syntax within which, when two different tokens of a symbol occur, 
they mean the same thing. If you use the symbol outside the scope it 
might mean something different, or not mean anything. The point of 
the blank nodes is that there is no need for any such scoping 
mechanism, since the role played by various tokens of the same symbol 
is here played by using - linking to - the node directly. The 
symbol/token distinction isn't needed, and no scoping machinery is 
needed to keep track of where two tokens have the same meaning.

>
>>  >I do think I'm missing the point.
>>
>>  It's really only a technical point about a syntactic style, but I
>>  think it is a useful feature of current RDF that it would be a pity
>>  to abandon without very good reasons. If we do abandon it, the MT and
>>  all the stuff on reasoning and entailment will suddenly get more
>>  complicated both to state and to implement.
>
>I don't see how the model theory would get any more complicated
>if you went with the syntax I'm proposing. I know it would
>get simpler because there would be no indirection from URIs
>to nodes before the interpretation.

Well, the difference between writing  I(E) and I(N) hardly seems like 
a big deal to me.

>Either there's something in the model theory that I don't
>see, or there's something in your mind that you haven't
>written into the model theory.

The MT itself wouldn't get more complicated, true. The statements of 
the entailment relations would, however: they would have to talk 
explicitly of standardizing names apart.

Pat


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes

Received on Thursday, 15 November 2001 16:41:35 UTC