RE: motivation for bNodes/existentials in RDF; note for parsers from Pat Hayes on 2002-04-05 (w3c-rdfcore-wg@w3.org from April 2002)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Fri, 5 Apr 2002 17:45:42 -0600
To: "Massimo Marchiori" <massimo@w3.org>
Cc: <w3c-rdfcore-wg@w3.org>, "Lynn AndreaStein" <las@olin.edu>, "Dan Connolly" <connolly@w3.org>, "Jan Grant" <Jan.Grant@bristol.ac.uk>
Message-Id: <p05101512b8d3dc988169@[65.217.30.94]>
>Pat, thanks for your reply, which in fact helped to better
>understand what your assumptions are.

I wish I could say the same about yours :-)

>>  ><disclaimer>
>>  >In this kind of sentence I always have to argue with Dan, as
>>  >he always says so and I always reply in the usual way...
>>  >So, restating the above, the *current version of the
>>  >RDF Model theory* states that the interpretation of RDF
>>  >ought to be ... bla.
>>  >This is important to remember, as it's a fundamental design choice
>>  >that is going to be decided, but it's not present in the
>>  >"normative RDF" (M&S) document.
>>  ></disclaimer>
>>
>>  Well, the normative M&S seems pretty clear about the intended status
>>  of anonymous nodes. I can see why, for some purposes, it would be
>>  convenient if anonymous nodes had hidden labels; but that
>>  interpretation is wishful thinking imposed on the M&S, which is about
>>  as unambiguous as it is possible to be without actually giving a
>>  formal model theory.
>In fact, I profoundly disagree here. If there's one thing on which M&S is very
>vague, it is anonymous nodes. Indeed, reading the spec carefully,
>one will see that, in fact, this vagueness can be somehow justified,
>as anonymous nodes are there seen just as accessories, and not as a
>fundamental component of M&S: a facility to avoid having to assign
>names, but just a facility.

I honestly do not see this interpretation justified by anything in 
the wording of the M&S. Anonymous nodes seem to play a central role 
in the document. They occur in many of the examples given, and are 
central to the accounts of containers and reification. And if there 
is anything that is surely clear about them, it is that they do not 
have labels; they really are *blank*.

>Then, we can debate at length on how the
>spec should have been clearer on this (and where the architectural
>pitfalls are). We can even decide M&S was not smart and that we need to
>redefine and introduce first-class anonymous nodes (not a bad idea), but the
>distinction should be clear between what *we* are saying and what's written
>in the normative spec

I agree, but I think that *your* interpretation of the M&S is quite 
idiosyncratic and unusual; I have never heard it before in any of the 
ongoing discussions, in any case. I don't think that allowing blank 
nodes in the RDF syntax can possibly, under even the wildest 
interpretation, be called 'redefining' RDF or 'introducing' anything 
to it. (I do not know what you mean by 'first-class', so I cannot 
speak to the implied criticism that we have been guilty of class 
elevation.)

>(the only normative thing we have so far).
>And above all, this has to be pointed out because of the charter
>implications
>it has...
>
>Anyway, this is leading away from the other point of the discussion: the
>existential interpretation.
>Now, before we get lost here, just a note: no fundamental objections
>to the existential interpretation per se, as it's in the MT, and the MT
>is (citing the abstract) "a model-theoretic semantics for RDF and RDFS",
>(so, "a", not "the"). But, as this point has never been well clarified so
>far, if the "a" changes into a "the",

I presume that the intent of the WG is that once completed and in its 
final form, the MT will be considered definitive, not merely an 
option. It is the clearest statement of the meaning of RDF(S).

>then the level of criticisms
>radically changes, and that's where more severe stop-over criticisms can
>occur.
>So, let's go on:
>
>>  Logically, skolemization is
>>  quite complicated. It isn't valid, for example, and it blocks several
>>  useful intuitive inferences. It would be impossible to express
>>  queries in RDF if it had no variables, for example.
>Yes, skolemization can be as complex as existential quantification. Even,
>mea culpa for using the word "skolemization" here. Sure, it's
>better to have variables to do RDF query (not impossible, btw).
>But all this is about different things: inference and RDF Query.
>RDF Core was not chartered to normatively do these two, and not chartered to
>do the "web logic".
>There's nothing like a "variable" in RDF.

Technically you are right, there are no variables in RDF. The blank 
nodes in an RDF graph are not variables, and one of the great 
features of the graph syntax is that it provides the expressive power 
of a quantifier without actually having a quantifier in the syntax. 
As the MT document is careful to emphasize, the node IDs used in 
N-triples are not themselves part of the RDF graph syntax.
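
As an illustration of how a graph can carry the force of a quantifier without any quantifier in its syntax, consider a two-triple graph sharing one blank node. The rendering below is only a sketch in ordinary predicate-logic notation; the names a, b, p, q and the node ID _:x are hypothetical and stand for arbitrary urirefs and an arbitrary blank node.

```latex
\[
\{\, (\mathtt{\_:x},\ p,\ b),\ \ (a,\ q,\ \mathtt{\_:x}) \,\}
\quad\longmapsto\quad
\exists x\, \bigl( p(x, b) \wedge q(a, x) \bigr)
\]
```

The node ID _:x appears only in the written-down form of the graph; on the right it becomes a bound variable whose scope is the whole graph.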

>The moment you put in variables,
>existential quantification, entailment, you're trying to do RDF 
>logic, which is a different wg (and, a different spec than M&S).

I don't agree that the interpretation of blank nodes is 'trying to do 
logic'. It is merely a clarification of the most natural 
interpretation of the apparent intent of the RDF M&S. And to 
emphasize, we have not 'put' variables into RDF; we have not altered 
RDF syntax at all, in fact. We have simply given a mathematical - ie, 
precise - account of the meanings of the syntactic constructs that 
are already in the language.

>What is needed is just the minimum clarifications (and patches) to M&S model
>that complete the
>formalization vagueness.

Quite.

>Doing (normatively) the RDF Logic requires another
>wg, or a
>different charter.

I am not sure what you mean by 'doing the RDF Logic'. Giving a 
precise semantics to an assertional language can always be described 
as 'doing a logic' in some sense. We have been at pains to avoid 
extending the expressive power of RDF in any significant way, even 
when it was excruciatingly obvious that it badly needs to be 
extended. The section on entailment in the MT document was put there 
largely for pedagogic reasons, to explain how the semantic notions 
'cash out' as relationships which can be used to justify inferences 
of various kinds. In the next draft I will insert more cautionary 
wording to try to emphasize that this is not intended to be a 
description of a logic in any functional sense.

>So, now the higher level of criticism can be better understood.
>If MT is just a proposal, fine (even more: good! as, it helps to
>provide a possible good starting point for a next RDF Logic/Query wg).
>If on the other hand, it aims to define now the normative RDF Logic,
>then I think it's truly beyond scope.

I'm not sure what you mean by 'the normative RDF logic'. Any
semantics for an assertional language will define a notion of 
entailment and hence a 'logic' in some sense. The MT does not define 
any inference rules or proof theory for RDF, but it does of course 
define a notion of semantic entailment. It could hardly fail to, by 
its very nature. If this much clarity offends you, I'm not quite sure
what you expected the WG to produce by way of clarification of the 
'model'.

>Its formal status is unclear now, so better clarify it.
>
>-M
>
>
>ps
>Replies to the other, more precise points that have been asked are made
>secondary (and trivial) by the above discussion, as they
>deal with the "existential vs skolem" within a full RDF logic/query
>context, which is another level of discussion (interesting, but another
>one).
>However, they are included for completeness:
>
>>  >That is, what are the pros and cons that favour the existential
>>  >approach vs the skolem one?
>>  >AFAIK the second one has been so far the natural choice (the
>>  >"understood standard" if you want ;), for some good reasons.
>>
>>  Which are? (I know some, but I wonder if you have others.)
>Just one (others too, but not pertinent to this discussion): doing
>existential
>interpretation, you're introducing variables and more powerful logic.

Sorry, that is a misapprehension. See above. And in any case, it does 
not address the question.

>Not
>what M&S does.
>
>>  >>  At the first WG F2F we had a long (and, i think, productive) argument*
>>  >>  about this. Sergei produced a good set of pros and cons; my arguments
>>  >>  for this are ...
>>  >>
>>  >>  - supports "non-assertional" mode, ie, RDF querying by turning around
>>  >>    the "X entails what?" into "what entails X?"
>>  >>
>>  >>  - aesthetic reasons, and those of transparency. When I write an
>>  >>    assertion with a blank node, I intend it to mean "there exists...".
>>  >>
>>  >>  - DanC also claimed that skolemisation was too much of a general
>>  >>    impediment to getting software written :-) I think he may have
>>  >>    been dramatising for, well, dramatic effect, but I've some sympathy
>>  >>    with this POV. In other words, supporting anonymous nodes requires
>>  >>    some API fiddling, but is not necessarily a "simpler mechanism".
>>  >
>>  >Thanks for the initial reply, Jan.
>>  >I don't want to start a complex debate without first having seen
>>  a complete
>>  >reply, just noticing that this is such a fundamental
>>  architectural decision
>>  >that a complete and careful pro/con analysis is due. The above
>>  three "pro"
>>  >reasons
>>  >all have good counterarguments,
>>
>>  I'd be interested in hearing them.
>
>"non-assertional": this doesn't make a difference for M&S, as this
>functionality goes into the RDF Query, not in RDF.

But an analysis of the query language will presumably make some 
reference to the meaning of the language being queried, and the 
relationship between the assertions used to answer the query and the 
answer itself. While I agree that the actual design of a query 
language is clearly out of scope for the WG, to look ahead a little 
to foresee the likely needs is not out of scope.

>"aesthetic": sure, but this doesn't imply you have to impose a normative
>existential quantification already at the M&S level.
>
>"general impediment": this refers to the semweb/RDF logic and query, not to
>the M&S
>(in fact, without variables and quantifications, it's much easier to write
>software ;)

That depends entirely on what you want the software to be able to do. 
In any case, you are free to write code which treats blank nodes as 
having labels. It will draw correct conclusions, but not so many of 
them.

>>  For me, the fundamental motivation for not skolemizing blank nodes
>>  concerns entailment. Consider an RDF graph G1 and another graph G2
>>  got from G1 by erasing some of the node labels, ie replacing urirefs
>>  with blank nodes. Does G1 entail G2? Seems to me that the answer has
>>  to be 'yes' in order to capture the intended meaning of blank nodes
>>  as described in the M&S. If blank nodes are hidden skolemizations,
>>  however, the answer is 'no'. So skolemized graphs do not adequately
>>  support the proper RDF entailment conditions.
>You're talking about entailment and RDF logic here, not M&S.

I fail to see how a language with a precise semantics can fail to
have *some* entailment conditions. Whether you classify this as 
'logic' is irrelevant, because meaningless.
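
The G1/G2 argument quoted above can be checked mechanically. What follows is a minimal Python sketch, not taken from the MT document. It assumes that triples are plain tuples of strings and that blank nodes carry an N-Triples-style `_:` prefix. The function names `entails_existential` and `entails_labelled` and the `ex:` identifiers are invented for illustration. The existential check looks for a mapping of G2's blank nodes onto terms of G1 that turns G2 into a subset of G1; the labelled check treats blank node IDs as ordinary names.

```python
from itertools import product


def is_blank(term):
    """Assumed convention: blank node IDs start with '_:'."""
    return term.startswith("_:")


def entails_existential(g1, g2):
    """True if some assignment of g2's blank nodes to terms of g1
    makes every triple of g2 a triple of g1."""
    blanks = sorted({t for (s, _, o) in g2 for t in (s, o) if is_blank(t)})
    terms = sorted({t for (s, _, o) in g1 for t in (s, o)})
    for choice in product(terms, repeat=len(blanks)):
        mapping = dict(zip(blanks, choice))
        if all((mapping.get(s, s), p, mapping.get(o, o)) in g1
               for (s, p, o) in g2):
            return True
    return False


def entails_labelled(g1, g2):
    """Treat blank node IDs as ordinary names: plain subset test."""
    return g2 <= g1


# G2 is G1 with the subject label erased (replaced by a blank node).
G1 = {("ex:a", "ex:p", "ex:b")}
G2 = {("_:x", "ex:p", "ex:b")}

print(entails_existential(G1, G2))  # True
print(entails_labelled(G1, G2))     # False
```

Under these assumptions the labelled check never accepts anything the existential check rejects, because a subset is a special case of such a mapping. It only misses conclusions such as G1 entailing G2.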

>Nobody
>prevents you from introducing existentials at a higher level (and,
>within the RDF logic wg...).
>
>>  It would be fine for blank nodes to have some kind of
>>  implementation-dependent hidden labels - in effect, that is what the
>>  bnode labels in Ntriples are - but the critical point is that these
>>  labels are not treated like urirefs; they have no global scope, and
>>  are not meaningful outside the graph, and are not treated
>>  semantically as names.
>Yes, and that's what needs to be clarified in the M&S. But no need to
>introduce existentials and variables to do so...

This is a pointless distinction. If you have locally scoped names 
which can be substituted without changes of meaning in the enclosing 
expressions, then you have already introduced existentials and 
variables. Whether you call them by those names, or what notation you 
use, are superficial matters of nomenclature.
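
The point about substitution can be written out in the usual predicate-logic notation. This is a sketch only; p, b, c_1 and c_2 are hypothetical names. Renaming a locally scoped name leaves the meaning unchanged, which is exactly the behaviour of a bound variable, whereas exchanging one global name for a different one in general does not.

```latex
\[
\exists x\, p(x, b) \;\equiv\; \exists y\, p(y, b)
\qquad\text{whereas}\qquad
p(c_1, b) \;\not\equiv\; p(c_2, b) \quad (c_1 \neq c_2)
\]
```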

>
>>  RDF could be redefined without blank nodes, of course, but it would
>>  be a different, and much weaker, language. In effect, it would be a
>>  simple positive propositional logic, with no quantification.
>>  Skolemized nodes are not blank, so the whole concept of anonymous
>>  nodes would be eliminated from the language, rendering it simpler in
>>  both its syntax and its semantics.
>Again, here the discussion is on how to better do a powerful RDF logic.
>And as it's not in M&S, the architectural possibility to drop blank nodes
>had in fact to be considered

Well, we could, and indeed did, consider dropping blank nodes 
altogether. But that would have been a major change to the language, 
and so seemed (and still seems to me) to be outside our charter.

>  (..!)
>
>>  I agree that this is a fundamental
>>  architectural decision, but it seems to me that the M&S has already
>>  clearly made the decision: the language does contain blank nodes.
>>  Given that decision, skolemization is not an option. The milestone
>>  decision was taken long ago, and all we are doing is stating it more
>>  precisely.
>As said at the beginning, this is not true. M&S introduces blank nodes
>as a facility.

You keep saying that, but I don't see any evidence for that 
interpretation in the M&S itself or in what the original authors have 
told us about their intentions.

>Then, it can be convenient to give them first-class
>status in the model. But that's all, the moment you go beyond with
>variables,
>it's no M&S any more, it's not RDF core any more.

All we have done is to recognize that the blank nodes, as you say, 
have "first-class status". Seems to me that is about all that they 
can have: I'm not sure what other kind of status you have in mind. We
have not gone 'beyond' this in any way. We have not introduced
variables or quantifiers into RDF, and have not extended the syntax 
of RDF.  So I don't quite see what it is that you are getting so 
worked up about.

Pat Hayes



-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Friday, 5 April 2002 18:45:47 UTC