Re: RDF priorities

>Per telecon call,
>
>My view (well known, I suspect ;-) is that the core "model and 
>abstract syntax" are the most important parts of RDF to get nailed 
>down.
>
>In my view, every other part of RDF is critically dependent on this, 
>so we need to get it nailed down, and solidly.
>
>By "solidly", I mean that we have clearly defined what can and 
>cannot be understood about the meaning of an RDF expression.  This 
>would preferably be underwritten by a formalism of some kind, but the 
>document will also need clear prose for developers who don't/won't 
>delve into too much formal mathematics.

I agree wholeheartedly. I would add a few more things that I still 
find confusing.

1. What exactly IS an RDF expression? (Is it a set of triples forming 
an abstract graph, or something textual, ie a parsing of a character 
stream, or what? I know these can be translated into one another and 
so on, but I think it would be good to get a single clear choice as 
being the definitive notion, and then nail it down as precisely as 
possible. I had formed the impression that sets of triples in an 
abstract graph were the definitive syntax for RDF, but much of the 
recent containers discussion has referred to lexical orderings, which 
are meaningless in an abstract graph, leaving me puzzled.)
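
To make the "abstract graph" reading concrete, here is a minimal Python sketch. It assumes that a graph is nothing more than a set of (subject, predicate, object) tuples. The names Triple and graph_from_lines are hypothetical, and the whitespace splitting is a toy stand-in for a real parser. It only illustrates that two character streams listing the same triples in a different order yield the same graph, so lexical order carries no information on this reading.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical representation: a triple is a (subject, predicate, object) tuple.
Triple = Tuple[str, str, str]


def graph_from_lines(lines: Iterable[str]) -> FrozenSet[Triple]:
    """Toy reader: each line is 'subject predicate object .' split on whitespace."""
    triples = set()
    for line in lines:
        parts = line.split()
        if len(parts) == 4 and parts[3] == ".":
            triples.add((parts[0], parts[1], parts[2]))
    return frozenset(triples)


doc_a = [
    '_:b <rdf:type> <rdf:Bag> .',
    '_:b <rdf:_1> "1" .',
    '_:b <rdf:_2> "11" .',
]
doc_b = list(reversed(doc_a))  # same triples, different lexical order

# Set equality ignores the order in which the triples were written down.
same_graph = graph_from_lines(doc_a) == graph_from_lines(doc_b)
```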

2. Is RDF seen primarily as an assertional language (which expresses 
propositions about resources), or as a programming or processing 
language? I am sorry if this question seems exasperating, especially 
to the logic programmers among us, but it has a direct bearing on the 
appropriate semantic style one should adopt. Assertional languages 
typically have a semantics which is deliberately 'catholic' in what 
it allows into the universe of discourse, while programming language 
semantics are usually based on recursively closed universes of data 
objects that can be computed. We can try to do both at once, but that 
is a much trickier technical task and life would be easier if we 
could avoid it. I have been assuming that RDF is supposed to be a 
general-purpose descriptive framework rather than a kind of 
implementation language, but this is based more on general 
impressions than on any firm statement of purpose.
(Example of where the difference matters is containers. Does this:
>_:genid <rdf:type> <rdf:Bag>.
>_:genid <rdf:_1>   "1" .
>_:genid <rdf:_10>  "10" .
>_:genid <rdf:_2>  "11" .
DESCRIBE a bag, or should we think of it as (part of) an 
IMPLEMENTATION of a bag, ie a data structure encoded in triples? If 
it is a description (as I have been assuming) then it is natural to 
allow partial descriptions, and to regard this as referring to a bag 
with at least 10 members, 7 of which are unspecified. If it is an 
implementation, then it clearly describes a bag with three elements, 
using somewhat idiosyncratic labellings to distinguish them.  If we 
add the triple
_:genid <rdf:_6>   "6" .
then the description view would say that we have learned something 
new about the bag, while the implementation view would be that the 
bag now has 4 members.  Allowing item renumbering is arguably 
reasonable in the implementation view, clearly invalid in the 
description view. And so on. I think many issues would be a lot 
clearer if we were clear about this one.)
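
The two readings of the container example can be sketched in Python as follows. This is only an illustration under stated assumptions: triples are plain string tuples, and the helper names member_indices, description_view and implementation_view are hypothetical, not part of any RDF toolkit.

```python
import re
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # hypothetical (subject, predicate, object) tuple
MEMBER = re.compile(r"^<rdf:_(\d+)>$")  # matches membership properties like <rdf:_10>


def member_indices(triples: Iterable[Triple], bag: str) -> Set[int]:
    """Collect the distinct n for which the bag has an rdf:_n triple."""
    found = set()
    for subject, predicate, _obj in triples:
        match = MEMBER.match(predicate)
        if subject == bag and match:
            found.add(int(match.group(1)))
    return found


def description_view(triples: Iterable[Triple], bag: str) -> Tuple[int, int]:
    """Return (minimum number of members, how many of those are unspecified)."""
    indices = member_indices(triples, bag)
    least = max(indices, default=0)
    return least, least - len(indices)


def implementation_view(triples: Iterable[Triple], bag: str) -> int:
    """Return the number of members, counting only the slots actually present."""
    return len(member_indices(triples, bag))


triples = {
    ("_:genid", "<rdf:type>", "<rdf:Bag>"),
    ("_:genid", "<rdf:_1>", '"1"'),
    ("_:genid", "<rdf:_10>", '"10"'),
    ("_:genid", "<rdf:_2>", '"11"'),
}
extended = triples | {("_:genid", "<rdf:_6>", '"6"')}
```

On the original four triples the description view gives (10, 7) and the implementation view gives 3. After adding the rdf:_6 triple, the description view gives (10, 6), so the bag is the same size and we simply know more about it, while the implementation view gives 4.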

3. There seem to be quite a lot of undocumented 'RDF assumptions' 
that might be said to be guiding principles. I would like to see as 
many of these as possible made clear and explicit. Here are a few I 
think I have met, expressed in my own words (and therefore maybe not 
right, so I welcome correction):
a. Any collection of well-formed triples is well-formed; there is no 
such thing as a syntactically illegal set of well-formed triples.
b. Adding a triple to a set of triples cannot change the meaning of 
the triples in the set.
c. One never knows when one has 'all' of a set of triples, so any 
convention which can be broken by adding a new triple is illegal (? 
or deprecated?).
d. There is no such thing as a single context set of a triple; in 
other words, putting a triple into a different set cannot change its 
meaning.

What all these add up to, by the way, is a set of practices which 
effectively make it impossible to use RDF triple graphs to encode 
pointer structures in the LISP tradition.
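
Assumptions (b) and (c) can be illustrated with a small Python sketch. It rests on my own simplifications: a graph is a frozenset of string tuples, and both holds and has_exactly_n_members are hypothetical helpers invented for the illustration. The first check survives the addition of a triple; the second is the kind of convention that a new triple can break.

```python
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]  # hypothetical (subject, predicate, object) tuple
Graph = FrozenSet[Triple]


def holds(graph: Graph, triple: Triple) -> bool:
    """Open reading: a triple holds if present; absence says nothing."""
    return triple in graph


def has_exactly_n_members(graph: Graph, bag: str, n: int) -> bool:
    """Closed convention (hypothetical): treats the triples at hand as complete."""
    members = [t for t in graph if t[0] == bag and t[1].startswith("<rdf:_")]
    return len(members) == n


before: Graph = frozenset({
    ("_:genid", "<rdf:_1>", '"1"'),
    ("_:genid", "<rdf:_2>", '"11"'),
})
after: Graph = before | {("_:genid", "<rdf:_6>", '"6"')}

# Everything that held before still holds after a triple is added (assumption b).
preserved = all(holds(after, t) for t in before)

# The closed convention changes its answer once a triple is added,
# which is the kind of convention assumption (c) would rule out.
closed_before = has_exactly_n_members(before, "_:genid", 2)
closed_after = has_exactly_n_members(after, "_:genid", 2)
```

A LISP-style pointer structure depends on exactly this sort of closed convention (a cell has one car and one cdr, and no more), which is why it sits badly with the assumptions listed above.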

>A part of this will be deciding about the role of "reification", or 
>maybe some replacement idea.  I happen to think this is a key area, 
>because the issues of provenance/trust/logic depend on being able to 
>use a statement without asserting it.  (I am skeptical that this can 
>be retro-fitted to a core RDF consisting of just asserted triples.)

This is probably going beyond the scope of this working group, but I 
would like to reiterate an opinion I have already voiced on 
RDF-logic. The previous paragraph seems to accept a view that seems 
to be built into RDF 1.0, but shouldn't be, since it is profoundly 
and totally mistaken. This is the view that an expression must be 
either asserted or described (reified), so that anything that is not 
asserted must be reified. This view is so utterly wrong that I find 
it hard to understand how it ever got built into RDF. Every logic 
ever invented, from Aristotelian logic forwards, has used 
propositional expressions in ways other than asserting them, until 
RDF.  RDF, uniquely in the history of human language design, provides 
no way to use an expression other than to assert it. In effect, it 
has no propositional syntax. Reification does not provide a way to 
add one.

To see why this assumption is wrong, consider any simple 
propositional assertion which is more complicated than an atomic 
fact, eg say the negation of an atomic fact. If someone asserts (not 
P), they obviously do not assert P. However, they also do not 
*mention* P or *describe* P; they use the expression P in *exactly 
the same way in the same sense* as someone who asserts P simpliciter. 
Both asserting and denying are about the truth-value of the 
expression.

There are two different contrasts one needs to keep distinct: between 
using and mentioning an expression, and between an assertion and 
other uses of an expression. Denying P, or claiming that P implies 
Q, or that one of P, Q or R is true, all *use* 'P' in the same way 
that asserting P itself uses the expression. In modern boolean logic, 
they are all done by asserting some propositional expressions 
(respectively (not P), (implies P Q) and (or P Q R).) None of these 
involve mentioning or reifying the (sub)expressions which are being 
used.
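
A minimal sketch of this point in Python, assuming a toy propositional language of my own devising (the class names Atom, Not, Implies and Or and the function true_in are hypothetical). In each compound expression the atom P appears as a subexpression that is evaluated against the world, exactly as it is when P is asserted alone. Nothing is quoted, described or reified.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Not:
    body: "Expr"


@dataclass(frozen=True)
class Implies:
    antecedent: "Expr"
    consequent: "Expr"


@dataclass(frozen=True)
class Or:
    disjuncts: Tuple["Expr", ...]


Expr = Union[Atom, Not, Implies, Or]
World = Dict[str, bool]  # hypothetical 'world': a truth value for each atom


def true_in(expr: Expr, world: World) -> bool:
    """Evaluate an expression against a world; subexpressions are used, not mentioned."""
    if isinstance(expr, Atom):
        return world[expr.name]
    if isinstance(expr, Not):
        return not true_in(expr.body, world)
    if isinstance(expr, Implies):
        return (not true_in(expr.antecedent, world)) or true_in(expr.consequent, world)
    if isinstance(expr, Or):
        return any(true_in(d, world) for d in expr.disjuncts)
    raise TypeError(f"not an expression: {expr!r}")


P, Q, R = Atom("P"), Atom("Q"), Atom("R")
world = {"P": False, "Q": True, "R": False}

denial_holds = true_in(Not(P), world)             # asserting (not P)
implication_holds = true_in(Implies(P, Q), world)  # asserting (implies P Q)
disjunction_holds = true_in(Or((P, Q, R)), world)  # asserting (or P Q R)
```

In this world all three compound assertions come out true while P itself is false, so asserting any of them plainly does not assert P.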

All the model theories that have been suggested for RDF either ignore 
reification or treat it, correctly, as indicating that a statement 
exists. But if I assert (not P), ie if I deny P, I am not saying that 
P exists, still less that P doesn't exist: I am saying the world is 
such as to make P false, just as when I assert P I am saying the world 
is such as to make it true. In both cases I USE the expression to say 
something about the world, not MENTION it to say something about it.

I am not arguing that reification should be abandoned or forbidden 
(though I would shed no tears if it were). It may have some utility. 
But it should not, and indeed I think cannot coherently, be used as a 
tool to encode non-assertional uses of subexpressions. In particular, 
it cannot be used to encode negation, disjunction, quantification or 
any of the rest of 'normal' logic, nor to encode modalities (such as 
indirect speech, as in transcribing "Ora said that Bill was Danish"; 
contrast direct quotation, as in "Ora said: 'Bill was Danish' " 
where Ora's actual words are cited, rather than the propositional 
content of what Ora said. The latter does indeed involve reification 
of a statement, indicated in English as in many technical languages 
by the use of quotation.) In brief: reification is a technical device 
of very limited utility; in particular, it cannot be used to 
'retrofit' most aspects of a richer assertional language into RDF.

>I think just about everything else can be retro-fitted without 
>fundamentally changing the nature of the core.  (But this is just an 
>opinion, not a certainty.)

I'm not sure what counts as 'everything else', but I fear that I am 
likely to disagree. :-)

Pat Hayes

---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes

Received on Monday, 25 June 2001 10:43:07 UTC