Re: RDF Semantics, non-lean RDF graphs, and redundancy of content

> ...
> Hope the above helps.
>
> Pat Hayes

Thank you for the clarification.

It thus seems fair to say that the notion of redundancy is used only in
the sense of formal entailment, which is of course the intent of the
RDF Semantics.

However, considering the modelling perspective (in the RDF sense, not in
the sense of an interpretation theory), asserting

G1: {
ex:pat rdf:type ex:human .
_:x rdf:type ex:human .
}

might seem intuitively stronger than asserting

G2: {
ex:pat rdf:type ex:human .
}

even if G2 formally allows inferring G1. This is not a matter of entailment
but of "truth" in a potential world which models G1. Of course, RDF Semantics
is a vehicle for analysing valid inference, but still.
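
To make the entailment claim concrete, here is a minimal sketch (plain
Python, no RDF library; the strings are just stand-ins for the URIs above)
of the brute-force simple-entailment check: G simply entails E iff the
blank nodes of E can be mapped to terms of G so that every resulting triple
is already in G.

from itertools import product

EX_PAT, EX_HUMAN, RDF_TYPE = "ex:pat", "ex:human", "rdf:type"

G1 = {(EX_PAT, RDF_TYPE, EX_HUMAN), ("_:x", RDF_TYPE, EX_HUMAN)}
G2 = {(EX_PAT, RDF_TYPE, EX_HUMAN)}

def is_bnode(term):
    # Blank nodes are written with the usual "_:" prefix.
    return term.startswith("_:")

def terms(graph):
    return {t for triple in graph for t in triple}

def simply_entails(g, e):
    # Try every way of mapping e's blank nodes onto terms of g and
    # accept as soon as the instantiated e is a subset of g.
    bnodes = sorted({t for triple in e for t in triple if is_bnode(t)})
    for image in product(sorted(terms(g)), repeat=len(bnodes)):
        mapping = dict(zip(bnodes, image))
        instance = {tuple(mapping.get(t, t) for t in triple) for triple in e}
        if instance <= g:
            return True
    return False

print(simply_entails(G2, G1))   # True: _:x can be mapped to ex:pat
print(simply_entails(G1, G2))   # True as well; G1 and G2 are equivalent

Both checks print True, which is exactly the redundancy discussed above:
G1 asserts nothing that G2 does not already entail.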

The reason is that asserting facts might be interpreted from an
"economical perspective" (as few assertions as possible). An agent
asserting G1 might assert redundant information due to a simple bookkeeping
mistake, or (in the absence of negation) it might be trying to say that
"Pat is a human, but there also exists something else (which I can't or am
unwilling to identify) that is human."

In FOL (with identity), this could be expressed as (excuse the syntax):

F1 = { type(pat,human), exists X: ( type(X,human) and not(X=pat))  }

Obviously, F1 cannot be formulated in RDF (without ontologies); all you
can express is F2 (i.e. G1):

F2 = { type(pat,human), exists X: type(X,human) }
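
To see why F1 is strictly stronger, one can check both formulas against
all interpretations of "human" over a tiny two-element domain. The
following is a hypothetical sketch in the same plain-Python style (the
domain, the denotation of pat, and the function names are all made up for
illustration):

from itertools import chain, combinations

DOMAIN = {0, 1}
PAT = 0  # the denotation of "pat"

def subsets(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def holds_F2(human):
    # type(pat, human) and exists X: type(X, human)
    return PAT in human and any(x in human for x in DOMAIN)

def holds_F1(human):
    # type(pat, human) and exists X: (type(X, human) and not(X = pat))
    return PAT in human and any(x in human and x != PAT for x in DOMAIN)

for human in map(set, subsets(DOMAIN)):
    print(sorted(human), "F2:", holds_F2(human), "F1:", holds_F1(human))

The world in which pat is the only human satisfies F2 but not F1, so F1
really does exclude models that F2 admits, which is the distinction RDF
alone cannot express.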

The "closest one can get" in RDF (without ontologies) is noticing that,
assuming M is a model of G2, one can (for some M) invent a "bad" mapping A
for the blank node _:x in G1 so that G1 is not true.

Obviously, from the entailment point of view this is not a big deal,
since there also exists a mapping (leading to the lean graph in our case)
which is sufficient for the entailment.
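
As a sketch of this point (again plain Python, with made-up resource
names): fix an interpretation of G2 in which the only human is the
resource denoted by ex:pat, and try the possible mappings A for _:x. One
mapping is "bad", but G1 is still true in the interpretation because
another mapping succeeds.

PAT_RES, OTHER_RES = "Pat-the-resource", "some-other-resource"
HUMAN_EXT = {PAT_RES}          # extension of ex:human in this interpretation
DENOTES = {"ex:pat": PAT_RES}  # denotations of the URIs we care about

def subject_is_human(s, mapping):
    # Both triples of G1 have the form "<s> rdf:type ex:human", so truth
    # only depends on where the subject ends up.
    resource = mapping.get(s, DENOTES.get(s))
    return resource in HUMAN_EXT

for candidate in (PAT_RES, OTHER_RES):
    A = {"_:x": candidate}
    ok = subject_is_human("ex:pat", A) and subject_is_human("_:x", A)
    print("A(_:x) =", candidate, "->", "G1 true" if ok else "G1 false")

The first mapping makes G1 true, the second is the "bad" mapping mentioned
above; G1 holds in the interpretation because some mapping works.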

On the other hand, assuming for a while that an agent would like to
analyse the "truth" (possible models/worlds) of a given RDF graph (e.g. to
match a procedural schema, script, or whatever, to initiate a behaviour),
it's tempting to say (from the "economical perspective") that G1 encodes
more information than G2 (to base the behaviour on).

The point is that, in practice, one cannot assume that an agent with
limited resources would be able to perform all possible entailments before
such a matching (consider a more complex example). In other words, G1
encodes more information indirectly, not because it has more formal
entailment potential in it, but because it selects a useful entailment
(obviously, the syntax doesn't capture this intention): listing theorems
in mathematics is useful even if, in principle, they are implicitly present
in the definitions (well, at least in the ideal case). And it is
particularly useful in Prolog or rule-based systems, to speed up
inference.
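
As a toy illustration (same plain-Python style and made-up graphs as in
the first sketch): an agent that only does triple-pattern matching, with
no entailment step, finds two matches for "? rdf:type ex:human" in G1 but
only one in G2.

EX_PAT, EX_HUMAN, RDF_TYPE = "ex:pat", "ex:human", "rdf:type"

G1 = {(EX_PAT, RDF_TYPE, EX_HUMAN), ("_:x", RDF_TYPE, EX_HUMAN)}
G2 = {(EX_PAT, RDF_TYPE, EX_HUMAN)}

def match(graph, pattern):
    # None acts as a wildcard in the pattern, like a query variable.
    return [t for t in graph
            if all(p is None or p == v for p, v in zip(pattern, t))]

print(len(match(G1, (None, RDF_TYPE, EX_HUMAN))))   # 2
print(len(match(G2, (None, RDF_TYPE, EX_HUMAN))))   # 1

Whether the extra match should drive different behaviour is, of course,
exactly the modelling question raised above.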

From the logical perspective, this line of thinking is a bit unorthodox,
of course, but I do wonder whether people writing RDF assertions really are
logically oriented.

However, I do believe that, in order to prevent the SW from semantic
fragmentation, the W3C ought to publish an easy-to-read recommendation for
simple RDF modelling (the RDF Primer doesn't really touch this issue [out
of scope, of course]). This could include "safety levels" such as simple
assertions (OK to make even if not quite familiar with RDF Semantics),
assertions with blank nodes ("neutral structure slots" versus
"denotations"), assertions using the RDF vocabulary (e.g. the
interpretation of bags), assertions of terminology (feasibility
considerations), etc. A sort of "know-what-you-are-asserting" guide (see
my arms swinging? ;) I would be among the first people wanting to read it
(obviously, since I have all these questions...).

But, as said in the beginning, your answer is sufficient. The above line
of thinking is not a problem of the RDF Semantics.

Thanks again and best regards,

--Ossi
