Re: RDF Investigations

On Mon, Jun 24, 2013 at 9:32 AM, Pat Hayes <phayes@ihmc.us> wrote:

Hi, and thanks for the comments.  FYI I have some draft articles in
the can that will add clarity and detail, I hope.  In the meantime ...
>
> On Jun 23, 2013, at 11:49 AM, Gregg Reynolds wrote:
>
>> Hi folks,
>>
>> A couple of years ago I got the idea of finding alternatives to the
>> official definition of RDF, especially the semantics.  I've always
>> found the official docs less than crystal clear, and have always
>> harbored the suspicion that the model-theoretic definition of RDF
>> semantics offered in http://www.w3.org/TR/rdf-mt/ was unnecessary, or
>> at least unnecessarily complicated.  Needless to say that is my own
>> personal aesthetic judgment, but it did motivate my little project.
>>
>> I guess the past two years have not been completely wasted on me; what
>> was a somewhat vague intuition back then seems to have matured into a
>> pretty clear idea of how RDF ought to be conceptualized and formally
>> defined.  Clear to me, anyway; whether it is to others, and whether it
>> is correct or not is a whole 'nother matter.
>>
>> Since pursuing this idea will involve a lot of writing I won't pursue
>> it here; instead I've described the basic ideas in a blog post at
>> http://blog.mobileink.com/.
>
> Hmm. You say some things in there that seem to be just plain wrong.


>> 1. [The RDF semantics] "restricts interpretation to a single semantic domain."
>
> I am not sure how you can possibly read the semantics in this way, but the whole point of model theory is to permit many - usually, infinitely many - interpretations, over arbitrary domains. The only domain restriction in RDF (as in most model theories) is that the domain be non-empty and that it contain basic literal values such as character strings.

Point taken.  My statement was incorrect and needs to be changed; the
point I was trying to get at is that RDF-MT seems to privilege the
domain it defines - the set IR of Resources, etc.  The basic semantic
constraints are stated in terms of this domain, which implicitly
restricts semantic domains to those that have, for example, a set of
binary relations for the properties.  But this is not necessary; you
can define models that do not contain such relations.  An obvious
example is a set of objects N and the set of their triples NxNxN.
(I'll describe this in more detail in a later blog article).
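
To make the N x N x N example a little more concrete, here is a minimal
sketch in Python. It assumes one particular reading of the idea: a triple
counts as true when the triple of its denotations belongs to a chosen
subset of N x N x N, so no binary relation per property is ever needed.
The names N, FACTS, DENOTES and satisfies are hypothetical, invented for
illustration; none of them comes from RDF-MT.

```python
from itertools import product

# Hypothetical domain: any non-empty set of objects will do.
N = {"alice", "bob", "knows"}

# Every possible triple over the domain, i.e. N x N x N.
ALL_TRIPLES = set(product(N, repeat=3))

# The triples that "hold" in this model: a plain subset of N x N x N,
# not a family of binary relations indexed by property.
FACTS = {("alice", "knows", "bob")}

# Hypothetical denotation map from term names to elements of N.
DENOTES = {"ex:a": "alice", "ex:b": "bob", "ex:knows": "knows"}


def satisfies(triple, denotes=DENOTES, facts=FACTS):
    """A ground triple is true iff the triple of its denotations is in facts."""
    s, p, o = (denotes[term] for term in triple)
    return (s, p, o) in facts


def graph_satisfied(graph):
    """A graph (a set of triples) is true iff every triple in it is true."""
    return all(satisfies(t) for t in graph)
```

Note that the property "knows" is just another element of N here; it is
never given an extension of its own.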

>> 2. "The so-called abstract syntax described  in RDF Concepts and Abstract Syntax serves as the formula calculus, but it is incomplete.  It specifies that a triple (statement) "contains" three terms (nodes), and that an RDF graph is "a set of triples".  But these are not rules of a calculus; they do not tell us how to construct statements in a formal language."
>
> First, the whole point of defining an 'abstract' syntax is to allow for a variety of concrete (lexical) syntaxes, so if you prefer to work at a concrete level, just choose one of those, eg RDF/XML or N-Triples.

It just dawned on me that when people talk about the abstract syntax
of RDF in this manner what they often mean is "abstract description of
possible syntax (or set of syntaxes)".  Is that a fair description of
what you have in mind?  I can't see any other way to read it, since by
definition what is abstract cannot be written down, and if you cannot
write it down you may be able to think about it but you cannot use it
to communicate.  You can publish a document that describes a class of
syntaxes abstractly, but you cannot publish an abstract syntax.

I suppose one could describe an abstract syntax by referring only to
syntactic positions and symbol classes; e.g. for Lisp something like
"the first symbol must be an opening delimiter, the second a function
symbol, " and so forth.  But this would be useless for model theory,
which needs not only symbols but tokens.

Actually SGML did something like this; it's the only language I know
of that describes something approximating an "abstract syntax".  But
its "abstract syntax" is in fact concrete; it uses symbols like DELIM
(made that up, don't recall the exact expression) for concrete symbol
classes.  But that makes for a meta-syntax, not an abstract syntax.
There's nothing abstract about it; it's a concrete syntax that
describes a class of other concrete syntaxes.  One can think of it as
*expressing* a generalization or abstraction, but that's a lot
different than saying it *is* abstract.  A meta-syntax of this
character is what RDF lacks.

> But more to the point, the abstract graph syntax *is* a formal language with a perfectly well-defined syntax. It is not a character-string syntax, but it is a syntax, with exact syntactic rules. A very simple syntax, but that simplicity was a deliberate part of the design.

Can you point me to the rule that says how to write down a triple so
that I can specify an interpretation for it?

Here's an easy example off the top of my head of what I would count as
a meta-syntax for (part of) simple RDF:  define A, B, C, ... as
(meta-)constants, x,y,... as variables. Define as formula schemata
ABC, xBC, ABx, xBy, etc. (blank nodes).  (I'm ignoring literals for
simplicity's sake).  Define "," as logical and.  Treat xBC as
equivalent to "Exists(x).xBC", etc.  Very simple, and it allows one to
define inference schemata very concisely.  Note that you don't even
have to mention IRIs.

Then define an appropriate syntactic equivalence relation and you have
concrete syntaxes.
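
The meta-syntax above is small enough to sketch directly. The Python below
is only an illustration under stated assumptions: constants are single
upper-case letters, variables are single lower-case letters, the middle
position must be a constant, "," is conjunction, and variables are read
existentially. The function names (parse, follows) are hypothetical, and
the matching rule is one plausible inference schema, not a claim about the
official RDF entailment rules.

```python
def is_variable(term):
    """Lower-case letters are variables (blank nodes); upper-case are constants."""
    return term.islower()


def parse(text):
    """Split a conjunction such as 'ABC, xBy' into triples of one-letter terms."""
    triples = []
    for part in text.split(","):
        part = part.strip()
        if len(part) != 3 or not part.isalpha():
            raise ValueError(f"not a formula: {part!r}")
        if is_variable(part[1]):
            raise ValueError(f"middle term must be a constant: {part!r}")
        triples.append(tuple(part))
    return triples


def _match(goal_term, fact_term, binding):
    """Match one term, recording or checking the binding of a variable."""
    if is_variable(goal_term):
        return binding.setdefault(goal_term, fact_term) == fact_term
    return goal_term == fact_term


def follows(premise, conclusion):
    """True if the conclusion's variables can be bound consistently so that
    every triple of the conclusion becomes a triple of the premise."""
    facts = set(parse(premise))
    goals = parse(conclusion)

    def search(i, binding):
        if i == len(goals):
            return True
        for fact in facts:
            trial = dict(binding)  # fresh copy so failed matches leave no trace
            if all(_match(g, f, trial) for g, f in zip(goals[i], fact)):
                if search(i + 1, trial):
                    return True
        return False

    return search(0, {})
```

For example, follows("ABC", "xBC") holds (from ABC infer Exists(x).xBC),
while follows("xBC", "ABC") does not.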

>
> 3. "... semantic entailment (not to be confused with logical entailment)..."
>
> Can you elucidate what you see as this distinction that is not to be confused? The textbook account of a formal logic distinguishes entailment, a purely semantic notion, often symbolized by the sign |=, from deducibility (via formal inference rules and axioms, typically), often symbolized by |-, and completeness is the property of these two coinciding. I do not know of any notion of logical *entailment* other than the semantic |= notion. Deducibility is not entailment.

The idea is just to distinguish between P |= Q under one
interpretation, and P but not Q under another.  I.e. I1 is a model for
both P and Q, but I2 is a model for P but not Q.  Then Q is not a
logical consequence of P, but it does follow from P under I1.  Call it
"I1-consequence".  I called it semantic consequence because it's there
in the interpretation but falls short of logical consequence.  Maybe
there's a better term for it?
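
The distinction can be shown with a toy case. The sketch below assumes,
purely for illustration, that P and Q are propositional atoms and that an
interpretation is just a truth assignment; that is a drastic
simplification of RDF interpretations. The names i_consequence and
logical_consequence are hypothetical.

```python
from itertools import product

ATOMS = ("P", "Q")


def all_interpretations(atoms=ATOMS):
    """Yield every truth assignment over the given atoms."""
    for values in product([True, False], repeat=len(atoms)):
        yield dict(zip(atoms, values))


def is_model(interp, atom):
    """An interpretation is a model of an atom iff it makes the atom true."""
    return interp[atom]


def i_consequence(interp, premise, conclusion):
    """'Follows under interp': interp does not make premise true and conclusion false."""
    return (not is_model(interp, premise)) or is_model(interp, conclusion)


def logical_consequence(premise, conclusion):
    """Holds only if the conclusion follows under every interpretation."""
    return all(i_consequence(i, premise, conclusion) for i in all_interpretations())


I1 = {"P": True, "Q": True}   # a model of both P and Q
I2 = {"P": True, "Q": False}  # a model of P but not of Q
```

Here i_consequence(I1, "P", "Q") is true, i_consequence(I2, "P", "Q") is
false, and so logical_consequence("P", "Q") is false: Q follows from P
under I1 without being a logical consequence of P.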

But I think your first point also applies here so I need to clarify.
The point would be to clarify whether entailments depend on selection
of the semantic domain defined in RDF-Semantics, i.e. whether they are
interpretation-specific.

>
> 4. "The business of model theory is to build a bridge between formal calculi and (informal) semantic domains.  You don't need a formal representation of the semantic domain..."
>
> Model theory *is* the result of formalizing the semantic domain. That was the new idea in Tarski's original publication which founded the subject in the 1940s. His title, you might recall, was "A theory of truth for formalized languages".

Sorry, wasn't around in 1940 so I don't recall.  But I have moved my
eyes across it; it's about formalized *languages*.  Model theory does
not formalize any semantic domain - how could it, when it is domain
agnostic by design?  It's about how you relate the one (formal
language) to the other (informal domains).  It's that relation that MT
addresses and formalizes.  Actually "mathematicalizes" is more
accurate than "formalizes".  That is not the same as providing a
formalization of the domain.

(And Tarski's new idea was satisfaction, if I'm not mistaken.)

>
> 5.  "...model theory, ... makes automated proof a legitimate idea."
>
> Proof theory makes automated proof a legitimate idea. Model theory establishes completeness of the formal proof methods.

We can quibble.  I guess "legitimate idea" is perhaps not the best
choice of words here.  ;)
>
> and I guess I won't bother to go on with this list.
>
...
> But let me ask the larger question: what exactly is the point of this enterprise? Since the only point of inventing RDF in the first place was to provide for a basic degree of interoperability at a semantic level, what purpose could there be in ignoring this aspect of RDF?

Well, I guess one point might be to clarify what "interoperability at
a semantic level" means; I'm not sure.  It may be a historical fact
that the RDF definitions were at least in part motivated by a perceived
need for semantics in some sense, but personally I've never been
entirely convinced that a properly defined derivational calculus would
not do as well, dispensing with the need for any sort of concrete
semantic definition.  After all, most languages get along just fine
without one.  There's more to say on this, obviously, but I'll save it
for a blog post.

I also think there is some reason to think one might arrive at
improved clarity of concept and exposition, especially for e.g.
developers who don't know a whole lot about logic and don't really
want to dig into the textbooks.

> Considered as a pure, uninterpreted formal calculus, RDF is hardly there at all, it is so minimal. As you point out, it does not come with any proof rules or indeed even with any notion of proof already defined for it, and if you don't think the graph syntax is adequate, then it doesn't even come with a syntax. So it is hardly there at all: no wonder you could, if you were so inclined, make it into just about anything at all, if you ignore the normative semantics. If you want to have fun with formalisms, why not choose something with a bit more bite to it, such as an uninterpreted lambda-calculus, say? Or Javascript?

Hey, it's my fun, I get to choose!  Seriously, that's why I'm going
with blog posts instead of long missives about my personal betes
noires on W3C lists.
>
>>  The allusion to Wittgenstein, that great
>> philosophical therapist, is entirely intentional.  You (or at least I)
>> find out a lot of things when you analyze a concept very closely; if
>> my analysis is not mistaken, there are some fundamental problems in
>> the land of RDF.  For example, it is possible to show, among other
>> things, that the concept of a graph is not essential to RDF; nor is
>> the treatment of the Property node of a triple as an arrow or relation
>> necessary; nor is the concrete semantics defined in the RDF Semantics
>> document the only or even the best "theory" of RDF.
>
> If you can give up on all this, what do you take yourself to be referring to when you say "RDF" ? You have just dismissed virtually every defining characteristic of RDF as either wrong or inessential. So what is left?

Structure? Inference?  A reconceptualization of RDF as a species of
something more general?  Or a clear exposition of how thinking of it
as a kind of language or logic works just as well as thinking of it as
a data model?  I guess I'll find out what's left when I get there.
For the record, I have not dismissed anything, I've only said that
various things that are often claimed to be essential are in fact
optional.  That alone seems like a non-trivial conclusion (if it's
correct).  And it's not the same as claiming they are meaningless or
wrong.

I very much appreciate your taking the time to correspond, and I hope
you don't take anything I've written as an attack on the official
definition of RDF.  Critique, yes, but a critique is a good thing if
(big if) it is well-reasoned.

Anyway that's only part of the project; another part is to post some
articles clearly explaining the various definitions etc. of model theory
and the bits of logic needed to understand RDF-talk.  With tools like
http://www.mathjax.org/ it is now possible to put pretty darned
good-looking math and symbolic logic on the web.  The idea is to save
others some of the time and effort it took me to track down and read
good tech literature.  Sort of an "I read the technical stuff so you
don't have to" thing.

Cheers,

Gregg

Received on Monday, 24 June 2013 19:08:15 UTC