Re: RDF Investigations from Pat Hayes on 2013-06-25 (public-lod@w3.org from June 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 24 Jun 2013 23:02:24 -0500
To: Gregg Reynolds <dev@mobileink.com>
Cc: public-lod@w3.org
Message-Id: <B76B950C-313B-41AF-99F7-4F22A35E0321@ihmc.us>

On Jun 24, 2013, at 2:07 PM, Gregg Reynolds wrote:

> On Mon, Jun 24, 2013 at 9:32 AM, Pat Hayes <phayes@ihmc.us> wrote:
> 
> Hi, and thanks for the comments.  FYI I have some draft articles in
> the can that will add clarity and detail, I hope.  In the meantime ...
>> 
>> On Jun 23, 2013, at 11:49 AM, Gregg Reynolds wrote:
>> 
>>> Hi folks,
>>> 
>>> A couple of years ago I got the idea of finding alternatives to the
>>> official definition of RDF, especially the semantics.  I've always
>>> found the official docs less than crystal clear, and have always
>>> harbored the suspicion that the model-theoretic definition of RDF
>>> semantics offered in http://www.w3.org/TR/rdf-mt/ was unnecessary, or
>>> at least unnecessarily complicated.  Needless to say that is my own
>>> personal aesthetic judgment, but it did motivate my little project.
>>> 
>>> I guess the past two years have not been completely wasted on me; what
>>> was a somewhat vague intuition back then seems to have matured into a
>>> pretty clear idea of how RDF ought to be conceptualized and formally
>>> defined.  Clear to me, anyway; whether it is to others, and whether it
>>> is correct or not is a whole 'nother matter.
>>> 
>>> Since pursuing this idea will involve a lot of writing I won't pursue
>>> it here; instead I've described the basic ideas in a blog post at
>>> http://blog.mobileink.com/.
>> 
>> Hmm. You say some things in there that seem to be just plain wrong.
> 
> 
>>> 1. [The RDF semantics] "restricts interpretation to a single semantic domain."
>> 
>> I am not sure how you can possibly read the semantics in this way, but the whole point of model theory is to permit many - usually, infinitely many - interpretations, over arbitrary domains. The only domain restriction in RDF (as in most model theories) is that the domain be non-empty and that it contain basic literal values such as character strings.
> 
> Point taken.  My statement was incorrect and needs to be changed; the
> point I was trying to get at is that RDF-MT seems to privilege the
> domain it defines - the set IR of Resources, etc.

Well, it's a formal, artificial language, and it comes with a semantics as part of its definition. Just like many other logics in many logic textbooks, many programming languages, etc. So yes, I guess it does "privilege" that semantics, since that semantics is part of it (by definition).

>  The basic semantic
> constraints are stated in terms of this domain, which implicitly
> restricts semantic domains to those that have, for example, a set of
> binary relations for the properties.  But this is not necessary; you
> can define models that do not contain such relations.

You *can* (re)define RDF graphs to be a musical notation, or a way of drawing simple cartoons. So what? 

>  An obvious
> example is a set of objects N and the set of their triples NxNxN.
> (I'll describe this in more detail in a later blog article).

Have you checked out the mapping from RDF to FOL mentioned in passing in the 2004 Semantics document? It maps an RDF triple S P O to the atomic sentence triple(S, P, O). You might find it congenial. 

> 
>>> 2. "The so-called abstract syntax described  in RDF Concepts and Abstract Syntax serves as the formula calculus, but it is incomplete.  It specifies that a triple (statement) "contains" three terms (nodes), and that an RDF graph is "a set of triples".  But these are not rules of a calculus; they do not tell us how to construct statements in a formal language."
>> 
>> First, the whole point of defining an 'abstract' syntax is to allow for a variety of concrete (lexical) syntaxes, so if you prefer to work at a concrete level, just choose one of those, e.g. RDF/XML or N-Triples.
> 
> It just dawned on me that when people talk about the abstract syntax
> of RDF in this manner what they often mean is "abstract description of
> possible syntax (or set of syntaxes)".  Is that a fair description of
> what you have in mind?

That is one way to read it, but what I had in mind in using the term "abstract syntax" was the way it is used by John McCarthy (who coined the term originally), as syntax re-described as an algebra on terms and expressions. RDF uses graphs since its syntax is so extremely simple that it does not actually require any algebraic structure, but the basic idea is the same. 

>  I can't see any other way to read it, since by
> definition what is abstract cannot be written down, and if you cannot
> write it down you may be able to think about it but you cannot use it
> to communicate.

It is a structure (the graph) which can be described and its properties given precisely, and it can be directly represented in computer memory as a data structure. That is enough to make it a syntax as far as I am concerned. What it is not, is a grammar defined on character strings. Concrete RDF syntaxes like RDF/XML and N-Triples can be described this way, of course (though for XML, better ways are available).

>  You can publish a document that describes a class of
> syntaxes abstractly, but you cannot publish an abstract syntax.

Sure you can. We did. 

> 
> I suppose one could describe an abstract syntax by referring only to
> syntactic positions and symbol classes; e.g. for Lisp something like
> "the first symbol must be an opening delimiter, the second a function
> symbol, " and so forth.  But this would be useless for model theory,
> which needs not only symbols but tokens.

Before going any further with this line of thinking and argument, read at least 

http://www-formal.stanford.edu/jmc/towards/node12.html

and preferably the book

http://books.google.com/books?id=YzLtfOjeJdsC&dq=John+McCarthy+abstract+syntax&source=gbs_navlinks_s

> 
> Actually SGML did something like this; it's the only language I know
> of that describes something approximating an "abstract syntax".  But
> its "abstract syntax" is in fact concrete; it uses symbols like DELIM
> (made that up, don't recall the exact expression) for concrete symbol
> classes.  But that makes for a meta-syntax, not an abstract syntax.
> There's nothing abstract about it; it's a concrete syntax that
> describes a class of other concrete syntaxes.  One can think of it as
> *expressing* a generalization or abstraction, but that's a lot
> different than saying it *is* abstract.  A meta-syntax of this
> character is what RDF lacks.
> 
>> But more to the point, the abstract graph syntax *is* a formal language with a perfectly well-defined syntax. It is not a character-string syntax, but it is a syntax, with exact syntactic rules. A very simple syntax, but that simplicity was a deliberate part of the design.
> 
> Can you point me to the rule that says how to write down a triple so
> that I can specify an interpretation for it?

What do you mean by "write down" a triple? The graph syntax does not refer to strings: if you want character strings, use something like N-Triples. In the graph syntax, a triple is, literally, a triple: three things in a fixed order. The semantics is defined on that as a syntactic construction. 
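
A minimal Python sketch of the graph syntax as a data structure, assuming IRIs are represented as plain strings. The names `Triple`, `ex:alice`, `ex:knows` and so on are illustrative only and are not taken from any RDF specification or library.

```python
from typing import FrozenSet, NamedTuple


class Triple(NamedTuple):
    """Three things in a fixed order: subject, predicate, object."""
    s: str
    p: str
    o: str


# An RDF graph is a set of triples: no ordering, no duplicates.
g1: FrozenSet[Triple] = frozenset({
    Triple("ex:alice", "ex:knows", "ex:bob"),
    Triple("ex:bob", "ex:knows", "ex:carol"),
})

g2: FrozenSet[Triple] = frozenset([
    Triple("ex:bob", "ex:knows", "ex:carol"),
    Triple("ex:alice", "ex:knows", "ex:bob"),
    Triple("ex:alice", "ex:knows", "ex:bob"),  # duplicate collapses in a set
])

# Listing order and repetition are irrelevant at the level of the graph.
same_graph = (g1 == g2)

# Within a triple, position is significant.
order_matters = Triple("a", "b", "c") != Triple("c", "b", "a")
```

Nothing here involves character strings as syntax; a concrete format such as N-Triples would be one way of serializing `g1`.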

> Here's an easy example off the top of my head of what I would count as
> a meta-syntax for (part of) simple RDF:  define A, B, C, ... as
> (meta-)constants, x,y,... as variables. Define as formula schemata
> ABC, xBC, ABx, xBy, etc. (blank nodes).  (I'm ignoring literals for
> simplicity's sake).  Define "," as logical and.  Treat xBC as
> equivalent to "Exists(x).xBC", etc.  Very simple,

But not actually correct. (You need to handle blank node scopes better if you are going to use bnode identifiers as variables.)

But how does this differ in anything other than what one might call style, from how RDF graphs are defined now? An RDF graph is a set of triples, and it "means" the conjunction of them. (Blank nodes are actually like *free* variables, understood as implicitly existential.)  Isn't that what you just said, above? 
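
A rough sketch of that reading, reusing the `triple(S, P, O)` form of the RDF-to-FOL mapping mentioned earlier in this message. The function name `graph_to_fol`, the `_:` prefix convention for blank nodes, and the `exists`/`&` output notation are assumptions made for illustration. The existential quantifiers are placed around the whole conjunction, so a blank node shared by several triples is one variable with graph-wide scope.

```python
def graph_to_fol(graph):
    """Render a set of (s, p, o) string triples as a first-order sentence.

    Assumption: terms starting with "_:" are blank nodes; everything else
    is treated as a constant.
    """
    def term(t):
        # Strip the "_:" prefix so a blank node reads as a variable name.
        return t[2:] if t.startswith("_:") else t

    # Collect every blank node once: its scope is the whole graph.
    bnodes = sorted({t[2:] for tr in graph for t in tr if t.startswith("_:")})

    # The graph "means" the conjunction of its triples.
    atoms = sorted(f"triple({term(s)}, {term(p)}, {term(o)})"
                   for s, p, o in graph)
    body = " & ".join(atoms)

    prefix = "".join(f"exists {b}. " for b in bnodes)
    return f"{prefix}({body})" if bnodes else body


g = {("_:x", "B", "C"), ("A", "B", "_:y")}
sentence = graph_to_fol(g)
# sentence == "exists x. exists y. (triple(A, B, y) & triple(x, B, C))"
```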

> and it allows one to
> define inference schemata very concisely.  Note that you don't even
> have to mention IRIs.

Well, you do if you are going to define RDF syntax, as opposed to some other language using triples. 

> 
> Then define an appropriate syntactic equivalence relation and you have
> concrete syntaxes.
> 
>> 
>> 3. "... semantic entailment (not to be confused with logical entailment)..."
>> 
>> Can you elucidate what you see as this distinction that is not to be confused? The textbook account of a formal logic distinguishes entailment, a purely semantic notion, often symbolized by the sign |=, from deducibility (via formal inference rules and axioms, typically), often symbolized by |-, and completeness is the property of these two coinciding. I do not know of any notion of logical *entailment* other than the semantic |= notion. Deducibility is not entailment.
> 
> The idea is just to distinguish between P |= Q under one
> interpretation

I don't want to seem harsh, but it appears that you simply do not understand how model theory works. What P |= Q means is: for **every** interpretation I, if I(P)=true then I(Q)=true. So to qualify it with "under one interpretation" does not make sense. 

> , and P but not Q under another.  I.e. I1 is a model for
> both P and Q, but I2 is a model for P but not Q.  Then Q is not a
> logical consequence of P

indeed, as I2 shows

> , but it does follow from P under I1

That is meaningless. All that the existence of I1 shows is that P and Q are mutually consistent.

> .  Call it
> "I1-consequence".  I called it semantic consequence because it's there
> in the interpretation but falls short of logical consequence.  Maybe
> there's a better term for it?

There is no term for it because it does not make sense. 
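
The definition of |= can be illustrated with a toy propositional sketch, under the assumption that an interpretation is just a truth assignment to two atoms `p` and `q`. The names `entails`, `I1` and `I2` are illustrative. RDF interpretations cannot be enumerated like this, so the sketch shows only the shape of the definition, not a decision procedure for RDF.

```python
from itertools import product


def entails(premise, conclusion, atoms):
    """True iff every interpretation satisfying premise satisfies conclusion."""
    for values in product([True, False], repeat=len(atoms)):
        interp = dict(zip(atoms, values))
        if premise(interp) and not conclusion(interp):
            return False  # a counter-model: premise true, conclusion false
    return True


# Sentences are modelled as functions from an interpretation to a truth value.
P = lambda i: i["p"]
Q = lambda i: i["q"]
P_and_Q = lambda i: i["p"] and i["q"]

I1 = {"p": True, "q": True}    # a model of both P and Q
I2 = {"p": True, "q": False}   # a model of P but not of Q

# Entailment quantifies over all interpretations, so I2 alone refutes it.
p_entails_q = entails(P, Q, ["p", "q"])

# By contrast, the conjunction does entail Q.
pq_entails_q = entails(P_and_Q, Q, ["p", "q"])

# All that I1 establishes is that P and Q can be true together.
consistent = P(I1) and Q(I1)
```

Here `p_entails_q` is `False` and `pq_entails_q` is `True`; there is no place in `entails` for an argument naming one particular interpretation.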

> 
> But I think your first point also applies here so I need to clarify.
> The point would be to clarify whether entailments depend on selection
> of the semantic domain defined in RDF-Semantics, i.e. whether they are
> interpretation-specific.
> 
>> 
>> 4. "The business of model theory is to build a bridge between formal calculi and (informal) semantic domains.  You don't need a formal representation of the semantic domain..."
>> 
>> Model theory *is* the result of formalizing the semantic domain. That was the new idea in Tarski's original publication which founded the subject in the 1940s. His title, you might recall, was "A theory of truth for formalized languages".
> 
> Sorry, wasn't around in 1940 so I don't recall.  But I have moved my
> eyes across it; it's about formalized *languages*.

Often called formal logics, like RDF.

>  Model theory does
> not formalize any semantic domain - how could it, when it is domain
> agnostic by design?

It formalizes the *idea* of semantic domains. What we now call interpretations. An interpretation is a mathematical account of a semantic domain and how it affects truth of sentences. 

>  It's about how you relate the one (formal
> language) to the other (informal domains).

What do you mean by an "informal domain"? And where does model-theoretic semantics refer to these entities, whatever they are?

>  It's that relation that MT
> addresses and formalizes.  Actually "mathematicalizes" is more
> accurate than "formalizes".  That is not the same as providing a
> formalization of the domain.
> 
> (And Tarski's new idea was satisfaction, if I'm not mistaken.)
> 
>> 
>> 5.  "...model theory, ... makes automated proof a legitimate idea."
>> 
>> Proof theory makes automated proof a legitimate idea. Model theory establishes completeness of the formal proof methods.
> 
> We can quibble.  I guess "legitimate idea" is perhaps not the best
> choice of words here.  ;)
>> 
>> and I guess I won't bother to go on with this list.
>> 
> ...
>> But let me ask the larger question: what exactly is the point of this enterprise? Since the only point of inventing RDF in the first place was to provide for a basic degree of interoperability at a semantic level, what purpose could there be in ignoring this aspect of RDF?
> 
> Well, I guess one point might be to clarify what "interoperability at
> a semantic level" means; I'm not sure.  It may be a historical fact
> the the RDF definitions were at least in part motivated by a perceived
> need for semantics in some sense, but personally I've never been
> entirely convinced that a properly defined derivational calculus would
> not do as well, dispensing with the need for any sort of concrete
> semantic definition.  After all, most languages get along just fine
> without one.

Really? Most logical languages get along just fine without a semantics? 

But if you (or anyone else) wants to avoid even thinking about semantics, all you have to do is just look up the inference rules (in the 2004 Semantics, re-christened "entailment patterns" in the RDF 1.1 draft) and use them without asking about why they are correct. Think of RDF plus these patterns as a derivational calculus if you like. It's pretty straightforward to implement them so that they run blindingly fast. 
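
A minimal sketch of using such rules purely derivationally, covering only two of the RDFS patterns (rdfs9 and rdfs11, the subclass rules). The function name `rdfs_closure`, the prefixed-name strings, and the `ex:` example terms are assumptions for illustration. This naive nested loop is written for readability and is not the fast kind of implementation referred to above.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(graph):
    """Apply rdfs9 and rdfs11 to a set of (s, p, o) triples until nothing new."""
    closed = set(graph)
    while True:
        new = set()
        for (s1, p1, o1) in closed:
            if p1 != SUBCLASS:
                continue
            for (s2, p2, o2) in closed:
                # rdfs11: s1 subClassOf o1, o1 subClassOf o2 => s1 subClassOf o2
                if p2 == SUBCLASS and s2 == o1:
                    new.add((s1, SUBCLASS, o2))
                # rdfs9: s1 subClassOf o1, s2 type s1 => s2 type o1
                if p2 == RDF_TYPE and o2 == s1:
                    new.add((s2, RDF_TYPE, o1))
        if new <= closed:
            return closed  # fixpoint reached
        closed |= new


g = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
}
closure = rdfs_closure(g)
```

The rules are applied as pure pattern matching on triples; nothing in the loop consults an interpretation.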

>  There's more to say on this, obviously but I'll save it
> for a blog post.
> 
> I also think there is some reason to think one might arrive at
> improved clarity of concept and exposition, especially for e.g.
> developers who don't know a whole lot about logic and don't really
> want to dig into the textbooks.

And you are going to make it easier for them by using category theory?!!?

There is no need to consult a textbook to understand RDF. The RDF 1.1 draft spec has an intuitive summary of the RDF semantics which is about all that a developer needs to know. I quote it here in full:

"An RDF graph is true exactly when:

1. the IRIs and literals in subject or object position in the graph all refer to things,

2. there is some way to interpret all the blank nodes in the graph as referring to things,

3. the IRIs in property position refer to binary relationships,

4. and under these interpretations, each triple S P O in the graph asserts that the thing referred to as S, and the thing referred to as O, do in fact stand in the relationship referred to by P."
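
The four conditions above can be sketched as a small checker over a finite domain. This is an illustration under stated assumptions, not the normative definition: literals are left out, blank nodes are marked by a `_:` prefix, and the names `graph_is_true`, `refers_to` and `extension` are made up for the example.

```python
from itertools import product


def graph_is_true(graph, domain, refers_to, extension):
    """Check the four conditions for a set of (s, p, o) string triples.

    domain    : finite set of things
    refers_to : dict mapping each IRI to what it refers to
    extension : dict mapping a relationship to a set of (thing, thing) pairs
    """
    def is_bnode(t):
        return t.startswith("_:")

    # 1. IRIs in subject or object position all refer to things.
    for (s, _, o) in graph:
        for t in (s, o):
            if not is_bnode(t) and (t not in refers_to
                                    or refers_to[t] not in domain):
                return False

    # 3. IRIs in property position refer to binary relationships.
    for (_, p, _) in graph:
        if p not in refers_to or refers_to[p] not in extension:
            return False

    bnodes = sorted({t for (s, _, o) in graph for t in (s, o) if is_bnode(t)})

    # 2. Look for some way to interpret the blank nodes as things ...
    for values in product(list(domain), repeat=len(bnodes)):
        assign = dict(zip(bnodes, values))

        def denote(t):
            return assign[t] if is_bnode(t) else refers_to[t]

        # 4. ... under which every triple S P O actually holds.
        if all((denote(s), denote(o)) in extension[refers_to[p]]
               for (s, p, o) in graph):
            return True
    return False


domain = {1, 2, 3}
refers_to = {"ex:alice": 1, "ex:bob": 2, "ex:knows": "KNOWS"}
extension = {"KNOWS": {(1, 2), (2, 3)}}

g_true = {("ex:alice", "ex:knows", "ex:bob"),
          ("ex:bob", "ex:knows", "_:someone")}
g_false = {("ex:bob", "ex:knows", "ex:alice")}

result_true = graph_is_true(g_true, domain, refers_to, extension)
result_false = graph_is_true(g_false, domain, refers_to, extension)
```

In this particular interpretation `g_true` holds (the blank node can be taken to be thing 3) and `g_false` does not; other interpretations may give different answers, which is why entailment is defined over all of them.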

>> Considered as a pure, uninterpreted formal calculus, RDF is hardly there at all, it is so minimal. As you point out, it does not come with any proof rules or indeed even with any notion of proof already defined for it, and if you don't think the graph syntax is adequate, then it doesn't even come with a syntax. So it is hardly there at all: no wonder you could, if you were so inclined, make it into just about anything at all, if you ignore the normative semantics. If you want to have fun with formalisms, why not choose something with a bit more bite to it, such as an uninterpreted lambda-calculus, say? Or Javascript?
> 
> Hey, it's my fun, I get to choose!  Seriously, that's why I'm going
> with blog posts instead of long missives about my personal betes
> noires on W3C lists.

Fair enough. You go right ahead. 

>>> The allusion to Wittgenstein, that great
>>> philosophical therapist, is entirely intentional.  You (or at least I)
>>> find out a lot of things when you analyze a concept very closely; if
>>> my analysis is not mistaken, there are some fundamental problems in
>>> the land of RDF.  For example, it is possible to show, among other
>>> things, that the concept of a graph is not essential to RDF; nor is
>>> the treatment of the Property node of a triple as an arrow or relation
>>> necessary; nor is the concrete semantics defined in the RDF Semantics
>>> document the only or even the best "theory" of RDF.
>> 
>> If you can give up on all this, what do you take yourself to be referring to when you say "RDF" ? You have just dismissed virtually every defining characteristic of RDF as either wrong or inessential. So what is left?
> 
> Structure? Inference?  A reconceptualization of RDF as a species of
> something more general?  

Well, it's pretty obvious that it is a species of something more general. It is in fact positive binary relational logic without negation or the universal quantifier. As such it is near the bottom of a huge hierarchy of more expressive fragments of first-order logic, which itself is at the bottom of an even larger hierarchy of more expressive logics still, including second- and higher-order logics of various kinds, modal quantifier logics, etc.

Another way to think about RDF is that it is Peircean Existential Graph notation without any negation. This is actually quite a good way to think about RDF and to see how to extend it to a more expressive logic. For an introduction to the idea, take a look at 
http://www.slideshare.net/PatHayes/blogic-iswc-2009-invited-talk
For more on Peircean graph notation, read 
http://www.jfsowa.com/cg/index.htm

> Or a clear exposition of how thinking of it
> as a kind of language or logic works just as well as thinking of it as
> a data model?  I guess I'll find out what's left when I get there.
> For the record, I have not dismissed anything, I've only said that
> various things that are often claimed to be essential are in fact
> optional.  

But they aren't optional, if you really are talking about RDF. For example, the RDF model theory is part of the normative specification. So if you produce some variant with a different semantics, then what you will have invented will not be RDF. By definition of "RDF". 

> That alone seems like a non-trivial conclusion (if it's
> correct).  And it's not the same as claiming they are meaningless or
> wrong.
> 
> I very much appreciate your taking the time to correspond, and I hope
> you don't take anything I've written as an attack on the official
> definition of RDF.  Critique, yes, but a critique is a good thing if
> (big if) it is well-reasoned.
> 
> Anyway that's only part of the project; another part is to post some
> articles clearly defining the various definitions etc. of model theory

I would strongly suggest learning a lot more about model theory and logic before going much further with this project.

> and the bits of logic needed to understand RDF-talk.  With tools like
> http://www.mathjax.org/ it is now possible to put pretty darned
> good-looking math and symbolic logic on the web.  The idea is to save
> others some of the time and effort it took me to track down and read
> good tech literature.  Sort of an "I read the technical stuff so you
> don't have to" thing.

Hmm. That is what I am afraid of. Ever read Pope's "Essay on Criticism"?

Pat

> 
> Cheers,
> 
> Gregg
> 
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 25 June 2013 04:02:52 UTC