RE: What do the ontologists want from pat hayes on 2001-05-19 (www-rdf-logic@w3.org from May 2001)

From: pat hayes <phayes@ai.uwf.edu>
Date: Fri, 18 May 2001 23:29:48 -0500
To: "Ziv Hellman" <ziv@unicorn.com>
Cc: www-rdf-logic@w3.org
Message-Id: <v04210139b72b968b566f@[205.160.76.183]>
> > >I have no objection to binary predicates; I could
> > >even live with all predicates being binary if it would allow me to
> > >speak for lots of ontologists. :)
> >
> > The restriction to binary (plus unary, ie at-most-binary) predicates
> > is mildly inconvenient but quite live-with-able, I agree. That's two
> > ontologists on the list.
> >
> > Pat
>
>At the risk of being on the receiving end of a hailstorm of flames 
>from the regulars on this list, I will toss a spanner into the works 
>here and question the use of triples.
>
>As correctly pointed out above, using triples is essentially 
>reducing everything to binary predicates. Now it is certainly 
>provably true that every multi-ary relation can indeed be reduced to 
>a collection of binary predicates, and this has been known for a 
>very long time. The RDF spec even notes this and provides examples 
>for doing so. The question is whether too high a price is paid in 
>certain cases.
>
>On the one hand, essentially reducing the world to binary predicates 
>is what the OO and XML communities have done for a long time, with 
>the attributes assigned to objects really being binary predicates. 
>This viewpoint can be understood as stemming from looking at most 
>relations as functional, in the sense that, as the canonical RDF 
>example puts it, if one asks "who is the creator of this resource?" 
>and the answer is "Ora Lassila", then one is working with a binary 
>predicate associating a specific resource with a specific person. So 
>far so good.
>
>On the other hand, standard mathematics and logic, KIF, the 
>relational data-base world, and even full-power UML, all permit the 
>use of multi-ary relations and do not limit themselves to binary 
>predicates. Why?
>
>I think the reason has to do with the fact that although it appears 
>at first that one is gaining simplicity by using only binary 
>predicates, or encoded triples, in practice when one is forced to 
>exchange a straightforward n-ary predicate with a clumsy collection 
>of binaries, the simplicity one has seemingly gained is more than 
>lost in the translation. If we really are going to create a 
>world-wide web of semantic meanings for a plethora of daily needs, 
>this issue may need to be addressed again down the road.
>
>Take as simple an example as requesting a bank balance. This 
>requires a relation that is at least 3-ary: at minimum one needs the 
>account number and the date&time. The balance cannot be assigned as 
>a simple attribute of the account, because its value changes with 
>time, and it certainly is not an "attribute" of the date&time alone. 
>For another, more complicated example that is a canonical one I use, 
>consider a travel agent asked by a customer the flight seating 
>he/she has been assigned. The travel agent will respond that in 
>order to answer the question, one needs to know at minimum the 
>quadruple of {name of the customer, the date of the flight, the 
>airline carrier, the flight number} -- because the seating of a 
>particular person on a particular flight is not an attribute of any 
>one element in that list, but an attribute of the full quadruple.
>
>Again, I know that these examples can be reduced to encoded triples 
>-- but is the resulting clumsiness worth it compared to the 
>straightforward multi-ary statement? And perhaps more to the point, 
>consider that in order to really take off, the SW will eventually 
>have to come into contact with the data the world has stored in 
>relational data-bases, which routinely make use of reams of tables 
>representing very large multi-ary relations. If the industrial world 
>is told that uploading/downloading this data through the SW will 
>require painfully chopping up the tables into an explosion of 
>triples, waiting for the transmission traffic to complete and then 
>reconstituting from them the tables at the other end, one may fear 
>that it will recoil in horror.

(Late reply, sorry)

I entirely agree with you, but I do not expect this view to win the 
day, so I have decided to give this issue up without a fight.

This is a quarrel that has been repeated over and over again in many 
different areas, including Krep, linguistics, and philosophical 
logic, as well as several computer-science-related areas. Recently 
the 'standard upper ontology' discussion lists have been rehashing 
it. I can see good arguments both ways, unfortunately.

The good argument for the binary case has been given by Seth and 
Sandro. It was used in linguistics by Davidson long ago, who said 
that the best way to think of the meaning of a simple (one main verb) 
English sentence was in terms of a single 'event' indicated by the 
verb, and then a lot of relationships of various other things to that 
event. The example he used was "He did it in the kitchen, quickly, 
silently, at midnight, with a knife...", where 'it' turns out to be 
making a ham sandwich. The point being that it is impossible to say 
how many extra qualifications you might get, and if you have to model 
them as arguments or parameters, then the arity (number of arguments) 
has to keep changing. Moreover, many of the binary relationships seem 
to be already encoded in ordinary grammar, often called 'cases', so 
there is an agent (the subject) and an object (the object of the 
sentence) and maybe an instrument, and a time (tense) and a manner 
(adverbs) and so on. This kind of analysis has been very influential 
in ontology design because it is so handy in this way, particularly 
during the process of ontology design when things are changing.  And, 
indeed, any n-ary relation can be encoded in this way, with a little 
artifice, since one can think of the relational n-tuple as the 
'thing'. (Notice though that this reifies relational instances, not 
sentences, so the RDF account of reification seems to fail here.) 
That is why object-oriented or graph-based representations (like 
semantic networks) aren't *obviously* useless.
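
To make the encoding concrete, here is a minimal Python sketch of 
treating a relational n-tuple as the 'thing' and hanging one binary 
relationship per argument off it. The function name reify, the role 
names, the "instanceOf" label and the sample values are all invented 
for illustration; none of them comes from RDF or from any particular 
ontology. Under this particular encoding the result is one triple per 
argument plus one naming the relation; other encodings give other 
counts.

```python
def reify(node, relation, roles, args):
    """Encode one n-ary relation instance as binary triples about `node`."""
    if len(roles) != len(args):
        raise ValueError("need exactly one role name per argument")
    # One triple names the relation; one more per argument ties it to the node.
    triples = [(node, "instanceOf", relation)]
    triples.extend((node, role, arg) for role, arg in zip(roles, args))
    return triples


# Hypothetical seat assignment, modelled on the travel-agent example.
seat_triples = reify(
    "_:s1",
    "SeatAssignment",
    ["customer", "date", "carrier", "flight", "seat"],
    ["Ziv", "2001-05-19", "ExampleAir", "EA123", "14C"],
)
for triple in seat_triples:
    print(triple)
```

Adding a further qualification (a meal preference, say) means adding 
one more triple about the same node, with no change to any arity.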

On the other hand, it can be argued that for many purposes, this 
flexibility is beside the point. Not all relationships are naturally 
expressible as simple English sentences. Some relations are known to 
be of a certain fixed arity, and there is no particular reason why 
one should not be able to take advantage of that knowledge when it is 
available, since it is about as easy to represent an n-ary relation 
as it is to represent a binary one in a sentential or tabular form. 
And since these formats do not rule out the binary case when that is 
useful, why not allow them to be used when appropriate, with 
concomitant savings in clarity and efficiency? (The binary expansion 
in general makes a single atomic sentence into a sequence of n-2 
atoms. Since search inefficiency tends to be exponential in inference 
depth, this can be a major computational cost.)  Allowing arbitrary 
numbers of arguments simply allows everything to co-exist; it does 
not prevent a Davidsonian analysis or a binary reduction if your 
tastes run that way.
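
A small Python sketch of the tabular alternative, in which a relation 
of any fixed arity is just a set of rows and the binary case is one 
instance among others. The class Relation, its method names, the 
account identifier, the timestamps and the amounts are assumptions 
made up for this example, not any existing library or data.

```python
class Relation:
    """A fixed-arity relation stored as a set of argument tuples."""

    def __init__(self, name, arity):
        self.name = name
        self.arity = arity
        self.rows = set()

    def add(self, *args):
        # Knowing the arity lets us reject malformed facts immediately.
        if len(args) != self.arity:
            raise ValueError(
                f"{self.name} takes {self.arity} arguments, got {len(args)}"
            )
        self.rows.add(args)

    def match(self, *pattern):
        """Return the rows that agree with `pattern`; None is a wildcard."""
        if len(pattern) != self.arity:
            raise ValueError(f"pattern must have {self.arity} positions")
        return sorted(
            row for row in self.rows
            if all(p is None or p == v for p, v in zip(pattern, row))
        )


# A ternary relation: the balance depends on both account and time.
balance = Relation("balance", 3)
balance.add("acct-17", "2001-05-18T09:00", 250)
balance.add("acct-17", "2001-05-19T09:00", 175)

# A binary relation fits in the same structure without special treatment.
creator = Relation("creator", 2)
creator.add("http://www.w3.org/Home/Lassila", "Ora Lassila")

print(balance.match("acct-17", "2001-05-19T09:00", None))
print(creator.match(None, "Ora Lassila"))
```

Here the bank-balance question is answered by matching a single row, 
where the binary encoding would need the matching triples to be 
gathered and joined on their shared node first.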

The problem with this arises when people become committed to a format 
which is only capable of representing the binary case: graphs or OO 
notations.  Such enthusiasts often become so enamoured of their 
particular formats, and so enthusiastic about their perceived 
advantages, that they are hard to persuade; and since, as they will 
never tire of pointing out, the general n-ary notations *can* be 
translated into theirs, what rationale do we have for resisting their 
case? It is often useless, I have found, to try to tell them that any 
normal form is as good as any other, or to try to persuade them that 
a more eclectic approach has its advantages. The fact that a simple 
format is universal is indeed rather cute, and it can be bewitching 
when you first encounter it.

Jim Hendler declared at the beginning of the DAML work that 'purely 
aesthetic' arguments would not be permitted to influence the design 
of the language, which applied in this context is a pre-emptive 
strike against any arguments based on the observations you produce. 
The fact that the entire world of mathematics, logic, and database 
engineering has chosen to use relations freely, is in the end only an 
aesthetic argument. It is *possible* to get used to the ugliness, 
inefficiency and style-cramping awkwardness that a purely binary 
language imposes, rather in the way that it is possible to get used 
to midwestern cooking. Transmission speeds are so fast, and memory so 
cheap, that any linear losses in information density do not have any 
really nasty economic consequences; so I have decided to let the 
clowns win this particular battle. If people wish to automatically 
translate an efficient notation into an inefficient one, just let 
them do it. Microsoft will do it anyway, whatever we decide.

I personally will continue to use relational languages in my own 
ontology work (in fact, KIF allows for variably polyadic relations, 
which can take any number of arguments, a distinct expressive 
advantage which makes many axiomatizations wonderfully compact: kudos 
to Mike Genesereth for thinking of it) but I doubt if the Semantic 
Web will.
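
As a loose analogy only (this is Python, not KIF), a variadic function 
gives some feel for why a relation that takes any number of arguments 
can be so compact. The names distinct and binary_expansion, and the 
"differs" label, are hypothetical and chosen just for this sketch.

```python
from itertools import combinations


def distinct(*args):
    """Variadic predicate: true when no two arguments are equal."""
    return all(a != b for a, b in combinations(args, 2))


def binary_expansion(*args):
    """The same constraint spelled out as separate binary atoms."""
    return [("differs", a, b) for a, b in combinations(args, 2)]


things = ("a", "b", "c", "d")
print(distinct(*things))               # one atom states the whole constraint
print(len(binary_expansion(*things)))  # one atom per unordered pair
```

A single polyadic atom over k arguments stands in for k(k-1)/2 
pairwise atoms in this example, and the gap widens as k grows.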

Best wishes

Pat Hayes

---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Saturday, 19 May 2001 00:29:48 UTC