Re: RDF too geeky, and what to do about it from Frank Manola on 2004-12-20 (www-rdf-interest@w3.org from December 2004)

From: Frank Manola <fmanola@acm.org>
Date: Mon, 20 Dec 2004 17:09:37 -0500
To: Adrian Walker <adrianw@snet.net>
CC: Danny Ayers <danny.ayers@gmail.com>, peter.hunsberger@gmail.com, www-rdf-interest@w3.org
Message-ID: <41C74DA1.8090509@acm.org>
Adrian Walker wrote:
> Danny, Peter --
> 
> Here are my two cents about...
> 
> "I don't know if it's the semantics or what, but for some reason RDF
> just comes across as too geeky for the business side of the house.
> Maybe it's just that they've been hearing OO for 10 years and believe
> that "Objects" are supposed to be something good so they instantly
> adopt them. (Resource? What's a resource?)."

Adrian--

It seems to me the context of the original discussion (correct me if I'm 
wrong) was that the business folks could understand fairly straight XML 
pretty well (even though it was pretty much equivalent to the RDF), but 
there seemed to be extra concepts in the RDF that created problems.  So, 
without getting into the XML vs. RDF argument (and after all, you can 
translate most if not all of the reasonable XML markup languages into 
RDF without too much trouble if you need to), if the business folks were 
happy with XML, they didn't necessarily need natural language, did they?

> 
> The e-Government Presentation at www.reengineeringllc.com 
> <http://www.reengineeringllc.com/> argues that RDF is way too geeky, 
> that this will be dangerous in real world applications, and that there 
> is something we can do about it without throwing out the RDF baby with 
> the bathwater. 

The presentation certainly argues a need for "natural language" for 
communicating with (many) humans, and I don't really disagree.  However:

a.  It seems to me that the Clyde example you use really illustrates the 
dangers that can occur using *natural language*, rather than artificial 
languages.  If I'm not misunderstanding the example, the problem occurs 
because someone uses "is a" (a natural language phrase) as if it meant 
both "is an instance of" and "is a subClass of", and then reasons as if 
it meant only one of those things.  Your controlled English vocabulary 
distinguishes between "is a member of the set" and "is a named subset 
of", but so does RDF (rdf:type and rdfs:subClassOf), and certainly 
anyone using RDF would be unlikely to assume that they meant the same 
thing (even if there was confusion about what they *did* mean).  In 
either case, the user needs to understand the difference between these 
two concepts in order to use them properly.  Just because your 
vocabulary provides distinct "English" for these concepts doesn't 
necessarily mean that users will know "which English" to use in which cases.
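
To make the distinction in (a) concrete, here is a minimal sketch in
plain Python (no RDF library).  The facts about Clyde, Elephant, Mammal
and Species are illustrative assumptions, not the exact example from the
presentation, and the function names are hypothetical.  It contrasts
reasoning that keeps rdf:type and rdfs:subClassOf apart with reasoning
that collapses both into a single transitive "is a" link.

```python
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

# Assumed facts, written as (subject, predicate, object) triples.
TRIPLES = {
    ("Clyde", RDF_TYPE, "Elephant"),            # Clyde is an instance of Elephant
    ("Elephant", RDFS_SUBCLASS_OF, "Mammal"),   # every Elephant is a Mammal
    ("Elephant", RDF_TYPE, "Species"),          # the class Elephant is itself a Species
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following rdfs:subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == RDFS_SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def types_of(individual, triples):
    """Direct rdf:type classes of individual, plus all their superclasses."""
    direct = {o for s, p, o in triples if s == individual and p == RDF_TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


def naive_is_a(individual, triples):
    """Treat every predicate as one transitive 'is a' link (the conflation)."""
    seen = set()
    stack = [individual]
    while stack:
        current = stack.pop()
        for s, _p, o in triples:
            if s == current and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


if __name__ == "__main__":
    print(sorted(types_of("Clyde", TRIPLES)))
    print(sorted(naive_is_a("Clyde", TRIPLES)))
```

With these assumed facts, types_of("Clyde", TRIPLES) yields Elephant and
Mammal, while naive_is_a("Clyde", TRIPLES) also yields Species, which is
the unwanted conclusion.  Note that the code only stays correct because
whoever wrote the triples already chose the right relation for each
fact, which is the point about users needing to understand the
difference.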

b.  It's very important to distinguish between true natural language and 
controlled natural languages (which, I believe, is what you're 
proposing).  There have been a number of these languages proposed 
(there's some discussion going on right now on the SUO email list on 
this topic as well).  They can certainly be helpful in some 
circumstances (I've seen some work that looks reasonable on applying 
this approach in defining security policies, for example).  However, 
care is needed in the general case.  If you know you're talking to a 
machine (as you are in specifying machine-interpretable semantics), you 
need to keep in mind how the machine is going to interpret your "natural 
language" so as to couch it properly.  It's the same idea as writing a 
contract.  A contract is something that may wind up being interpreted by 
a court, and if the contract is at all complicated, you may want to talk 
to a lawyer, who will couch your wishes in a somewhat different "natural 
language" that the court will interpret properly (the way you intended).
This itself is something that's quite familiar in a business context.

> 
> That something we can do is to add some real world semantics -- far 
> beyond the limited view of semantics as equal to type information inside 
> all those angle brackets.

We certainly need to add those semantics, and you're right that it needs 
to be beyond the limited view of semantics that can currently be 
represented in RDF, RDFS, and OWL.  However, just as we need to be 
concerned about how to convey semantics to people, we need to be 
concerned about how to convey them to machines, if machines are going to 
be able to do anything with them.  That's why we have artificial 
languages, and need to go beyond OWL (SWRL is an example).  If we're 
going to use controlled natural language to communicate between people 
and machines, there needs to be a well-defined translation from the 
controlled natural language to the artificial ones (i.e., I don't think 
the translation should be buried inside the code of the controlled 
natural language interpreter).
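
As a rough illustration of what a "well-defined translation" could look
like, here is a minimal Python sketch in which the mapping from
controlled-English phrases to RDF predicates is an explicit, inspectable
table rather than logic hidden in an interpreter.  The two phrases are
the ones quoted earlier in this message; the table name, the function
name, and the sentence shape ("<subject> <phrase> <object>") are
assumptions made for the example, not a description of any existing
controlled-language system.

```python
# Declarative mapping: controlled-English phrase -> RDF predicate.
# Anyone can read (or publish) this table without reading the parser.
PHRASE_TO_PREDICATE = {
    "is a member of the set": "rdf:type",
    "is a named subset of": "rdfs:subClassOf",
}


def translate(sentence):
    """Translate '<subject> <phrase> <object>' into an RDF-style triple.

    Raises ValueError for sentences that use no known phrase, so that
    ambiguous wording such as a bare "is a" is rejected, not guessed at.
    """
    text = sentence.strip().rstrip(".")
    for phrase, predicate in PHRASE_TO_PREDICATE.items():
        subject, separator, obj = text.partition(f" {phrase} ")
        if separator and subject.strip() and obj.strip():
            return (subject.strip(), predicate, obj.strip())
    raise ValueError(f"no controlled phrase recognised in: {sentence!r}")


if __name__ == "__main__":
    print(translate("Clyde is a member of the set Elephant."))
    print(translate("Elephant is a named subset of Mammal."))
```

A sentence such as "Clyde is a Elephant" matches neither phrase and
raises ValueError, so the person writing it has to pick one of the two
meanings explicitly.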

Moreover, as I observed earlier, just because a controlled natural 
language looks like natural language doesn't mean that it will always be 
easy for uninformed users to interpret or use it properly 
(particularly in complicated examples, and especially when communicating 
from the human to the machine instead of vice-versa).

--Frank
Received on Monday, 20 December 2004 22:03:42 UTC