Re: [tangle] getting the semweb exactly wrong from Frank Manola on 2006-01-03 (semantic-web@w3.org from January 2006)

From: Frank Manola <fmanola@acm.org>
Date: Tue, 03 Jan 2006 17:12:12 -0500
To: Jan Algermissen <jalgermissen@topicmapping.com>
CC: Timothy Falconer <timothy@immuexa.com>, semantic-web@w3.org
Message-ID: <43BAF6BC.7030001@acm.org>

Jan Algermissen wrote:
> 
> On Jan 3, 2006, at 6:35 PM, Frank Manola wrote:
> 
>> RDF data is *highly* normalized: RDF essentially organizes data as
>> binary relations (one per property) with surrogate keys (URIs), which
>> is as normalized as you can get.
> 
> 
> I just love to see people making comparisons between the rock-solid,
> stone-age dinosaur 'relational data model' and stuff like RDF
> (or Topic Maps for that matter)...which are essentially also data  
> models in the exact same sense. I'd go as far as saying that they are  
> direct competitors to the relational model - though this is sometimes  
> difficult to see given the decades that the relational model is ahead  
> in terms of theoretical analysis and implementation experience.
> 

Actually, I'm going a bit further than you may think I am.  I don't 
think RDF is a competitor to the relational model in the sense that it's 
a totally different model.  I think RDF is a *special case* of the 
relational model, with fixed-arity relations, use of URIs as primary key 
values, more flexibility in the use/nonuse of schemas, etc.  Both have 
their roots in predicate logic, after all.  Much of the theoretical 
analysis that has gone into the relational model, at least that which 
applies to modeling, can be applied to RDF as well.  I suspect that more 
can be learned from relational implementation experience than at first 
meets the eye as well (although you might well need to go beyond what 
current relational products offer at the implementation level).
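
To make the "special case" reading concrete, here is a minimal Python
sketch (an illustration only, not any particular product's design). It
regroups a handful of triples into one binary relation per property,
keyed by the subject URI. The example.org URIs and the helper name
as_binary_relations are assumptions invented for this example.

```python
from collections import defaultdict

# Hypothetical data: the example.org URIs are invented for illustration.
ALICE = "http://example.org/alice"
BOB = "http://example.org/bob"
NAME = "http://example.org/name"
AGE = "http://example.org/age"

triples = [
    (ALICE, NAME, "Alice"),
    (ALICE, AGE, 42),
    (BOB, NAME, "Bob"),
]


def as_binary_relations(triples):
    """Group triples into one binary relation per property.

    Each relation is a set of (subject, object) pairs, so every
    relation has arity two and the subject URI plays the role of a
    surrogate key.
    """
    relations = defaultdict(set)
    for subject, prop, obj in triples:
        relations[prop].add((subject, obj))
    return dict(relations)


relations = as_binary_relations(triples)
for prop, pairs in sorted(relations.items()):
    print(prop, sorted(pairs, key=str))
```

Note that nothing forces BOB to appear in the AGE relation: a missing
value is simply a missing pair, which is where the flexibility in the
use or nonuse of schemas shows up.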

> 
> Especially if we take into account that much of the relational body of
> theory is about dealing with 'problems' introduced by the inherent
> typing. With RDF et al. there just is no such thing as normalization
> issues, null values, ternary logic - not to mention the integration
> problems induced in the long run.
> 

There's a lot of truth in this, but at the same time some of the 
"problems" involved are really pretty fundamental to data modeling, and 
in escaping a problem by using RDF you often have to deal with an 
"inverse" kind of problem.  For example, in RDF you escape normalization 
issues by forcing all data into a single, highly-normalized, fixed-arity 
relational form.  Then, because people often really think in terms of 
n-ary relations, you need a document from the Semantic Web Best 
Practices group on how to define N-ary relations in RDF (hopefully we'll 
escape the need for a document from some future W3C group on how to 
normalize those N-ary relations!).  I'm not complaining, mind you, or 
attempting to detract from the value of the work being done by the SWBPD 
group, merely making a (wry?) observation :-)
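
As a rough sketch of what that workaround looks like, the Python below
encodes one row of a ternary relation as triples hanging off a fresh
intermediate node, and then reads the row back. This follows the
general idea of introducing a node that stands for the relation
instance; it is not claimed to match the SWBPD document in detail. The
ex: names, the blank-node labels, and both helper functions are
assumptions made up for the example.

```python
from itertools import count

_node_ids = count(1)  # source of fresh, hypothetical blank-node labels


def nary_to_triples(relation, row):
    """Encode one n-ary tuple as binary triples.

    `row` maps a role (property name) to its value. A fresh node
    represents the tuple itself; each role becomes one triple.
    """
    node = f"_:n{next(_node_ids)}"
    triples = [(node, "rdf:type", relation)]
    triples.extend((node, role, value) for role, value in row.items())
    return triples


def triples_to_row(triples, node):
    """Rebuild the n-ary tuple for `node` by collecting its role triples."""
    return {p: o for s, p, o in triples if s == node and p != "rdf:type"}


# One row of a hypothetical ternary relation Purchase(buyer, item, price).
purchase = {"ex:buyer": "ex:john", "ex:item": "ex:book", "ex:price": 15}
encoded = nary_to_triples("ex:Purchase", purchase)
for triple in encoded:
    print(triple)
```

One three-column row turns into four triples (one type triple plus one
per role), and getting the row back is a join on the intermediate node.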

Similarly, you don't really escape the need to deal with n-ary relations
in performing general queries on RDF data, as a look at SPARQL will tell
you; and you have to deal with closure in a less-direct way.  SPARQL
also reintroduces something like null values (in the form of "unbound
variables" in query results:  I know, I know, they aren't the same;  I
said "something like").
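
A toy illustration of both points, in Python rather than SPARQL: the
sketch below evaluates one required and one optional triple pattern,
roughly in the spirit of

    SELECT ?name ?email
    WHERE { ?p ex:name ?name OPTIONAL { ?p ex:email ?email } }

It is a simplified stand-in, not the SPARQL algebra; the data, the ex:
names, and the functions match and left_join are all assumptions made
up for the example.

```python
# Hypothetical data; the ex: names are invented for illustration.
triples = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:email", "alice@example.org"),
    ("ex:bob", "ex:name", "Bob"),
]


def match(pattern, triples, binding=None):
    """Yield every extension of `binding` under which `pattern` matches a triple.

    Pattern terms that start with '?' are variables; all others are constants.
    """
    binding = binding or {}
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if already bound to another value.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield candidate


def left_join(required, optional, triples):
    """Keep every solution of `required`, extended by `optional` where possible."""
    rows = []
    for solution in match(required, triples):
        extended = list(match(optional, triples, solution))
        rows.extend(extended or [solution])
    return rows


rows = left_join(("?p", "ex:name", "?name"), ("?p", "ex:email", "?email"), triples)
for row in rows:
    # An unbound variable is simply absent from the row.
    print(row.get("?name"), row.get("?email", "<unbound>"))
```

Each result row is a tuple over the variables ?p, ?name and ?email,
which is an n-ary relation in all but name, and the row for Bob has no
binding for ?email at all.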

Anyway, I think the trade-offs involved in using RDF in the Semantic Web
are reasonable ones.  But I think the differences from prior work
(including work that predates the relational model) are sometimes
exaggerated.

--Frank

Received on Tuesday, 3 January 2006 22:16:01 UTC