Re: plural vs singular properties (a proposal) from Frank Manola on 2008-01-06 (semantic-web@w3.org from January 2008)

From: Frank Manola <fmanola@acm.org>
Date: Sun, 6 Jan 2008 06:48:22 -0500
To: Garret Wilson <garret@globalmentor.com>
Cc: SWIG <semantic-web@w3.org>
Message-Id: <8E55E196-0B3E-4BBB-879E-FE5B0C43753B@acm.org>
On Jan 5, 2008, at 6:02 PM, Garret Wilson wrote:

> Frank Manola wrote:
>>
>>
>> On Jan 5, 2008, at 5:11 PM, Garret Wilson wrote:
>>> Yes, obviously some RDF subsets can be mapped to the relational  
>>> model in several ways. As relating to my original question,  
>>> though, I could not choose the semantics "interpreted in an  
>>> obvious way" by Date as a way to support general RDF data. In  
>>> other words, Date's interpretation is *only* compatible with  
>>> storing a *strict subset* of RDF data; there exists RDF data that  
>>> would not fit into this interpretation. This is the sense in  
>>> which I meant that this interpretation (let's call it the  
>>> "obvious" interpretation, using Date's words) is incompatible  
>>> with RDF (the model) as a general interpretation, because it  
>>> cannot represent everything expressible by RDF.
>>
>> Sorry, I'm not following you.  Could you give an example of RDF  
>> data that wouldn't fit?  What I thought I just heard you say was  
>> that the relational model can't represent everything that's  
>> expressible by RDF, and that certainly isn't true (RDF can be  
>> thought of as a relational model that follows particular design  
>> rules).  [I'll take this thread up again tomorrow;  it's football  
>> time now!]
>
> Hi, Frank! Go enjoy your game---I think we agree and you just  
> missed an essential part of what I said.
>
> What I said (and I think you agreed with me on this elsewhere) is  
> that there exists RDF data that will not fit within the relational  
> model (important part coming up -->) *if* we use the  
> interpretation that a relation header represents predicates and  
> each tuple represents a distinct resource (the "obvious"  
> interpretation used by Date and his wine bottles).

OK, this clarifies things.  I don't think I said this.  I may have  
been misinterpreted though;  I'm afraid my discussion got a little  
convoluted yesterday!

What Date says (and you quoted him as saying this on p.72 in one of  
your earlier messages) is that the heading of each relation  
represents a certain predicate (not multiple predicates).  This is an  
n-ary (or n-place) predicate, corresponding with the number of  
columns in the relation.  In the wine example, Cellar is a 6-ary  
predicate, of the form Cellar( bin#, wine, producer, year, bottles,  
ready ).  The individual columns, even though they represent  
distinct attributes of some entity (a wine in this case), are not  
distinct "predicates" in the relational model;  rather, the whole  
thing is one multi-argument predicate.  The columns would be distinct  
predicates in RDF, but that's because RDF would require you to break  
this 6-ary predicate down into separate binary predicates:

bin#(wineID, value)
wine(wineID, name)
producer(wineID, producerName)
etc.

And of course you could also, in the relational model itself, model  
the same data using a separate relation for each column, as RDF does.
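
As a rough sketch of that decomposition (not anything from Date's book): the Python below splits one 6-ary Cellar tuple into six binary facts. The column names come from the heading above; the row values, the "wine-2" identifier, and the function name to_binary_facts are made up purely for illustration.

```python
# Column names follow the Cellar heading quoted above.
CELLAR_HEADING = ("bin#", "wine", "producer", "year", "bottles", "ready")


def to_binary_facts(wine_id, row, heading=CELLAR_HEADING):
    """Split one n-ary tuple into n binary facts: (predicate, subject, value)."""
    if len(row) != len(heading):
        raise ValueError("row arity does not match heading")
    return [(column, wine_id, value) for column, value in zip(heading, row)]


# Hypothetical row and identifier, invented for this example.
cellar_row = (2, "Chardonnay", "Buena Vista", 2005, 1, 2007)
facts = to_binary_facts("wine-2", cellar_row)

for predicate, subject, value in facts:
    print(f"{predicate}({subject}, {value})")
```

Each printed line has the same shape as the bin#(wineID, value) forms above: one binary predicate per column, all sharing the same first argument.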

As for each relation representing a distinct resource, it depends on  
what you mean by "resource" here (these aren't necessarily RDF  
resources).  In the case of the wine example, it happens that a  
single relation is used to represent all the data.  But not all  
relational designs can work like that.  Consider a person and his/her  
hobbies.  I'd need to model this as something like

Person (SSN, name, age, height)  (where SSN is assumed to identify a  
Person in this example)
Hobby ( SSN, hobbyname )

I need to do this because a person may have multiple hobbies.  I  
could do this:

PersonwithOneHobby (SSN, name, age, height, firstHobby)
PersonwithTwoHobbies (SSN, name, age, height, firstHobby, secondHobby)
etc.

but then I need a separate relation for people with each distinct  
number of hobbies.
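
A minimal sketch of the two-relation design, using Python's built-in sqlite3 module. The table and column names follow Person and Hobby above; the column types, the key constraints, and the sample person are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (SSN TEXT PRIMARY KEY, name TEXT,
                         age INTEGER, height INTEGER);
    CREATE TABLE Hobby  (SSN TEXT REFERENCES Person(SSN), hobbyName TEXT,
                         PRIMARY KEY (SSN, hobbyName));
""")

# One hypothetical person with two hobbies: one Person row, two Hobby rows.
conn.execute("INSERT INTO Person VALUES ('123-45-6789', 'Pat', 40, 175)")
conn.executemany(
    "INSERT INTO Hobby VALUES (?, ?)",
    [("123-45-6789", "chess"), ("123-45-6789", "sailing")],
)

hobbies = [
    row[0]
    for row in conn.execute(
        "SELECT hobbyName FROM Hobby WHERE SSN = ? ORDER BY hobbyName",
        ("123-45-6789",),
    )
]
print(hobbies)
```

Adding a third hobby is just another Hobby row; no new relation with a wider heading is needed.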

Continuing with:

Person (SSN, name, age, height)
Hobby ( SSN, hobbyName )

what is the "resource" being represented?  In one sense, each tuple  
of each relation represents a single resource:  in the Person  
relation, it's a person, in the Hobby relation, it's the association  
of a hobby with a person.  But in another sense the resource you're  
describing with *both* relations is a person, and you use the  
separate hobby relation because you want to associate multiple  
hobbies with the same person (and avoid the need for separate Person  
relations with different arities).  This is shown by the fact that a  
person's SSN appears in both relations;  that's what the data is  
"about" (just as it is in RDF when you break up the information about  
a single resource into multiple relations).

In other words, for various technical reasons (normalization in the  
relational sense is another reason), I may want to break up the  
information about a single "resource" (in some sense) into multiple  
relations, each representing a separate predicate.  When I go far  
enough in this direction, using only binary relations (with the first  
column containing a resource ID), I'm essentially at RDF.
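
To illustrate that last step, here is a small Python sketch that flattens the Person and Hobby relations into (subject, predicate, value) triples keyed on SSN. The helper name relation_to_triples and the sample data are hypothetical, and real RDF would use URIs rather than bare strings for subjects and predicates.

```python
def relation_to_triples(heading, rows, key):
    """Yield (subject, predicate, value) for every non-key column of every row."""
    key_index = heading.index(key)
    for row in rows:
        subject = row[key_index]
        for column, value in zip(heading, row):
            if column != key:
                yield (subject, column, value)


# Hypothetical sample data for the two relations discussed above.
person = (("SSN", "name", "age", "height"),
          [("123-45-6789", "Pat", 40, 175)])
hobby = (("SSN", "hobbyName"),
         [("123-45-6789", "chess"), ("123-45-6789", "sailing")])

triples = [
    triple
    for heading, rows in (person, hobby)
    for triple in relation_to_triples(heading, rows, key="SSN")
]

for triple in triples:
    print(triple)
```

Both relations contribute triples with the same subject, and the two hobbies simply become two triples with the same predicate, which is the multi-valued case that forced the separate Hobby relation in the first place.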

Is this clearer?


>
> If we instead use your interpretation, in which each tuple  
> represents an RDF triple, then all RDF data can fit as far as I can  
> see.
>
> Garret
Received on Sunday, 6 January 2008 11:48:34 UTC