Re: plural vs singular properties (a proposal) from Garret Wilson on 2008-01-05 (semantic-web@w3.org from January 2008)

From: Garret Wilson <garret@globalmentor.com>
Date: Sat, 05 Jan 2008 14:48:47 -0800
To: Bijan Parsia <bparsia@cs.man.ac.uk>
CC: SW-forum Web <semantic-web@w3.org>
Message-ID: <4780094F.2040702@globalmentor.com>

Bijan Parsia wrote:
>
> On Jan 5, 2008, at 3:16 PM, Garret Wilson wrote:
>
>> Garret Wilson wrote:
>>> How does RDF justify repeated properties?
>>
>> Sorry, maybe I would have been clearer if I had said
>> "multivalued properties", in which a property appears multiple times 
>> for the same resource but with different values, e.g.
>>
>> <rdf:Description>
>>  <dc:subject>semantic web</dc:subject>
>>  <dc:subject>database</dc:subject>
>> </rdf:Description>
>>
>> This is an area of conflict with the relational model, and I'm 
>> wondering how RDF justifies this.
>
> It's not a conflict. Consider the following table version (let's give 
> the blank node an identifier, ex:foo; blank nodes *are* a departure, 
> but one well handled by "naive" tables or "codd" tables as they are 
> known in the literature). First row are the column headers:
>
> ID            dc:subject
> ex:foo     semantic web
> ex:foo     database
>
> This is a perfectly fine relational table (to be a bit loose in terminology).

As Frank Manola and I discussed elsewhere on this thread, it all depends 
on how you choose to interpret the semantics being represented by the 
relational model.

If you're going to use the relation header (i.e. the table column names) 
to represent predicates, I believe it's customary to interpret each 
tuple (i.e. row) as describing a distinct entity (i.e. resource). I 
don't think that using multiple rows to describe the same resource is a 
natural interpretation of relational semantics.

For example, what happens with additional predicates? What if ex:foo has 
two subjects ("semantic web" and "database"), but only one title ("My 
Book")? If you were to incorporate that data into your table above, 
you'd have two rows, both describing the same book, but one row would 
have no value in the dc:title column.

ID         dc:subject    dc:title
ex:foo     semantic web  My Book
ex:foo     database      ???

Sure, you could then throw a NULL in there (which means we're no longer 
talking about the relational model) or create some other workaround, but 
the fact that we have to create a workaround at all suggests that this 
is not a natural interpretation.
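
To make the padding problem concrete, here is a minimal Python sketch, assuming the resource is held as a dictionary of value lists. The names `resource` and `to_rows` are made up for illustration, and `None` stands in for NULL.

```python
from itertools import zip_longest

# Hypothetical in-memory description of ex:foo (illustrative only).
resource = {
    "ID": "ex:foo",
    "dc:subject": ["semantic web", "database"],
    "dc:title": ["My Book"],
}

def to_rows(res, columns):
    """Flatten one resource into rows, one value per column per row.

    Columns with fewer values than the longest column are padded with
    None, which plays the role of NULL in this sketch.
    """
    value_lists = [res[c] for c in columns]
    return [(res["ID"],) + values for values in zip_longest(*value_lists)]

rows = to_rows(resource, ["dc:subject", "dc:title"])
for row in rows:
    print(row)
```

The second row ends up with `None` in the dc:title position, which is exactly the hole shown in the table above.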

Then what do you do about keys? The resource URI (ex:foo) would be a 
natural key, but you can't do that because you're using separate tuples 
to represent the same resource. If you say, "well, we'll use all tuple 
values as a key," you run into more problems because your model of 
representation allows for multiple representations; that is, the table 
below is semantically equivalent to your table above, but would have 
different keys:

ID         dc:subject    dc:title
ex:foo     semantic web  ???
ex:foo     database      My Book
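
As a rough check of that claim, the following Python sketch compares the two tables, again with `None` standing in for the missing value. The helper `facts` is hypothetical; it simply collects the (ID, predicate, value) statements that a table asserts.

```python
def facts(rows, columns):
    """Return the set of (id, predicate, value) statements a table asserts,
    ignoring cells that hold None."""
    return {
        (row[0], col, val)
        for row in rows
        for col, val in zip(columns, row[1:])
        if val is not None
    }

columns = ["dc:subject", "dc:title"]
table_a = [("ex:foo", "semantic web", "My Book"), ("ex:foo", "database", None)]
table_b = [("ex:foo", "semantic web", None), ("ex:foo", "database", "My Book")]

same_facts = facts(table_a, columns) == facts(table_b, columns)
same_keys = set(table_a) == set(table_b)  # whole tuples used as keys
print(same_facts, same_keys)  # True False
```

Both tables assert the same three statements about ex:foo, yet no whole-tuple key from one table appears in the other.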

This brings me back to my original point: it is not straightforward to 
map all RDF data onto the relational model, if you interpret tuples as 
describing resources, because RDF allows multivalued properties.

Frank Manola's interpretation of the relational model, in which the 
table effectively describes a reification of an RDF graph (i.e. tuples 
are triples, not resources) is not an obvious one, but it is consistent 
and fully compatible with the relational model.
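
For comparison, here is a minimal sketch of the tuples-are-triples reading using Python's standard `sqlite3` module. The table and column names are my own assumptions, and storing every object as plain text glosses over the distinction between literals and URIs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE triple (
        subject   TEXT NOT NULL,
        predicate TEXT NOT NULL,
        object    TEXT NOT NULL,
        PRIMARY KEY (subject, predicate, object)
    )
    """
)

# One row per statement; the multivalued dc:subject is just two rows.
triples = [
    ("ex:foo", "dc:subject", "semantic web"),
    ("ex:foo", "dc:subject", "database"),
    ("ex:foo", "dc:title", "My Book"),
]
conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", triples)

subjects = [
    row[0]
    for row in conn.execute(
        "SELECT object FROM triple"
        " WHERE subject = ? AND predicate = ? ORDER BY object",
        ("ex:foo", "dc:subject"),
    )
]
print(subjects)
```

In this layout no column needs a NULL, and the whole tuple serves as the key, so each set of statements has a single representation.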

Garret

Received on Saturday, 5 January 2008 22:50:36 UTC