Re: plural vs singular properties (a proposal)

Bruce D'Arcus wrote:
> On 10/18/07, Garret Wilson <garret@globalmentor.com> wrote:
>
> ...
>
>   
>> When discussing the RDF version of VCard on this list, we spent weeks
>> arguing about whether the vcard.familyName property should take a single
>> value or a list, because value order might be important. With URF, the
>> issue is moot. You can order any property, and only use urf.List when
>> you're actually talking about a list of things. You can see that this
>> problem doesn't come up at all in the URF VCard ontology at
>> <http://www.urf.name/vcard.turf>.
>>     
>
> Question: are the difficulties around these issues in RDF a
> consequence of the open world assumption, and the need to be able to
> merge statements from disparate graphs?
>
>
>   

No, I don't think that's the core issue here. The issue is simply that 
there are many types of relationships in the universe, whether it's open 
or closed, that cannot easily be represented by RDF because RDF doesn't 
allow properties to be ordered. Needing to order properties without 
resorting to heavy list-like classes is a *very* common use-case. Let me 
give a few examples. (The 99% and 1% should be understood as exemplary, 
not as actual percentages.)

* 99% of the people in the world have a single last name, or multiple 
last names that can come in any order. 1% of the people have multiple 
last names for which the order of these names is important. Do we create an 
ontology in which the ex:lastName property takes only a single string 
value, or do we make every ex:lastName property take an rdf:List, just 
so that 1% of the people can represent their names correctly? (Note that 
the problem here is not the open world assumption---maybe we know 
exactly who those 1% are, and what their names are.)

* 99% of test questions have hints that can come in any order. 1% of 
test questions have hints for which order is important. (e.g. Maybe the 
second hint depends on the first.) Do we create an ontology in which the 
ex:hint property takes a single value, or do we make every ex:hint 
property take an rdf:List, just to meet the needs of those 1% of test 
questions? (Note that I took the latter approach in my MAQRO ontology 
<http://www.maqro.org/> back in 2004.)

* 99% of people in the world have a single telephone number, or they 
don't care which of their multiple telephone numbers they use. But for 
1% of people it's important that some telephone numbers are of a higher 
priority than others. Do we create an ontology in which the ex:tel 
property takes a single value, or do we make every ex:tel property take 
an rdf:List, just to meet the needs of those 1% of people?

* 99% of people in your email address book have a single email 
address. But 1% of them have multiple email addresses, and it's 
important to keep track of which is the primary email address. Do we 
create an ontology in which the ex:email property takes a single value, 
or do we make every ex:email property take an rdf:List, just to meet the 
needs of those 1% of people?

So the problem is not that our data is incomplete. Even if we knew all 
the information about the universe, we'd be faced with the following 
problem:

* 99% of properties, at least in some situations, will need to represent 
multiple values for which order is important.
* However, for each of those properties, 99% of the time only a 
single value will be needed.

So what do you do---make every property in your ontology take an 
rdf:List value that will be overkill 99% of the time? (The number of 
problems that rdf:List introduces, such as querying, is another issue 
that I haven't even touched upon.) Or do you simply pretend that 1% of 
the world doesn't exist, because your framework of choice (RDF) makes it 
hard to support them? (That seems to be the trend in many ontologies 
I've seen, but I dismiss that option as being in the same category as 
using ASCII for everything just because I don't happen to speak Oriya 
and "They should learn English anyway.")
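
To make the rdf:List overhead concrete, here is a minimal Python sketch. 
It uses no real RDF library: triples are plain tuples, and the ex: names 
and _:b blank node labels are made up for illustration. It only counts 
the triples needed to state one last name directly versus wrapped in an 
rdf:List (an rdf:first / rdf:rest chain ending in rdf:nil).

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def plain_triples(subject, predicate, values):
    """One triple per value; no order is recorded."""
    return [(subject, predicate, value) for value in values]


def rdf_list_triples(subject, predicate, values):
    """Encode the values as an rdf:List hanging off a single property."""
    if not values:
        # An empty list is just rdf:nil.
        return [(subject, predicate, RDF + "nil")]
    # Hypothetical blank node labels, one per list cell.
    nodes = [f"_:b{i}" for i in range(len(values))]
    triples = [(subject, predicate, nodes[0])]
    for i, value in enumerate(values):
        rest = nodes[i + 1] if i + 1 < len(values) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", value))
        triples.append((nodes[i], RDF + "rest", rest))
    return triples


one_name = ["Smith"]
print(len(plain_triples("ex:person1", "ex:lastName", one_name)))     # 1
print(len(rdf_list_triples("ex:person1", "ex:lastName", one_name)))  # 3
```

So the common single-value case costs three triples and a blank node 
instead of one triple, before any querying difficulties are considered.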

I think that URF has solved this problem in a very elegant way. 99% of 
the data is handled just like you'd expect it to be: as normal 
single-value properties. The other 1% is handled transparently. You only 
know about the ordering if you care to ask; if you don't ask, it's still 
preserved in the background. As a side benefit, the idea of scoped 
properties allows all sorts of other nice things such as elegant complex 
data (e.g. units)---but that's a discussion for a different thread.
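
As a rough illustration only: the Python sketch below is not URF syntax 
or any URF API. The Resource class, its method names, and the ex:email 
property are hypothetical. It just models the idea described above: each 
value may carry an optional order, callers that don't care see ordinary 
values, and callers that do care can ask for the order.

```python
from collections import defaultdict


class Resource:
    """Hypothetical resource whose property values may carry an order."""

    def __init__(self):
        # predicate -> list of (value, order); order is None if unordered.
        self._values = defaultdict(list)

    def add(self, predicate, value, order=None):
        self._values[predicate].append((value, order))

    def values(self, predicate):
        """Order-unaware view: just the set of values."""
        return {value for value, _ in self._values[predicate]}

    def ordered_values(self, predicate):
        """Order-aware view: ordered values first (ascending order),
        then unordered values in the sequence they were added."""
        entries = self._values[predicate]
        ordered = sorted(
            (e for e in entries if e[1] is not None), key=lambda e: e[1]
        )
        unordered = [e for e in entries if e[1] is None]
        return [value for value, _ in ordered + unordered]


person = Resource()
person.add("ex:email", "work@example.org", order=1)
person.add("ex:email", "home@example.org", order=0)

# A consumer that ignores order still sees both addresses.
print(person.values("ex:email") == {"work@example.org", "home@example.org"})
# A consumer that asks gets the primary address first.
print(person.ordered_values("ex:email"))
```

The point of the design is that the ontology does not have to choose 
between a single-value property and a list-valued one up front.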

Best,

Garret

Received on Thursday, 18 October 2007 18:31:21 UTC