Re: vCard confusion and RDF insufficiency from Tim Berners-Lee on 2007-07-26 (semantic-web@w3.org from July 2007)

From: Tim Berners-Lee <timbl@w3.org>
Date: Thu, 26 Jul 2007 16:21:30 -0400
To: Garret Wilson <garret@globalmentor.com>
Cc: Semantic Web <semantic-web@w3.org>
Message-Id: <0EF40C9D-CAA9-4F9A-9DAD-211286A92738@w3.org>
On 2007-07-26, at 11:29, Garret Wilson wrote:

>
> Just when I thought I had introduced a nice compromise to help get  
> vCard RDF wrapped up, and after delivering an updated spec to  
> Harry, all hell breaks loose when I point out (just trying to be  
> completely open about the pluses and minuses of everything!) that  
> lists have problems with literals in RDF+XML. May I point out just  
> a few things:
>
>    * We're using RDF because we think it's a good way to model
>      information about the world.
>    * Names are very common things in the world, but they are also very,
>      very, very simple things. If you think modeling names is hard, try
>      modeling the concept of a "contract" in common law countries or
>      even in civil law countries. But contracts and the like are things
>      that we want to model in RDF. If you can't model a little old name
>      in RDF, you're screwed.

Actually, names are complex because they have a history of  
development in the oral and written traditions of different cultures  
for thousands of years.

Similarly, one would expect modeling time to be totally simple, as it  
just starts at some zero -- pick one -- and goes forward in seconds,  
when in fact people need to be able to express "the first Sunday  
after Easter" and stuff.

If we were starting to *design* names and time, from scratch, it  
would be a breeze.

>    * It's really difficult to model a little old name in RDF.

Well, a "little old name" is very complex as it is. But with the  
vCard ontology, we are trying to come up with an ontology which will  
round trip with another standard (or several) which have already been  
hacked as approximations to an attempt to model the 'little' old name.

>    * The uproar and confusion over vCard names has to do with a simple
>      property of names: name components may be single entities or there
>      may be several of them, with order important. This is not a unique
>      situation---perhaps it's the rule with real-world data, not the
>      exception. But everybody agrees about the real-world data
>      here---the problem is with its representation in RDF, and (with
>      the most recent discussion) its serialization in XML (i.e. you
>      can't easily represent literals in a list). But take a step back
>      and see what we're doing: we're not arguing about the real-world
>      data, we're arguing because our modeling language of choice makes
>      us jump through all sorts of hoops to model that real-world data.
>      The teeny, tiny very, very, very simple real-world data.

Agreed. It is totally a bug in RDF/XML, of which I only became aware  
recently, that you can't serialize lists of literals in it.  I use  
them a lot in N3, it seems.  We could think of filing it as a bug to  
be fixed in due course in RDF/XML.
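
To make the list structure concrete, here is a minimal Python sketch of how
an N3 list such as ("Mary" "Joe") expands into a chain of rdf:first and
rdf:rest triples ending in rdf:nil. The function name and the blank node
labels (_:b0, _:b1) are invented for illustration; triples are shown as plain
tuples rather than through any particular RDF library.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def collection_triples(items):
    """Expand a list of literals into rdf:first/rdf:rest triples.

    Returns (head, triples). Blank node labels are hypothetical and
    only need to be distinct from one another.
    """
    if not items:
        # The empty list is just rdf:nil, with no triples at all.
        return RDF + "nil", []
    labels = [f"_:b{i}" for i in range(len(items))]
    triples = []
    for i, (node, item) in enumerate(zip(labels, items)):
        # Each cell points at its item and at the next cell (or rdf:nil).
        next_node = labels[i + 1] if i + 1 < len(labels) else RDF + "nil"
        triples.append((node, RDF + "first", item))
        triples.append((node, RDF + "rest", next_node))
    return labels[0], triples


head, triples = collection_triples(["Mary", "Joe"])
for triple in triples:
    print(triple)
```

The literals sit in the object position of rdf:first, which is fine in the
RDF model itself; the difficulty discussed here is only with the abbreviated
collection syntax of RDF/XML.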

But mind you, a problem also is of course that we want to be able to  
say in a huge number of cases just

dc:creator [ vc:firstName "Mark"; vc:secondName "Twain"]

or      :Joe foaf:knows [ foaf:name "Jim K. Oskeraro"]



> The issue here is not just that restricting cardinality of all  
> properties will be hard for Garret converting legacy vCard data.  
> It's that restricting the cardinality of all properties doesn't  
> model the (very, very, very simple) real-world data very well, and  
> the only reason we're considering restricting cardinality is that  
> we have to jump through all sorts of ridiculous hoops just to model  
> something that can either be a single object or a list; or a list  
> of literals. That's an inherent problem of the modeling language  
> we're using. Doesn't that worry you? It worries me.

It doesn't worry me that we can't model something which is either a  
single thing or a list well, because that is not what this is.

It is a list of things which in the vast majority of cases has one or  
zero elements.  So we want a nice syntax for that.
vCard, like RFC822 headers, uses a separator, so you can add commas.
The problem with N3 is that you have to have the list

	  :MJB vc:firstName ("Mary" "Joe").
	  :EB   vc:firstName "Ethel".

would be bad modeling.  And I think having a language with the  
assumption that every "foo" is in fact a singleton list ("foo") is  
also not wise.
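
As a rough illustration of the separator approach, the sketch below splits
one vCard name component on unescaped commas, so the result is always a list
that usually has zero or one element. The function name is hypothetical, and
it assumes vCard 3.0 style backslash escaping, where "\," is a literal comma;
other escapes such as "\n" are not interpreted.

```python
def split_vcard_values(component):
    """Split one vCard component into its comma-separated values.

    A backslash escapes the next character, so "\\," stays inside a
    value instead of separating two values. Empty values are dropped.
    """
    values, current, escaped = [], [], False
    for ch in component:
        if escaped:
            # Keep the escaped character literally (sketch only).
            current.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == ",":
            values.append("".join(current))
            current = []
        else:
            current.append(ch)
    values.append("".join(current))
    return [v for v in values if v]


print(split_vcard_values("Mary,Joe"))
print(split_vcard_values("Ethel"))
print(split_vcard_values(""))
```

Every component comes back in the same shape, whether it held several
values, one, or none, which is the uniformity the mixed N3 forms above lack.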

> This is nothing new. When I was co-chair of the Open eBook  
> Publication Structure 2.0 effort circa 2001, I spent untold effort  
> trying to convince the group to switch to RDF for representation of  
> its package format. Renato Iannella and Patrick Stickler will vouch  
> for me on this---one of my last emails to Renato before the whole  
> effort stalled was extolling the virtues of using a semantic  
> modeling language such as RDF over a syntax-only format such as  
> XML. But even with RDF, there were major problems for a packaging  
> language. Take externally declaring stylesheets of an XML document,  
> for example. With the resulting XPackage format (which made it into  
> the RDF Primer), I finally settled on having a single x:style  
> property (see http://www.xpackage.org/specification/#x:style ), but  
> the same problem crops up: XML documents may have multiple styles,  
> but they may be ordered (by priority) as well! So should we make  
> x:style have a range of rdf:List? Why not give everything a range  
> of rdf:List?

(Actually, I think RDF misses sets.   A vast amount of data in XML is  
presented as ordered when it is not.  Sets are very common in data.   
I wish RDF had a simple syntax for a set.   I don't think one should  
use an ordered set to represent an unordered one or vice versa.   
Their notions of equality are very different.  Yosi Scharf added sets  
to cwm  ($  "a" "b" "c" $) as an experimental thing, with an owl  
semantics of  [ owl:oneOf ( "a" "b" "c" )] but the logical  
implications are not trivial. But I digress.)
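
The difference in equality can be seen with ordinary Python values. This is
only an analogy using Python lists and frozensets, not cwm's experimental set
syntax or its OWL semantics.

```python
# Two collections with the same members in a different order.
ordered_a = ["a", "b", "c"]
ordered_b = ["c", "b", "a"]

# As ordered lists they differ, because position is part of identity.
lists_equal = ordered_a == ordered_b

# As sets they are the same, because only membership counts.
sets_equal = frozenset(ordered_a) == frozenset(ordered_b)

# Duplicates also collapse in a set but not in a list.
duplicates_collapse = frozenset(["a", "a", "b"]) == frozenset(["a", "b"])

print(lists_equal, sets_equal, duplicates_collapse)
```

Using one kind of collection to stand in for the other therefore changes
which descriptions count as saying the same thing.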

> Consider an ontology for representing test questions such as my  
> MAQRO. A particular question may provide a hint. It may provide  
> multiple hints. These hints may have an order (the most opaque  
> hints first, for example). I broke down and made maqro:hints take a  
> range of rdf:List for exactly the issue we're dealing with (see  
> http://www.maqro.org/specification/#maqro:Question ), even though  
> in most cases a question may only have one hint and rdf:List is  
> overkill. And representing question text as literals? I created an  
> entire class, maqro:Dialogue, just to use the rdf:value hack to get  
> literals into a list. (see  
> http://www.maqro.org/specification/#maqro:Dialogue ).

Sigh.

> The problems we're having with single/multiple cardinality and  
> order; and representing literals in a list; are nothing new---they  
> are problems inherent in RDF. They rear their heads in very, very,  
> simple ontologies. And they aren't going away. That worries me  
> greatly.
>
> Garret
Received on Thursday, 26 July 2007 20:21:36 UTC