vCard confusion and RDF insufficiency from Garret Wilson on 2007-07-26 (semantic-web@w3.org from July 2007)

From: Garret Wilson <garret@globalmentor.com>
Date: Thu, 26 Jul 2007 08:29:47 -0700
To: Semantic Web <semantic-web@w3.org>
Message-ID: <46A8BDEB.4010402@globalmentor.com>

Just when I thought I had introduced a nice compromise to help get vCard 
RDF wrapped up, and after delivering an updated spec to Harry, all hell 
breaks loose when I point out (just trying to be completely open about 
the pluses and minuses of everything!) that lists have problems with 
literals in RDF/XML. May I point out just a few things:

    * We're using RDF because we think it's a good way to model
      information about the world.
    * Names are very common things in the world, but they are also very,
      very, very simple things. If you think modeling names is hard, try
      modeling the concept of a "contract" in common law countries or
      even in civil law countries. But contracts and the like are things
      that we want to model in RDF. If you can't model a little old name
      in RDF, you're screwed.
    * It's really difficult to model a little old name in RDF.
    * The uproar and confusion over vCard names has to do with a simple
      property of names: name components may be single entities or there
      may be several of them, with order important. This is not a unique
      situation---perhaps it's the rule with real-world data, not the
      exception. But everybody agrees about the real-world data
      here---the problem is with its representation in RDF, and (with
      the most recent discussion) its serialization in XML (i.e. you
      can't easily represent literals in a list). But take a step back
      and see what we're doing: we're not arguing about the real-world
      data, we're arguing because our modeling language of choice makes
      us jump through all sorts of hoops to model that real-world data.
      The teeny, tiny very, very, very simple real-world data.

The issue here is not just that restricting cardinality of all 
properties will be hard for Garret converting legacy vCard data. It's 
that restricting the cardinality of all properties doesn't model the 
(very, very, very simple) real-world data very well, and the only reason 
we're considering restricting cardinality is that we have to jump 
through all sorts of ridiculous hoops just to model something that can 
be either a single object or a list, or to model a list of literals. That's an 
inherent problem of the modeling language we're using. Doesn't that 
worry you? It worries me.
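
To make the overhead concrete, here is a minimal Python sketch that counts the triples needed for one literal value versus an ordered rdf:List of literals. It is only an illustration: triples are plain tuples, the http://example.org/ns# namespace and the givenName property are hypothetical stand-ins (not names from the vCard RDF draft), and the blank node labels are made up.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/ns#"  # hypothetical namespace, for illustration only


def single_value(subject, prop, value):
    """A single value costs exactly one triple."""
    return [(subject, prop, value)]


def ordered_values(subject, prop, values):
    """Encode values as an rdf:List: one list cell and two triples per item.

    Plain Python strings stand in for both IRIs and literals here; a real
    RDF library would keep the two apart.
    """
    if not values:
        return [(subject, prop, RDF + "nil")]
    cells = [f"_:b{i}" for i in range(len(values))]
    triples = [(subject, prop, cells[0])]
    for i, value in enumerate(values):
        rest = cells[i + 1] if i + 1 < len(cells) else RDF + "nil"
        triples.append((cells[i], RDF + "first", value))
        triples.append((cells[i], RDF + "rest", rest))
    return triples


one = single_value("_:name", EX + "givenName", "Mary")
many = ordered_values("_:name", EX + "givenName", ["Mary", "Jane"])
print(len(one), len(many))
```

In this sketch the two shapes share nothing: a consumer that expects a literal as the property value has to be written differently from one that expects a list head, which is the single-versus-list hoop described above.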

This is nothing new. When I was co-chair of the Open eBook Publication 
Structure 2.0 effort circa 2001, I spent untold effort trying to 
convince the group to switch to RDF for representation of its package 
format. Renato Iannella and Patrick Stickler will vouch for me on 
this---one of my last emails to Renato before the whole effort stalled 
was extolling the virtues of using a semantic modeling language such as 
RDF over a syntax-only format such as XML. But even with RDF, there were 
major problems for a packaging language. Take externally declaring 
stylesheets of an XML document, for example. With the resulting XPackage 
format (which made it into the RDF Primer), I finally settled on having 
a single x:style property (see 
http://www.xpackage.org/specification/#x:style ), but the same problem 
crops up: XML documents may have multiple styles, and those styles may be 
ordered (by priority) as well! So should we make x:style have a range of 
rdf:List? Why not give everything a range of rdf:List?

Consider an ontology for representing test questions such as my MAQRO. A 
particular question may provide a hint. It may provide multiple hints. 
These hints may have an order (the most opaque hints first, for 
example). I broke down and made maqro:hints take a range of rdf:List 
for exactly the issue we're dealing with (see 
http://www.maqro.org/specification/#maqro:Question ), even though in 
most cases a question may only have one hint and rdf:List is overkill. 
And representing question text as literals? I created an entire class, 
maqro:Dialogue, just to use the rdf:value hack to get literals into a 
list (see http://www.maqro.org/specification/#maqro:Dialogue ).
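
As a rough sketch of what the rdf:value workaround costs, the Python below wraps each literal in its own node carrying rdf:value and then chains those nodes into an rdf:List. The ex: namespace, the hints property IRI, the blank node labels, and the sample hint texts are all hypothetical; this is not the actual MAQRO vocabulary, only the general shape of the pattern.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/ns#"  # hypothetical namespace, for illustration only


def wrapped_list(subject, prop, texts):
    """Build an rdf:List whose items are wrapper nodes holding rdf:value."""
    triples = []
    head = RDF + "nil"
    # Build back to front so each cell can point at the cell after it.
    for i in reversed(range(len(texts))):
        cell, item = f"_:cell{i}", f"_:item{i}"
        triples += [
            (item, RDF + "value", texts[i]),  # the literal lives on a wrapper
            (cell, RDF + "first", item),      # the list cell points at the wrapper
            (cell, RDF + "rest", head),
        ]
        head = cell
    triples.append((subject, prop, head))
    return triples


hints = wrapped_list(
    "_:q1", EX + "hints", ["Think of a metal.", "Its symbol is Fe."]
)
for triple in hints:
    print(triple)
```

Under these assumptions each hint costs three triples and two extra nodes, plus one triple to attach the list, where a bare literal property would cost one triple per hint.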

The problems we're having with single/multiple cardinality and order, and 
with representing literals in a list, are nothing new: they are problems 
inherent in RDF. They rear their heads in very, very simple ontologies. 
And they aren't going away. That worries me greatly.

Garret
Received on Thursday, 26 July 2007 15:30:53 UTC