yet another RDF vCard

Everyone,

I apologize for coming late to the RDF vCard discussion. I've actually 
been involved with both vCard and RDF for quite a while. Four years ago 
I attempted to get in touch with Renato Iannella (whom I had worked with 
when I was helping create the Open eBook format) to propose an update to 
his W3C note on the subject, but I was unable to establish contact. I'm 
now working on a large project which will release a private alpha 
version in a matter of days which needs to store vCard information in 
RDF. As currently no standard seemed sufficient, I wrote a new proposal 
which I had intended to publish on a developer site within a few weeks.

Today I discovered the http://www.w3.org/2006/vcard/ns effort, and I 
read it with a mixture of joy and disappointment. I believe there are 
serious deficiencies in the current state of the ontology, which I'll 
enumerate. (My flippancy in the discussion below is intended to be 
humor.) I'll also include a text version of the current draft of my RDF 
vCard proposal article.

First, there are a few nice innovations in 
http://www.w3.org/2006/vcard/ns , such as using property subtypes to 
take the place of the various type definitions in vCard. Using v:homeAdr 
more accurately reflects the intended semantics of the relationship than 
having different "types" of address objects.

On the other hand, though, the entire ontology has been so restricted as 
to become hardly usable in an international environment. A case in point 
is the v:Name class, which only allows a single instance of each name 
component. The names-in-FOAF discussion has already pointed to the 
Dublin Core analysis ( 
http://dublincore.org/documents/name-representation/ ) of the rich 
variety of naming conventions around the world. The current 
http://www.w3.org/2006/vcard/ns proposal could hardly handle the name of 
His Majesty King Abdullah II bin Al Hussein of Jordan. Sure, I know that 
we could take a couple of those name components, separate them with 
commas, and stick them into one of the existing v:Name components, but 
if we're going to do that why don't we do away with the v:Name class 
altogether and stick a literal vCard string into one component, like 
this: <v:n>Stevenson;John;Philip,Paul;Dr.;Jr.,M.D.,A.C.P.</v:n>?

I can imagine why http://www.w3.org/2006/vcard/ns is so restricted with 
all these 1:1 relationships: you'd like to specify value constraints, 
and OWL can't handle much more than 1:1 relationships. That, however, is 
a problem with the current state of OWL, and you're never going to get 
away from it. There are a lot of modeling deficiencies in RDF/OWL (Why 
can't literals be subjects? Why can't literals have properties? Why 
can't I constrain a value to be a list containing only certain types?), 
but the solution is not to try to conform the world to match your 
modeling framework---the modeling framework should be improved to be 
able to model the world.

The proposal I outline below isn't quite finished---I haven't addressed 
mapping types to RDF datatypes, and I want to include the elegant 
property subtype system used in http://www.w3.org/2006/vcard/ns ----but 
as far as data structure goes, it is consistent and covers virtually all 
the combinations available in the original vCard RFC. Sure, you'll have 
to use some programming logic to process the data, and pure OWL isn't 
going to be able to express the constraints presented in the proposal, 
but at some point you're going to have to update OWL to include such 
common, real-world relationships instead of trying to get King Abdullah 
to change his name to "John Smith".

I'm guessing that HTML attachments aren't welcomed on the list, so I'm 
including the text of my proposal below; unfortunately that means that 
the reference links are lost. I welcome any feedback, and if the current 
http://www.w3.org/2006/vcard/ns proposal can be improved to accommodate 
the needs I've raised, I'd love to help create some specification that 
will work for everyone. I do have an application that needs to settle on 
an RDF vCard within a few days, but at the same time I want to produce 
interoperable data.

I hope to hear your comments.

Best,

Garret


vCard in RDF: Problems and Resolutions

By Garret Wilson

Version 2007-03-26

Lead

A proposal for RDF representation of vCard, with implications for the 
RDF Calendar project and names in FOAF.

Abstract

Metadata, Names, and the vCard

Metadata is not a new concept, and the idea of a name of a thing is one 
universal and very personal example of a metadata property that has 
existed since language began. The vCard, defined in RFC 2426, vCard MIME 
Directory Profile, is one standard method of storing and transferring 
metadata about a person, including that individual's name, address, and 
contact information. The vCard standard was defined early on in the 
Internet revolution, long before the release of the first RDF working 
draft, and it is still common to see vCards attached to emails, 
downloaded from web sites, and used as a transfer mechanism between 
personal information managers such as Microsoft Outlook.

vCard was formulated before XML existed, and opts for its own simple 
text-based serialization. An example vCard appears below:

begin:vcard
version:3.0
fn:Dr. Mary Ann May Smith, Jr., M.D.
n:Smith;Mary;Ann,May;Dr.;Jr.,M.D.
n;language=es:Smith;Maria;Ann,May;Dr.;Jr.,M.D.
org:Example Corporation
adr;type=work,postal,parcel:;;123 Some Street;Springfield;NY;12345;USA
tel;type=voice,msg,work:+1-123-555-1212
email;type=internet,pref:jsmith@example.com
email;type=internet:jsmith@example.org
url:http://www.example.com/~jsmith
end:vcard

A single line from the above example will help to better understand the 
general format of a vCard:

type;optionalParameters:single;value;or;structured;value

n;language=es:Smith;Maria;Ann,May;Dr.;Jr.,M.D.

As can be seen, each line of a vCard begins with a type, followed by an 
optional set of parameter=value pairs. In the example above, the type n 
indicates that this is a vCard name definition. The optional parameter 
language indicates by es that this is the Spanish form of the name. 
After that appears the value; for the n type, the value is a structured 
value separated by semicolons, with the components indicating family 
name, given name, additional names, honorific prefixes, and honorific 
suffixes, respectively.
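
The format just described can be sketched in a few lines of Python. This 
is only an illustration, not part of RFC 2426 or of this proposal: the 
function name parse_content_line is hypothetical, and the sketch ignores 
line folding, backslash escapes, and quoted parameter values. It also 
treats every value as structured, which suits types such as n and adr 
but not free-text types such as fn.

```python
def parse_content_line(line):
    """Split one vCard content line into (type, params, components)."""
    # The first colon separates the type and parameters from the value.
    head, _, value = line.partition(":")
    # Semicolons in the head separate the type name from each parameter.
    name, *raw_params = head.split(";")
    params = {}
    for raw in raw_params:
        key, _, val = raw.partition("=")
        # A parameter such as type=work,postal carries a comma-separated list.
        params[key.lower()] = val.split(",")
    # Semicolons separate structured-value components; commas separate
    # repeated values within one component.
    components = [part.split(",") for part in value.split(";")]
    return name.lower(), params, components


name, params, components = parse_content_line(
    "n;language=es:Smith;Maria;Ann,May;Dr.;Jr.,M.D.")
print(name)        # n
print(params)      # {'language': ['es']}
print(components)  # [['Smith'], ['Maria'], ['Ann', 'May'], ['Dr.'], ['Jr.', 'M.D.']]
```

Keeping each component as a list, even when it holds a single value, 
preserves the distinction between the semicolon and comma levels of the 
structured value.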

In today's world, with the popularity of the XML syntax, the semantic 
rigor of RDF, and interoperable ontologies such as those provided by 
Dublin Core and FOAF, creating a representation of vCard in RDF+XML would 
seem ideal and obvious. Indeed, in 2001 there was an effort to create 
just such a formulation; the result can be seen in a 2001 W3C Note, 
Representing vCard Objects in RDF/XML.

There are several problems that spring immediately from the existing RDF 
vCard specification, however:

* Most obviously, the specification uses all uppercase for RDF property 
names, resulting in vcard:ORG, for example. Admittedly aesthetic, this 
style is out of step with the modern RDF convention of representing 
property names in example:mixedCase and representing type names using 
example:InitialUppercase.
* Also out of step with modern practices, reflecting the early date at 
which the specification was created, the document opts for rdf:Seq 
and rdf:Bag to represent multiple property values, rather than using the 
newer rdf:List construct.
* Some properties have been invented out of necessity, but they are 
inconsistent. What is referred to as "Additional Names" in RFC 2426 
(vCard) has been changed to simply vcard:Other in the note, for example.
* Lastly, parts of the specification are simply broken. When explaining 
how to indicate that the type of a UID is a SSN, the specification 
indicates that a property should be given to an RDF literal: <vcard:UID 
vcard:TYPE="US-SSN">987-65-4320</vCard:UID>. Unfortunately, not only is 
this incorrect RDF+XML syntax, but RDF has no facility for assigning 
property values to plain literals. This suggestion by the note simply 
will not work, full stop.

vCard and Directory

To find a comprehensive, rigorous, yet elegant solution for representing 
vCard in RDF, it is beneficial to take a step back and find out the 
context in which vCard was created. vCard was not created in a vacuum. 
The title of RFC 2426, vCard MIME Directory Profile, gives a hint about 
its status in life. vCard is formulated as a profile of a more general 
text/directory MIME storage format defined in RFC 2425, A MIME 
Content-Type for Directory Information, much in the same way that the 
FOAF Vocabulary Specification is an ontology of RDF. Most of the 
syntactical requirements of each vCard content line, and even many 
semantic entities such as the language parameter type, are defined in RFC 
2425 (Directory) and only included by reference in RFC 2426 (vCard). 
Many individual semantic elements specific to RFC 2426 (vCard), such as 
the meaning of the N name components, are in turn derived from ITU-T 
X.520 and X.521.

The relationship of vCard to the Directory syntax and framework is 
especially interesting because of another similar format that some have 
attempted to convert to RDF+XML: RFC 2445, Internet Calendaring and 
Scheduling Core Object Specification (iCalendar). The iCalendar format 
has recently become popular by providing interoperability among such 
applications as Apple's iCal and Mozilla's Sunbird. Although RFC 2445 
(iCalendar) explicitly states that it is not technically a profile of 
RFC 2425 (Directory), it is "based upon the syntax" and "does reuse a 
number of elements" from that specification.

This close relationship among RFC 2425 (Directory), RFC 2426 (vCard), 
and RFC 2445 (iCalendar) becomes significant when formulating an RDF 
version of vCard. Some parts of an RDF vCard vocabulary might be better 
generalized in light of its relationship to a more general framework, 
Directory. Furthermore, the experience of efforts to create an RDF 
version of iCalendar, such as represented by the W3C's RDF Calendar 
Workspace, may have bearing on attempts to do the same with vCard and 
vice versa. (One obvious consideration that can be mentioned already is 
that the W3C RDF Calendar effort uses lowercase names for iCalendar RDF 
properties.)

The FOAF Name Problem

Related to this discussion is an ontological design issue that has been 
raised by the Friend of a Friend (FOAF) Project, which has a goal of 
"creating a Web of machine-readable pages describing people, the links 
between them and the things they create and do." The aforementioned FOAF 
Vocabulary Specification defines an RDF ontology that includes such 
types as foaf:Person and foaf:Organization. Also included are properties 
relevant to humans, such as foaf:name, foaf:mbox, foaf:homepage, and 
even foaf:myersBriggs and foaf:dnaChecksum (the latter of which is 
admitted to be "mostly a joke" to reiterate the purpose of FOAF in 
describing people).

FOAF obviously covers some of the same metadata ground as does vCard, 
especially with its foaf:name property. Also included are other 
name-related properties such as foaf:firstName, foaf:givenname, 
foaf:family_name, and foaf:surname. It has been noted 
that the syntax and semantics of these names seem haphazard, incomplete, 
and conflicting all at the same time. This FOAF name vocabulary issue 
has been under discussion since at least as far back as 2000, with no 
resolution yet reported and no progress apparent for the past few years 
as of 2007. One proposal, Names in Foaf, tries to combine the "adopt 
existing guidelines" and "roll your own" approaches by creating a single 
foaf:sortName property, the value of which is an ordered list of RDF 
classes, each representing a component of the name, such as in this example:

<foaf:Person>
  <foaf:sortName xml:lang="es" rdf:parseType="Resource">
    <rdf:li><foaf:FamilyName rdf:value="Smith"/></rdf:li>
    <rdf:li><foaf:GivenName rdf:value="Maria"/></rdf:li>
    <rdf:li><foaf:GivenName rdf:value="Ann"/></rdf:li>
    <rdf:li><foaf:GivenName rdf:value="May"/></rdf:li>
    <rdf:li><foaf:HonorificTitle rdf:value="Jr."/></rdf:li>
    <rdf:li><foaf:HonorificTitle rdf:value="M.D."/></rdf:li>
  </foaf:sortName>
</foaf:Person>

While this proposal makes no explicit reference to RFC 2426 (vCard) and 
its predecessor specifications, the similarity is striking. There are 
some detractions from this proposal, though:

* On the face of it, it's convoluted. The actual name components are 
indicated not by property, but by class name. Furthermore, their 
relationship to the resource is that of an rdf:li property of a blank 
node value of the foaf:sortName property.
* Using the rdf:li property, here again, is outdated and, as used, not 
defined by the standard. Apparently the blank node value of 
foaf:sortName is being used as a sort of un-typed rdf:Seq, even though 
this is not made explicit. A more modern approach would be to use an 
rdf:parseType of Collection to create an ordered rdf:List, but the other 
issues listed here would still remain.
* Although the similarity to an existing standard, vCard, is obvious, 
there is no defined relationship that would make conversion between 
frameworks straightforward as well as provide insight useful in other 
projects such as W3C's RDF Calendar project.

As a reaction to the complexity of this approach, another suggestion, 
Names in Foaf - An alternate proposal, has been to do away with all 
properties except foaf:name. In their place would appear a small set of 
properties (foaf:familiarName, foaf:informalName, foaf:formalName, and 
foaf:fullName) reflecting the use of names in different situations. These 
new properties would abandon any semantic identification of name parts, 
making it difficult to machine-process identification subcomponents as 
well as transfer data round-trip from existing standards.

A useful solution to representing human names in RDF should be as simple 
as possible. However, it should not be so simple as to lose semantics 
when transferring data from existing formats. Basing a solution on an 
existing standard would be ideal, but the transformation to RDF should 
be based upon a set of rules that would provide consistency as well as 
completeness, allowing straightforward data round-tripping between 
representations. One solution that meets these criteria is presented here.

Directory, vCard, and iCalendar in RDF

Analogous to the Python script logic described in the W3C RDF 
Calendar document, it is possible to create a set of conceptual rules 
that can be used to transform the Directory, vCard, and iCalendar 
formats to RDF ontologies. We can start with the RDF ontology namespaces 
and recommended XML prefixes, which will each represent one of the 
profiles or pseudo-profiles of Directory, and will give an indication of 
how the work is to be divided up:

directory
    http://globalmentor.com/namespaces/directory#
vcard
    http://globalmentor.com/namespaces/directory/card#
icalendar
    http://globalmentor.com/namespaces/directory/calendar#
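
For readers who prefer code, the prefix table above might be used as 
follows. This is a minimal sketch in Python; the names NAMESPACES and 
expand are hypothetical helpers, not part of any specification.

```python
# Prefix-to-namespace table copied from the list above.
NAMESPACES = {
    "directory": "http://globalmentor.com/namespaces/directory#",
    "vcard": "http://globalmentor.com/namespaces/directory/card#",
    "icalendar": "http://globalmentor.com/namespaces/directory/calendar#",
}


def expand(qname):
    """Expand a prefixed name such as 'vcard:org' into a full URI."""
    prefix, _, local = qname.partition(":")
    return NAMESPACES[prefix] + local


print(expand("vcard:org"))
# http://globalmentor.com/namespaces/directory/card#org
```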

A set of guiding rules might then start as follows:

* Each Directory type name shall be converted to an RDF property name 
within the respective ontology, using the syntactical form 
example:mixedCase. e.g. The vCard ORG type produces the vcard:org property.
* Defined parameters for a type shall also appear as RDF properties of 
the property value resource, using the name form example:mixedCase. e.g. 
The Directory LANGUAGE parameter would be indicated by a 
directory:language property of a value resource such as vcard:Adr.
* If a particular Directory type calls for a literal value yet specifies 
parameters for that value, or if a literal value appears in some other 
circumstance in which an RDF resource is necessary (such as elements 
of an rdf:List), a blank node shall be used as the property value, with 
the literal value appearing as the value of the blank node's rdf:value 
property. e.g. The vCard ORG type, if language needs to be specified, 
would produce a vcard:org property with a blank node value that has 
both the directory:language property to indicate the language and also 
the rdf:value property to indicate the literal property value.
* If a particular Directory type specifies a structured value, an RDF 
class shall be used to contain the elements of that structured value with 
a name in the form example:InitialUppercase. e.g. The value of the vCard 
ADR type would be represented as a vcard:Adr class that is the value of 
a vcard:adr property.
* If a particular Directory type specifies a structured value, the 
components of which can be repeated but in which order is important, 
each structured value component shall be allowed to be represented 
either as a single value or as an rdf:List value. If there is no official 
name defined for a particular structured value subcomponent, one shall 
be created that is as close as possible to the name used in the 
specification in referring to that component, using the 
example:mixedCase format. e.g. The subcomponent "Additional Name" of the 
vCard N type would appear as a vcard:additionalName property of the 
vcard:N class which would appear as a value of the vcard:n property. The 
value of the vcard:additionalName property could be a single literal 
value, a blank node with its rdf:value set to the literal value (if, for 
instance, the directory:language needed to be specified for the value), 
or an rdf:List of blank nodes as just described (using an rdf:parseType 
of Collection in RDF+XML syntax, for example).

This produces the following RDF version of the name part of the vCard 
example introduced earlier, assuming that the properties are used to 
describe a foaf:Person:

<foaf:Person>
  <vcard:fn>Dr. Mary Ann May Smith, Jr., M.D.</vcard:fn>
  <vcard:n>
    <vcard:N>
      <vcard:familyName>Smith</vcard:familyName>
      <vcard:givenName>Mary</vcard:givenName>
      <vcard:additionalName rdf:parseType="Collection">
        <rdf:Description rdf:value="Ann"/>
        <rdf:Description rdf:value="May"/>
      </vcard:additionalName>
      <vcard:honoraryPrefix>Dr.</vcard:honoraryPrefix>
      <vcard:honorarySuffix rdf:parseType="Collection">
        <rdf:Description rdf:value="Jr."/>
        <rdf:Description rdf:value="M.D."/>
      </vcard:honorarySuffix>
    </vcard:N>
  </vcard:n>
  <vcard:n>
    <vcard:N>
      <directory:language>es</directory:language>
      <vcard:familyName>Smith</vcard:familyName>
      <vcard:givenName>Maria</vcard:givenName>
      <vcard:additionalName rdf:parseType="Collection">
        <rdf:Description rdf:value="Ann"/>
        <rdf:Description rdf:value="May"/>
      </vcard:additionalName>
      <vcard:honoraryPrefix>Dr.</vcard:honoraryPrefix>
      <vcard:honorarySuffix rdf:parseType="Collection">
        <rdf:Description rdf:value="Jr."/>
        <rdf:Description rdf:value="M.D."/>
      </vcard:honorarySuffix>
    </vcard:N>
  </vcard:n>
</foaf:Person>
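
As a rough illustration of how a program might apply these rules to the 
n type, the following Python sketch builds a counterpart of the 
Spanish-language name above with the standard xml.etree.ElementTree 
module. The function name name_to_rdf and the N_PROPERTIES table are 
hypothetical; the sketch assumes the recommended vcard prefix, places the 
language in the directory:language property named in the rules, and takes 
the name components as already split into lists.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VCARD = "http://globalmentor.com/namespaces/directory/card#"
DIRECTORY = "http://globalmentor.com/namespaces/directory#"
for prefix, uri in (("rdf", RDF), ("vcard", VCARD), ("directory", DIRECTORY)):
    ET.register_namespace(prefix, uri)

# Property names for the five components of the N type, in RFC 2426 order,
# spelled as in the example above.
N_PROPERTIES = ("familyName", "givenName", "additionalName",
                "honoraryPrefix", "honorarySuffix")


def name_to_rdf(components, language=None):
    """Build a vcard:n property element from parsed N components.

    components is a list of lists, one inner list per N component.
    """
    n_property = ET.Element(f"{{{VCARD}}}n")
    n_class = ET.SubElement(n_property, f"{{{VCARD}}}N")
    if language is not None:
        ET.SubElement(n_class, f"{{{DIRECTORY}}}language").text = language
    for prop, values in zip(N_PROPERTIES, components):
        values = [v for v in values if v]  # skip empty components
        if not values:
            continue
        element = ET.SubElement(n_class, f"{{{VCARD}}}{prop}")
        if len(values) == 1:
            element.text = values[0]  # a single value stays a plain literal
        else:
            # Repeated values become an rdf:List of blank nodes, each
            # carrying its literal in rdf:value.
            element.set(f"{{{RDF}}}parseType", "Collection")
            for value in values:
                ET.SubElement(element, f"{{{RDF}}}Description",
                              {f"{{{RDF}}}value": value})
    return n_property


n = name_to_rdf(
    [["Smith"], ["Maria"], ["Ann", "May"], ["Dr."], ["Jr.", "M.D."]],
    language="es")
print(ET.tostring(n, encoding="unicode"))
```

The sketch chooses between a plain literal and an rdf:List purely by 
counting values, which is one possible reading of the last rule; a real 
converter might always emit a list for components that are allowed to 
repeat.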

Received on Monday, 26 March 2007 19:16:42 UTC