Re: a guy with 5 first names, from I18N comments on P3P, for vCard/RDF

From: Hamish Harvey <hamish@hamishharvey.com>
Date: Sun, 19 Nov 2006 13:38:43 +0000
Message-ID: <8f9aaf260611190538i5dbb3f4l75b272c9cb90a91b@mail.gmail.com>
To: "Arjohn Kampman" <arjohn.kampman@aduna-software.com>
Cc: "Dan Connolly" <connolly@w3.org>, "Ivan Herman" <ivan@w3.org>, semantic-web@w3.org

On 17/11/06, Arjohn Kampman <arjohn.kampman@aduna-software.com> wrote:

> And in case you're looking for name parsing challenges, I'd suggest you
> have a look at the following page :-)
>
>    http://en.wikipedia.org/wiki/Jan_Vennegoor_of_Hesselink

Being able to parse that automatically might be a challenge. By
looking at that page, though, a human could find enough information to
decide that his surname is "Vennegoor of Hesselink". They could then
encode it that way using the fields of even a parochially English
data model for names, and expect the display name to be reconstructed
correctly. It's essentially similar to a "double-barrelled"
surname.

It seems likely to me that one could never develop a data model and
processing software for names which did not fail with some names. One
could certainly never be sure one had done so since, as every good
Popperian knows, only a single counterexample would ever be needed to
prove you wrong. So perhaps part of the key is in having a flexible
enough system of manual overrides. For example: given values for a set
of fields, we can construct default display and sort forms of a name,
but we can also provide these explicitly, in which case the explicitly
provided version is used.
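
A minimal sketch of that override idea in Python. The class and field
names (PersonName, given, family, display_override, sort_override) are
hypothetical and not taken from vCard/RDF or any other vocabulary, and
the default "given family" / "family, given" rules are just one
parochially English assumption:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonName:
    # Hypothetical fields, for illustration only.
    given: str
    family: str
    # Explicit forms; when set, they win over the constructed defaults.
    display_override: Optional[str] = None
    sort_override: Optional[str] = None

    def display_form(self) -> str:
        # Use the explicitly provided display name if there is one.
        if self.display_override is not None:
            return self.display_override
        # Default rule (an assumption): given name, then family name.
        return f"{self.given} {self.family}"

    def sort_form(self) -> str:
        # Use the explicitly provided sort key if there is one.
        if self.sort_override is not None:
            return self.sort_override
        # Default rule (an assumption): family name, comma, given name.
        return f"{self.family}, {self.given}"


# The multi-word surname is stored whole, as decided by a human.
jan = PersonName(given="Jan", family="Vennegoor of Hesselink")

# A placeholder name whose default display order would be wrong,
# corrected by hand rather than by smarter code.
family_first = PersonName(
    given="Given", family="Family", display_override="Family Given"
)
```

Here jan.display_form() falls back to the constructed default, while
family_first.display_form() returns the hand-supplied string unchanged.
The defaults only have to be right most of the time; the overrides
absorb the counterexamples.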

Even coming close to universality will surely require a load of
"cultural modules" in the ontology and the processing code.

Cheers,
Hamish

-- 
Hamish Harvey
Research Associate, School of Civil Engineering and Geosciences,
Newcastle University
Received on Sunday, 19 November 2006 13:38:56 GMT