RE: Why Literals should be unique and why this is a serious issue

Hi All,

 

Is there any compelling reason why this W3C(!) forum works in plain text
rather than HTML? Plain text easily plays havoc with RDF or OWL listings.
With messages like the one below it becomes a mess, no matter how much care
is taken to create something readable within these narrow margins.

 

Regards,

Hans

 

PS I have cleaned up the mess in HTML, but that won't help much, I am
afraid.

 

-----Original Message-----
From: semantic-web-request@w3.org [mailto:semantic-web-request@w3.org] On
Behalf Of Hans Teijgeler
Sent: Saturday, November 19, 2005 12:06 PM
To: 'Andreas Andreakis'
Cc: 'semantic-web at W3C'; 'Richard Newman'; Paap, Onno; Price, David
Subject: RE: Why Literals should be unique and why this is a serious issue

 

 

Hi Andreas,

 

Here is a contribution from a field that cannot afford to be "lazy", as you
put it: lifecycle information integration for facilities. Our work entails
setting up "confederations" of MANY triple stores belonging to the systems,
groups and companies involved in that life cycle.

 

What we do is:

*   each resource gets a unique "SystemID" (the ID allocated to a resource
within your system, like a primary key in an RDBMS)

*   that SystemID stays with the resource forever (a kind of "resource DNA")

*   since that SystemID is prefixed with the URI of that system, the
combination is unique on the Internet

*   names like "Tiger Woods" are no good substitute for this DNA, because
people can (and do) change names in their lifetime (this also applies to the
somewhat strange habit of identifying a person by his/her e-mail address)
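
A minimal sketch of the scheme in the list above, in Python. The system URI,
the Resource class and the mint_uri helper are assumptions made up for
illustration; only the idea (a permanent SystemID prefixed with the URI of
the system) comes from the list.

```python
from dataclasses import dataclass

# Hypothetical URI of one system in the confederation (an assumption).
SYSTEM_URI = "http://example.org/plant-a#"


def mint_uri(system_uri: str, system_id: str) -> str:
    """Prefix the local SystemID with the URI of the system that issued it."""
    return system_uri + system_id


@dataclass
class Resource:
    system_id: str  # allocated once, like a primary key, and never changed
    name: str       # descriptive only; may change during the lifetime

    @property
    def uri(self) -> str:
        return mint_uri(SYSTEM_URI, self.system_id)


person = Resource(system_id="PERS-000123", name="Jane Smith")
uri_before = person.uri

person.name = "Jane Jones"  # the name changes ...
uri_after = person.uri      # ... but the identifier does not

print(uri_before == uri_after)
```

The point of the sketch is that identity hangs on the SystemID alone, so a
changed name (or e-mail address) never changes what the URI refers to.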

 

About Literals the following:

*   Literals are, from a modelling point of view, classes. Any Literal class
has zillions of members (you look at some of them)

*   That's why we model them as the owl:Class "XmlSchemaLiteral" with
subclasses for each datatype (e.g. "XmlSchemaString"), and sub-subclasses for
each particular string, integer, etc. They have a Property "content". That
content has the actual value expressed in rdf:datatype terms

*   The advantage of this approach is that you can easily define translations
between any two of such classes, and you have to do it only once for each
pair in a certain context

*   This approach obviously creates an overhead, but when you take the
global Semantic Web (not just a US/UK English one) seriously, then such
translations are important

 

An example of this in OWL Full (the prefix XSST is an acronym for the class
type (here: XmlSchemaSTring)):

 

<owl:Class rdf:ID="XSST-487832">
    <rdfs:subClassOf rdf:resource="http://www.15926.org/dm#XmlSchemaString"/>
    <rdf:type rdf:resource="http://www.15926.org/rd#LANG-347001"/>
    <dm:content rdf:datatype="http://www.w3.org/2001/XMLSchema#string">pump</dm:content>
</owl:Class>

 

<owl:Class rdf:ID="XSST-548388">
    <rdfs:subClassOf rdf:resource="http://www.15926.org/dm#XmlSchemaString"/>
    <rdf:type rdf:resource="http://www.15926.org/rd#LANG-347012"/>
    <dm:content rdf:datatype="http://www.w3.org/2001/XMLSchema#string">bomba</dm:content>
</owl:Class>

 

where LANG-347001 is defined as "English" and LANG-347012 as "Italian". 

A Property "translatedTo" does the rest. 
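
How a "translatedTo" lookup between two such literal classes might work can
be sketched in Python. The LiteralClass type, the translated_to table and the
translate function are hypothetical; the class IDs, language IDs and contents
are the ones from the listing above. The message does not say whether
"translatedTo" is symmetric, so this sketch treats it as directed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LiteralClass:
    class_id: str  # e.g. "XSST-487832"
    language: str  # e.g. "LANG-347001" (English)
    content: str   # the actual string value


pump = LiteralClass("XSST-487832", "LANG-347001", "pump")    # English
bomba = LiteralClass("XSST-548388", "LANG-347012", "bomba")  # Italian

# One translatedTo statement per pair, keyed by the source class ID.
translated_to = {pump.class_id: [bomba]}


def translate(literal: LiteralClass, target_language: str) -> Optional[str]:
    """Follow translatedTo from `literal` to a class in the target language."""
    for candidate in translated_to.get(literal.class_id, []):
        if candidate.language == target_language:
            return candidate.content
    return None  # no translation has been defined for this pair


print(translate(pump, "LANG-347012"))
```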

 

In case we want to define the context we use our "templates", which are
standard n-ary relations.

 

Regards,

Hans

 

_______________________ 

Hans Teijgeler

ISO 15926 specialist

www.InfowebML.ws

hans.teijgeler@quicknet.nl

phone +31-72-509 2005      

________________________________________

-----Original Message-----
From: semantic-web-request@w3.org [mailto:semantic-web-request@w3.org] On
Behalf Of Andreas Andreakis
Sent: Saturday, November 19, 2005 11:20 AM
To: Richard Newman
Cc: semantic-web at W3C
Subject: Re: Why Literals should be unique and why this is a serious issue

 

 

This is a good example.

The scenario you describe is a matter of modelling and is in the end related
to resource identification. For instance, if you describe an ontology with
persons having the same name, you simply have to add more inverse functional
properties (in OWL) to identify persons. And don't forget that, for resource
identification, it is a prerequisite to assume a specific class and not only
literals, since literals by themselves cannot say anything detailed about the
resource they describe. What does "Tiger" mean? An OS, an animal or some
other kind of product? Or what does "David Green" alone mean? A company name
or a person's name?

The FOAF ontology, for instance, uses a combination of 2-3 inverse functional
properties to identify persons, and the e-mail address is one of those.
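
The effect of treating a property as inverse functional can be illustrated
with a small Python sketch. The records, the example.org mailboxes and the
merge_by_ifp helper are invented for illustration, and a real OWL reasoner
does much more than this grouping. The sketch only shows why a name is a poor
identifying property while a mailbox works better.

```python
def merge_by_ifp(records, ifp):
    """Group records that share a value for the property treated as an IFP.

    Records in the same group are taken to describe the same individual.
    """
    groups = {}
    for record in records:
        groups.setdefault(record[ifp], []).append(record)
    return list(groups.values())


# Two different people named David Green; the first also appears as "D. Green".
records = [
    {"name": "David Green", "mbox": "mailto:dg1@example.org"},
    {"name": "David Green", "mbox": "mailto:dg2@example.org"},
    {"name": "D. Green", "mbox": "mailto:dg1@example.org"},
]

by_name = merge_by_ifp(records, "name")  # wrongly merges the two David Greens
by_mbox = merge_by_ifp(records, "mbox")  # merges only records sharing a mailbox

for group in by_mbox:
    print(sorted(r["name"] for r in group))
```

Grouping by name puts two distinct mailboxes into one "individual", while
grouping by mailbox keeps the two people apart and still recognises
"D. Green" as the first David Green.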

 

 

But anyway, we have still not solved the duplication problem, and talking
around it will not bring us forward. So I ask again, and I'm really looking
forward to suggestions from you guys.

How can we prevent this?

People are lazy and will not search to see whether others have created
something similar. Higher levels of abstraction can prevent duplication, but
we need a unified specification on this! There are already implementations
that ignore the rdf:IDs of resources; the most common example is FOAF. FOAF
says in its specification not to include rdf:IDs, so where will this lead us,
if one uses IDs and the other inverse functional properties?

 

cheers,

Andreas

 

 

 

Richard Newman wrote:

> Let's have a counter-example.
>
> I know two people named David Green. Almost no literal-valued
> property can really be termed inverse-functional: even genetic code
> sequences can be shared (between twins, for example). Certainly,
> terming names ("Tiger Woods") as IFPs (your more "fundamental
> problem") doesn't work.
>
>> So, in a relational database this problem would never have arisen.
>> So why can't we do the same in ontologies?
>
> Well, as has been pointed out, we can -- IFPs. We don't do so very
> often because our assertions have global scope, and I *know* that the
> two David Greens are separate individuals.
>
> Relational databases rarely choose to deal with the possibilities of
> integrating data from a dozen sources.
>
> -R

 

 

 

Received on Saturday, 19 November 2005 11:27:33 UTC