
Re: Any reason for ontology reuse?

From: ProjectParadigm-ICT-Program <metadataportals@yahoo.com>
Date: Sun, 5 Dec 2010 07:01:06 -0800 (PST)
Message-ID: <568610.85023.qm@web113810.mail.gq1.yahoo.com>
To: Martijn van der Plaat <martijn@profec.nl>
Cc: Percy Enrique Rivera Salas <privera.salas@gmail.com>, public-lod@w3.org, Martin Hepp <martin.hepp@ebusiness-unibw.org>, Semantic Web <semantic-web@w3.org>, Toby Inkster <tai@g5n.co.uk>
Dear Martijn,

There is a definite need for standards and methodologies to construct ontologies.

The GoodRelations ontology endorsed by Google is not the issue. The problem I see reappearing in all these threads is something I first detected in the mid-eighties, and it has never gone away.

When dealing with formal systems which aim to provide semantics, the linguists and computer scientists have always sat at opposite ends with the mathematicians, like myself, stuck in the middle.

It is and always will be inherently very difficult to devise formal systems for natural languages, because they lack the universal structures needed to build tools for natural language processing.

There was a time when most computer scientists seemed to espouse, unwittingly, positivist and logical positivist approaches to the budding field of computer science, sometimes called informatics, or informatica in the Netherlands.

Here is the catch: if you use the scientific method to formally model language in order to enable the inclusion of semantics on the Internet, you will run into the wall I described, namely the impossibility of a one-to-one mapping from one language to another.

Just take a look at Language Tools on the Google page and try them out; more often than not the returned translations, even of snippets of text, are lousy.

The EU CLARIN and similar programs stress a very important point: there must be standards for areas that obviously do not fall into the categories of trade, commerce or science.

The directory service you mention does not exist yet simply because the construction of ontologies for global use is an unstructured endeavor.

Where there are global endeavors, e.g. the FAO programs for semantic technologies in the fields of food and agriculture, these linguistic problems manifest themselves in the construction of ontologies.

Dictionary-type descriptions of individual words then become the only viable solution for ontology construction, and dictionary construction is based on linguistic principles.

So what will it eventually lead to? A system of federated ontologies: some accepted based on popular use AND endorsement by powerful institutions, others created in accordance with standards and methodologies, and then the whole middle field of ontologies, good and bad, built according to other criteria or lacking a popular following or endorsement.

In the end we will have directory services that feature the first and second categories but omit the middle field.

And because companies like Google want to be present in all possible domestic markets, which are dominated by national languages, standards and methodologies will inevitably be created for ontology construction in fields where the linguistic challenges are the greatest.

It is not without irony that Google was created by mathematicians with a knack for languages, and I fully expect them to be one of the first institutions to start providing exactly such a directory service, based on inclusion of the first and second categories.
 

Milton Ponson
GSM: +297 747 8280
PO Box 1154, Oranjestad
Aruba, Dutch Caribbean
Project Paradigm: A structured approach to bringing the tools for sustainable development to all stakeholders worldwide by creating ICT tools for NGOs worldwide and: providing online access to web sites and repositories of data and information for sustainable development

This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail.


--- On Sun, 12/5/10, Martijn van der Plaat <martijn@profec.nl> wrote:

From: Martijn van der Plaat <martijn@profec.nl>
Subject: Re: Any reason for ontology reuse?
To: "ProjectParadigm-ICT-Program" <metadataportals@yahoo.com>
Cc: "Percy Enrique Rivera Salas" <privera.salas@gmail.com>, public-lod@w3.org, "Martin Hepp" <martin.hepp@ebusiness-unibw.org>, "Semantic Web" <semantic-web@w3.org>, "Toby Inkster" <tai@g5n.co.uk>
Date: Sunday, December 5, 2010, 10:13 AM

Hi all,
As a master's student in Information Sciences at the VU Amsterdam, interested in LOD (and SemWeb as a whole), I recently started reading this mailing list. The first discussion I tried to follow was the "is 303 really necessary?" topic initiated by Ian Davis. Well, after response nr. 10^2 I was totally confused and almost decided to unsubscribe!


OK, now back on topic. I like the explanations from both Toby and Martin. But the problem of knowing the popularity of the *whole* ontology, mentioned somewhere in the comments, is IMO not interesting, since in most cases a data publisher will use only parts of the ontology.


Also, I don't believe in the added NL analogy, which claims that publishing ideas in more than one type of ontology is a good thing. Why? E.g., the fact that more and more data publishers are using GoodRelations is already a great development; imagine instead that every commercial sector introduces its own ontology, apart from the ontologies inherited from, in this case, GR. From this point of view I don't believe in automated ontology alignment technologies. Maybe translating *instances* of a certain popular ontology to your own language is a better analogy?

What I personally miss in the current linked data development is a service where I can search existing properties/classes when publishing structured data. Sometimes it takes 5 to 10 minutes to find (1) the right ontology and (2) the right property/class, or in the worst case I end up with nothing, but this is maybe a discussion for another thread.

Cheers,

Martijn van der Plaat
On 4 Dec 2010 at 21:30, "ProjectParadigm-ICT-Program" <metadataportals@yahoo.com> wrote:


Dear Martin,

Ad Rule 1. This is true if we can assume the builder of the ontology has built something that is good according to some measurable criteria. For this we have standards and procedures to arrive at standardized sets. We cannot be sloppy when building ontologies for, e.g., civil engineering, aerospace, pharmacy, medicine, biodiversity or defense technologies, so why should we then allow sloppy ontologies for most other fields?


Ad Rule 2. More popular and of better quality, yes; more popular but probably of lower quality, no. But who makes the judgment calls? The collective of users is never a good judge.

Rules 3 and 4 presuppose that the person who builds his own ontology, or who must pick one from those available, somehow has the tools to determine which is best or how to make a good ontology.


Knowledge and information depend on accurate and, for scientific reasons, unambiguous recording in natural language, which requires accurate terminology, definitions etc.

The same rigid structures that dictate natural language vocabularies and dictionaries have to apply to ontology engineering as well.


I can safely assume that most of the subscribers to our lists have the intuitive skills to know good from bad ontologies and what is the right practical approach to building a good ontology, but when semantic technologies go mainstream, a lot of people will join the fray who don't, so somewhere along the line some standardization and formal procedures must be introduced.


Dictionaries exist for a reason: they are made by specialized linguists, based on corpora and lexicological tools, according to standards and standardized procedures for arriving at such.


And since ontologies are structured mirror images of natural language domains, it is inevitable and inescapable that good standard ontologies should reflect this as well.

As in many fields of science and engineering, good rules of thumb are always created by specialists from the same fields, who can draw on expert experience to reduce many rules and standards to a few.


It is this innate ability to create rules of thumb that must be captured in procedures for ontology engineering.

No easy task, but not impossible, and not without standards and methodology, but as bare-bones as possible, because languages are dynamic and flexible and ever evolving.


Milton Ponson



--- On Sat, 12/4/10, Martin Hepp <martin.hepp@ebusiness-unibw.org> wrote:


From: Martin Hepp <martin.hepp@ebusiness-unibw.org>
Subject: Re: Any reason for ontology reuse?
To: "Toby Inkster" <tai@g5n.co.uk>
Cc: "Percy Enrique Rivera Salas" <privera.salas@gmail.com>, public-lod@w3.org, "Semantic Web" <semantic-web@w3.org>

Date: Saturday, December 4, 2010, 1:07 PM
>
> Simple rules:
>
> 1. It is better to use an existing ontology than inventing your own.
> 2. It ...

Received on Sunday, 5 December 2010 15:01:42 UTC
