
RE: universal languages

From: <rapoell@notionsystem.com>
Date: 7 Feb 2001 22:44:42 -0800
Message-ID: <20010208064442.13483.cpmta@c004.sfo.cp.net>
To: www-rdf-logic@w3.org
At TNO we are currently working on the implementation of a semantic network based on the ideas I developed with Notion System (http://www.notionsystem.com).
One of the aspects of this SN will be the capability of automatically mapping unstructured information onto elements (nodes, arcs or attributes) of the network. We do this by building a temporary (local) network based only on the information within the document and mapping it onto the existing network.

For the construction of the temporary network we use "traditional" techniques (statistical, syntactical, latent semantic indexing etc). For the mapping to the "real" network we'll use these same techniques, extended with a technique based on some parameters I defined for semantic networks (network distance, semantic distance and semantic similarity). These parameters can be based on "real" elements of the network or on "virtual" elements that come from the inference engine (in particular virtual associations/arcs).
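The three parameters are not defined in this message. As a rough illustration of the simplest one, the sketch below assumes that "network distance" means the smallest number of arcs between two nodes and computes it with a breadth-first search. The function name, the adjacency-dictionary layout and the toy data are all hypothetical; semantic distance and semantic similarity are left out because nothing here says how they are computed.

```python
from collections import deque


def network_distance(graph, start, goal):
    """Return the fewest arcs between two nodes, or None if unreachable.

    Assumption: 'network distance' is a plain hop count over the arcs.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical toy network: node -> set of directly linked nodes.
toy_network = {
    "TNO": {"research institute"},
    "research institute": {"TNO", "organisation"},
    "organisation": {"research institute", "company"},
    "company": {"organisation"},
}

print(network_distance(toy_network, "TNO", "company"))
```

Virtual arcs from an inference engine could be handled the same way by adding them to the adjacency sets before searching.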

For each element of the temporary network there will be a mapping result with one of the following states: identification, non-identification or ambiguity. The "conclusions" for the first two states are straightforward: on identification we create the link; on non-identification we create a new node or a new kind of arc.
When ambiguity doesn't allow us to make a clear distinction between possible targets there are two possible actions: we either create the information for all of the targets or we don't (this depends on a threshold value). When we do, this information is given a "should be revisited" attribute indicating that it should be rechecked at some point in the future.
In order to maintain all this ambiguous information (and the non-ambiguous) I've proposed a solution where each node in the network becomes an "information bearing agent", with tasks like checking its internal data integrity on a regular basis (in fact modeled as time frame actions). For the ambiguous information a re-mapping is carried out, which might not lead to the same conclusion because the network has changed in the meantime.
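The decision logic described above can be sketched roughly as follows. This is only an illustration under stated assumptions: each candidate target is assumed to carry a similarity score between 0 and 1, and both cut-off values, the rule that compares the best score with the threshold, and all names (`map_element`, `recheck`, `MappingResult`) are hypothetical rather than taken from the actual system.

```python
from dataclasses import dataclass, field
from enum import Enum


class MappingState(Enum):
    IDENTIFICATION = "identification"
    NON_IDENTIFICATION = "non-identification"
    AMBIGUITY = "ambiguity"


@dataclass
class MappingResult:
    state: MappingState
    targets: list = field(default_factory=list)  # targets to link to
    revisit: bool = False  # the "should be revisited" attribute


def map_element(scores, candidate_cutoff=0.6, ambiguity_threshold=0.75):
    """Classify one element of the temporary network.

    scores: dict of candidate target -> similarity in [0, 1] (assumed).
    """
    candidates = sorted(t for t, s in scores.items() if s >= candidate_cutoff)
    if not candidates:
        # Nothing matches: the caller creates a new node or kind of arc.
        return MappingResult(MappingState.NON_IDENTIFICATION)
    if len(candidates) == 1:
        return MappingResult(MappingState.IDENTIFICATION, candidates)
    # Several plausible targets: keep them all only above the threshold.
    best = max(scores[t] for t in candidates)
    if best >= ambiguity_threshold:
        return MappingResult(MappingState.AMBIGUITY, candidates, revisit=True)
    return MappingResult(MappingState.AMBIGUITY)


def recheck(result, current_scores):
    """Re-map a flagged element against the (possibly changed) network."""
    return map_element(current_scores) if result.revisit else result


first = map_element({"node A": 0.9, "node B": 0.8})
later = recheck(first, {"node A": 0.9, "node B": 0.3})
print(first.state.value, "->", later.state.value)
```

In an agent-style design, each node would call something like `recheck` on its own flagged information at regular intervals.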

You can have a look at some of these ideas as expressed in a presentation I gave a few months ago for the European Commission at the Semantic Web Technologies Workshop (http://www.cordis.lu/ist/ka3/iaf/swt_presentations/swwspoell.htm).

We do use (internally at least) UUIDs (which are also URIs) and, for retrieval, temporary PUIDs (as not all information will be available to everyone), so on that point I agree with Stefan. With regard to "externally" provided IDs (or URIs) the same technique is used as for information coming from e.g. databases: there is a mapping (read: an attribute or association) between the external source ID and the internal UUID (once identification is established).
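A minimal sketch of such an external-to-internal mapping, assuming a simple in-memory index keyed on the pair (source, external ID). The class name, method names and the "customer_db" source are hypothetical; the only library behaviour relied on is Python's standard `uuid` module, whose `urn` attribute gives the `urn:uuid:` URI form of a UUID.

```python
import uuid


class ExternalIdIndex:
    """Associates (source, external ID) pairs with internal UUIDs."""

    def __init__(self):
        self._links = {}

    def link(self, source, external_id, internal_uuid):
        # Called once identification of the external item is established.
        self._links[(source, external_id)] = internal_uuid

    def lookup(self, source, external_id):
        # Returns None while no identification has been recorded.
        return self._links.get((source, external_id))


node_uuid = uuid.uuid4()  # internal identifier of an existing node
index = ExternalIdIndex()
index.link("customer_db", "42", node_uuid)

print(index.lookup("customer_db", "42") == node_uuid)
print(node_uuid.urn.startswith("urn:uuid:"))
```

Keying on the source as well as the ID keeps identifiers from different databases from colliding.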

Greetings

Ronald Poell

Received on Thursday, 8 February 2001 01:45:18 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Monday, 7 December 2009 10:52:38 GMT