[Use Case]: NRC-1 Information integration with rules and taxonomies from Boley, Harold on 2005-12-05 (public-rif-wg@w3.org from December 2005)

From: Boley, Harold <Harold.Boley@nrc-cnrc.gc.ca>
Date: Sun, 4 Dec 2005 20:16:11 -0500
To: <public-rif-wg@w3.org>
Message-ID: <E4D07AB09F5F044299333C8D0FEB45E90116DECF@nrccenexb1.nrc.ca>

** NRC-1 Information integration with rules and taxonomies

* Description

Government analysts, venture capitalists, or entrepreneurs want to
monitor the progress of business development in some region XY.
Facts about XY businesses are available from two Web sources, S1
and S2. While S1 contains detailed information, it has not been
updated since time T. S2 contains less information but continues to
be updated after T. As part of the information, a classification of
the sector or category of each business is given in the two sources,
using two respective taxonomies.

A Web Service is to create an integration view using all business
information from S1 except where it is overwritten by S2, adding
new entries for businesses only occurring in S2. For integrating
the classifications, corresponding sectors or categories need to
be determined and aligned in the taxonomies. 
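
The integration view described above can be sketched in a few lines of
Python. This is only an illustration, not the NBBizKB rules: the record
layout, the field names, and the keys "b1".."b3" are hypothetical, and
it assumes that "overwritten by S2" means a field-by-field override for
businesses that have already been matched under a shared key.

```python
def integrate(s1, s2):
    """Return S1 records overridden field-by-field by S2, plus S2-only businesses."""
    # Copy S1 so that the source data is left untouched.
    view = {key: dict(record) for key, record in s1.items()}
    for key, record in s2.items():
        # Known business: S2 fields win. Unknown business: new entry.
        view.setdefault(key, {}).update(record)
    return view


# Hypothetical sample data; keys stand for already-matched business identities.
s1 = {
    "b1": {"name": "Acme Forge", "sector": "Manufacturing", "employees": 40},
    "b2": {"name": "Bay Software", "sector": "IT", "employees": 12},
}
s2 = {
    "b2": {"name": "Bay Software Inc.", "category": "Software"},
    "b3": {"name": "Coastal Tours", "category": "Tourism"},
}

view = integrate(s1, s2)
print(sorted(view))          # ['b1', 'b2', 'b3']
print(view["b2"]["name"])    # Bay Software Inc.
```

Note that "b2" keeps its S1-only fields (sector, employees) while its
name is taken from the more recent S2 entry.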

* Implications

An instantiation of this use case was implemented with POSL rules
as NBBizKB [1] and tested in OO jDREW [2]. The need to construct
such integration rules through iterative refinement with human
experts implies the requirement of a human-readable syntax.

In this use case, the identity criterion for businesses across the
Web sources is a problem if no URI is provided or URI normalization
cannot be done; in [1], normalized phone numbers had to be used
instead. This implies the requirement to 'webize' the language with
URIs and to interface it to the current official URI normalization
algorithm [3].
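
A minimal sketch of such an identity key follows. It is not the code
used in [1]. The function names and the record fields "uri" and "phone"
are hypothetical, normalize_uri covers only a small subset of the
normalizations in [3] (case of scheme and host, default port, empty
path) and drops any userinfo, and normalize_phone assumes North
American numbers with an optional leading country code 1.

```python
import re
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_uri(uri):
    """Apply a small subset of RFC 3986 normalization (assumption: http/https URIs)."""
    parts = urlsplit(uri.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it differs from the scheme's default.
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


def normalize_phone(phone):
    """Reduce a phone number to its digits, dropping a leading country code 1."""
    digits = re.sub(r"\D", "", phone)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def identity_key(record):
    """Prefer a normalized URI; fall back to a normalized phone number."""
    if record.get("uri"):
        return ("uri", normalize_uri(record["uri"]))
    if record.get("phone"):
        return ("phone", normalize_phone(record["phone"]))
    return None


print(normalize_uri("HTTP://Example.COM:80"))      # http://example.com/
print(normalize_phone("+1 (506) 555-0123"))        # 5065550123
```

A business that carries a URI in one source and only a phone number in
the other would still not match under this key, so a real matcher would
likely compare on every field the two records share.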

Given that the same business can be identified in both sources,
and assuming it is correctly classified w.r.t. each source's
taxonomy, an alignment between the two taxonomic classes can be
hypothesized. The hypothesis becomes stronger the more such
business-occurrence pairs are found across the two sources. This
implies the requirement to combine rules with taxonomies [4] and
to permit uncertainty handling, as explored in Fuzzy RuleML [5].
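
The co-occurrence idea can be illustrated as follows. This is a sketch
under stated assumptions, not the approach of [4] or [5]: the function
name, the class labels, and the business keys are hypothetical, and the
"share" value (pair count divided by the number of matched businesses
in the S1 class) is just one simple way to attach a degree to an
alignment hypothesis alongside the raw pair count.

```python
from collections import Counter


def align(s1_class, s2_class):
    """Map (S1 class, S2 class) -> (pair count, share of the S1 class).

    s1_class and s2_class map an already-matched business key to its
    class label in the respective taxonomy.
    """
    shared = s1_class.keys() & s2_class.keys()
    pair_counts = Counter((s1_class[k], s2_class[k]) for k in shared)
    s1_totals = Counter(s1_class[k] for k in shared)
    return {
        pair: (count, count / s1_totals[pair[0]])
        for pair, count in pair_counts.items()
    }


# Hypothetical classifications of matched businesses.
s1_class = {"b1": "IT", "b2": "IT", "b3": "IT", "b4": "Manufacturing"}
s2_class = {"b1": "Software", "b2": "Software", "b3": "Telecom",
            "b4": "Industry", "b5": "Tourism"}

alignment = align(s1_class, s2_class)
count, share = alignment[("IT", "Software")]
print(count)  # 2
```

Here "b5" is ignored because it occurs only in S2, and the pair
("IT", "Software") is supported by two of the three matched IT
businesses. The share alone does not grow with the amount of evidence,
so the pair count should be reported with it.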


[1] http://www.ruleml.org/usecases/nbbizkb
[2] http://www.jdrew.org/oojdrew
[3] http://www.gbiv.com/protocols/uri/rfc/rfc3986.html
[4] http://rewerse.net/deliverables/m12/i3-d3.pdf
[5] http://image.ntua.gr/FuzzyRuleML

Received on Monday, 5 December 2005 01:19:19 UTC