RE: Fractal communities: Was: Rich semantics and expressiveness from Glendinning, Ian on 2007-03-05 (semantic-web@w3.org from March 2007)

From: Glendinning, Ian <Ian.Glendinning@intergraph.com>
Date: Mon, 5 Mar 2007 09:25:57 -0600
To: "Hans Teijgeler" <hans.teijgeler@quicknet.nl>, "Tim Berners-Lee" <timbl@w3.org>
Cc: "SW-forum" <semantic-web@w3.org>, "West, Matthew" <matthew.west@shell.com>
Message-ID: <45765D73CA48A247897BB33D7667A71301CB5368@US-MAIL.ingrnet.com>
Hans, Tim, Matthew, et al,

As a long-term active member of the ISO-15926 community, I also support
a "Fractal View" (though I've not closely followed the most recent W3C
developments in this area). The two views are not exclusive.

The key word is "view".

The only effective "global ontology" will simply be the superset of all
the other overlapping "fractal" ontologies.

One aspect in the ISO-15926 space that I have supported most strongly is
our "Templates" work and, most importantly, the fact that our schema
fragment templates form part of an extensible virtual library. There, a
key feature is that, whilst we are pulling together an information set
"for a purpose" (a view, for a given business domain), the schema
fragments of the component parts need not all form part of the same
super-schema or the one global ontology. (The "assembly" model for that
view may be nested hierarchically, but the component parts may simply be
part of many overlapping ontologies.)

So long as the schema (and identity) of each fragment is known (or
sufficiently known for the purpose; not all schemas are created equal),
then we have a workable and manageable situation IMHO, subject to
caveats about performance etc. in applications in given domains. Even
business domains are really just "communities" drawn together for a
given common set of business needs, and those same members are also
part of other "communities".

I first saw the connection between our ISO-15926 Templates and the wider
"freely evolving" web ontologies, back in our late 90's KnoW days, when
I remember bringing to the table Jorn Barger's "Fractal Thicket" model,
from the 80's, as an antidote to simple ontologies.

I think we are right to promote key aspects of the ISO-15926 work
(Templates & Reference Data approaches, and the IIDEAS information
architecture concepts) to the wider community, but the idea that the
explicit "generic" schema will satisfy all needs, or could do so in any
practical manageable sense, is not necessarily the most valuable aspect
to promote.

Hope that is considered constructive.
 
Regards
Ian Glendinning

-----Original Message-----
From: Hans Teijgeler [mailto:hans.teijgeler@quicknet.nl] 
Sent: Sunday, March 04, 2007 4:49 AM
To: 'Tim Berners-Lee'
Cc: 'SW-forum'; 'West, Matthew'
Subject: RE: Fractal communities: Was: Rich semantics and expressiveness

Hi Tim,

In the eighties we, the EPC contractors (EPC = Engineering, Procurement,
Construction) involved in the process industries (oil & gas, chemical,
etc.), started working on sharing data amongst applications used within
our own company. Handing over the data of any newly built plant to the
Owner/Operator (e.g. Shell, DuPont) in an electronic format that makes
sense was, and still is, quite a problem.

Then the economy started to globalize, and the projects grew in size and
complexity. As a result our business has become rather "promiscuous",
i.e. we work together with many companies around the globe, including our
competitors, in a different mix of partners and roles for each project.
Given the fact that the larger EPC contractors work on some 1000 to 2000
projects, very small to very large, at any time, you can see the problem.
Our "community", using your term, spreads over the entire globe and over
thousands of companies with umpteen interrelated disciplines, each doing
their part, and each producing and requiring information. In the end
there is only ONE plant, materialwise fully integrated. But not so the
representation of its information.

The industry then picked up the concept of gathering and storing
lifecycle information. In 1988 we started with data modelling; in 2003 it
finally became an ISO standard (ISO 15926-2). The reference data (ISO
15926-4) (ontology) I mentioned is the result of work by hundreds of
domain experts. We are now working on a Semantic Web implementation of
all this (ISO 15926-7) [1].

The contents of any domain-specific ontology, together with the data
model, allow for a standards-based representation of lifecycle
information, at any time allowing for true integration. That
"standards-based" includes the applicable W3C standards.

Your fractals-based approach is fine, but does not solve the problems of
a global economy. It is like the situation where a community is
communicating in some natural language, say Swahili, and no one can speak
English (or one of the other most-used languages). That community may be
utterly happy, but cannot participate in the global economy (or at least
it does not help much that they don't speak English).

Besides that, "communities" of seemingly the same nature often have a
different scope of activities, sometimes small (but annoying), sometimes
large. Often this is caused by different legislation, education system,
habits, etc. They then think that they cannot cooperate, and they start
standardization for their own "parish".

I think that we should strive for a generic information representation
standard, including an upper ontology that has the blessing of the
experts in that field, that plays the same role for data as English plays
for representations in a natural language. RDF and OWL are the *means* to
implement such a standard.

Our approach is that we map the data of application systems at the
source into the RDF/XML format, contentwise defined in ISO 15926-7, and
store that in a standard triple store with a standard API. We designed
means to make such a triple store a member of a confederation (say per
project), and we can query (with SPARQL) multiple triple stores inside
such a confederation, depending on access rights. If necessary the query
results are mapped to the internal format of the application system.
Mapping is done only for data that are owned and that need to be shared.
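
As a rough illustration of the confederation idea described above, here is a minimal Python sketch. It is not the ISO 15926-7 design and does not use SPARQL: the class names `TripleStore` and `Confederation`, the `readers` access model, and all `ex:` identifiers are assumptions made up for this example. It only shows the shape of a query that fans out over several stores and skips the ones the caller may not read.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class TripleStore:
    """One party's store. `readers` is a hypothetical access-rights model."""
    owner: str
    readers: set[str]
    triples: set[Triple] = field(default_factory=set)

    def match(self, s=None, p=None, o=None) -> list[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]


@dataclass
class Confederation:
    """A group of stores, for example one group per project."""
    stores: list[TripleStore]

    def query(self, user: str, s=None, p=None, o=None) -> list[Triple]:
        """Query every store the user may read and merge the results."""
        results: list[Triple] = []
        for store in self.stores:
            if user in store.readers:
                results.extend(store.match(s, p, o))
        return results


# Hypothetical project data: each party only publishes what it owns.
epc = TripleStore("epc", {"epc", "operator"},
                  {("ex:P-101", "rdf:type", "ex:Pump")})
vendor = TripleStore("vendor", {"vendor", "epc"},
                     {("ex:P-101", "ex:ratedPower", "75 kW")})
project = Confederation([epc, vendor])

# The operator sees only the EPC store; the EPC sees both.
operator_view = project.query("operator", s="ex:P-101")
epc_view = project.query("epc", s="ex:P-101")
```

In a real deployment the pattern matching would be delegated to each store's SPARQL endpoint; the point here is only that the answer a party gets depends on which stores it is allowed to read.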

Are we there already? By no means! We are working on that in two
development projects [2][3]. But as the Chinese say: "even the longest
journey takes a first step". Although, "first step"?.... we have been
under way for 19 years by now, and we see our destination on the horizon,
also thanks to the efforts of your organization.

Regards,
Hans

[1] http://www.InfowebML.ws
[2] http://www.fiatech.org/projects/idim/iso15926.html 
[3] http://www.posccaesar.com/


-----Original Message-----
From: Tim Berners-Lee [mailto:timbl@w3.org] 
Sent: Sunday, March 04, 2007 0:28
To: Hans Teijgeler
Cc: SW-forum; West, Matthew
Subject: Fractal communities: Was: Rich semantics and expressiveness


On 2007-03-03, at 05:19, Hans Teijgeler wrote:

> Folks,
>
> In this context I would like to bring up something that keeps puzzling
> me.
>
> The W3C Semantic Web Activity Statement [1] starts with:
>
> "The goal of the Semantic Web initiative is as broad as that of the
> Web: to create a universal medium for the exchange of data. It is 
> envisaged to smoothly interconnect personal information management, 
> enterprise application integration, and the global sharing of 
> commercial, scientific and cultural data. Facilities to put machine- 
> understandable data on the Web are quickly becoming a high priority 
> for many organizations, individuals and communities."
>
> This is great, and it is what we strive for. But it is puzzling how 
> this can ever be achieved without a universal, generic, data-driven 
> model and standard data to drive that model. What I see happening is 
> that everybody can and often does invent instances of owl:Class and 
> owl:ObjectProperty on-the-fly, and then seems to expect that DL will 
> be the band-aid that solves all integration problems. In order to 
> assist the reasoners all sorts of qualifications are added (re 
> OWL1.1), but to me it seems that when this is done, actually a (rather
> private) data model is created again.
>
> Above statement envisages the "smooth interconnection" of a plethora 
> of totally different application domains. That is wise, because we 
> live in one integrated universe (domain), and nobody can dictate where
> one subdomain stops and the other begins.

Rather than 'domain of discourse', or set of things considered, I think
of 'community', a set of agents communicating using certain terms. When
one thinks in terms of domain of discourse, one tends to conclude that
everyone who talks at all about a car (say) has cars in their domain of
discourse, and so everyone must share the model which includes the single
class Car.

It isn't like that though. An agent plays a role in many different
overlapping communities. When I tag a photo as being of my car, or I
agree to use my car in a car pool, or when I register the car with the
Registry of Motor Vehicles, I probably use different ontologies. There is
some finite effort it would take to integrate the ontologies, to
establish some OWL (or rules, etc.) to link them.

- Everyone is encouraged to reuse other people's classes and properties
to the greatest extent they can.
- Some ontologies will already exist and be publicly shared by many, such
as ical:dtstart, geo:longitude, etc. This is the single global community.
- Some ontologies will be established by smaller communities of many
sizes.

Why do I think the structure should be, and will be, fractal? Clearly
there will be many more small communities, local ontologies, than global
ones. Why a 1/f distribution? Well, it seems to occur in many systems
including the web, and may be optimal for some problems. That we should
design for a fractal distribution of ontologies is a hunch. But it does
solve the issue you raise. Some aspects of the web have been shown to be
fractal already.

Here are some properties of the interconnections:

- The connections between the ontologies may be made after their
creation, not necessarily involving the original ontology designers.
- There is a cost of connecting ontologies, figuring out how they
connect, which people will pay when and only when they need the benefit
of extra interoperability.
- Sometimes when connecting ontologies, it is so awkward that there is
pressure to change the terms that one community uses to fit in better
with the other community. Again, a finite cost to make the change,
against a benefit of more interop.
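
The first property in the list above, that a link between two ontologies can be added later by someone other than their designers, can be sketched in a few lines of Python. This is only an illustration: the `carpool:` and `rmv:` prefixes and the `translate` helper are hypothetical, and a plain dictionary stands in for what would in practice be OWL or rule statements.

```python
# A bridge published by a third party after both vocabularies already exist.
# Keys are terms from one community, values the equivalent terms in another.
# All terms here are hypothetical examples.
BRIDGE = {
    "carpool:Car": "rmv:MotorVehicle",
    "carpool:owner": "rmv:registeredKeeper",
}


def translate(triples, bridge):
    """Rewrite every term the bridge knows about; leave the rest untouched."""
    return [tuple(bridge.get(term, term) for term in triple)
            for triple in triples]


carpool_data = [
    ("ex:myCar", "rdf:type", "carpool:Car"),
    ("ex:myCar", "carpool:owner", "ex:tim"),
    ("ex:myCar", "carpool:seatsOffered", "3"),  # no bridge entry: kept as-is
]

rmv_view = translate(carpool_data, BRIDGE)
```

Only the terms someone has paid the cost of bridging become usable by the other community; the unmapped `carpool:seatsOffered` triple passes through unchanged.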

> Hence the need for a universal model as a common denominator. But it 
> is striking that the word "interconnection" was used, rather than 
> "integration". Interconnection reminds me of EAI [2], so hub-based or
> point-to-point, where Semantic Web integration (as I understand it) 
> involves a web-based distributed data base.

Yes, if web-based means an overlapping set of many ontologies in a
fractal distribution.
In this fractal tangle, there will be several recurring patterns at
different scales.
One pattern is a local integration within (say) an enterprise, which
starts point-to-point (problems scale as N^2) and then shifts with EAI
to a hub-and-spoke as you say, where the effort scales as N. Then the
hub is converted to use RDF, and that means the hub then plugs into an
external bus, as it connects to shared ontologies.
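
The scaling argument can be checked with a little arithmetic. The Python sketch below assumes one mapping per pair of systems in the point-to-point case and one mapping per system in the hub-and-spoke case; the function names are made up for this example. If mappings are counted per direction the first figure doubles, but it still grows as N^2.

```python
def point_to_point_links(n: int) -> int:
    """Mappings needed when every system maps directly to every other one."""
    return n * (n - 1) // 2


def hub_and_spoke_links(n: int) -> int:
    """Mappings needed when every system maps once to a shared hub."""
    return n


for n in (5, 20, 100):
    print(n, point_to_point_links(n), hub_and_spoke_links(n))
# 5 10 5
# 20 190 20
# 100 4950 100
```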

> Keeping in mind that, as I wrote before in this thread, application 
> systems store a lot of implicit data (or actually don't store them), 
> the direct mapping of their data to the SW formats will cause more 
> problems than it solves. They are based on their own proprietary data
> model, and these are unintelligible for other, equally proprietary, 
> data models.
>
> The thing puzzling me is how the SW community can see what I cannot 
> see, and that is how on earth you can achieve what your Activity 
> Statement says, without such a standard generic data model and derived
> standard reference data (taxonomy and ontology). But perhaps not many 
> SW-ers bother about the need of universal integration, and are happily
> operating within their own subdomain, such as FOAF.

So the idea is that in any one message, some of the terms will be from a
global ontology, some from subdomains.
The amount of data which can be reused by another agent will depend on
how many communities they have in common, how many ontologies they share.

In other words, one global ontology is not a solution to the problem,
and a local subdomain is not a solution either. But if each agent uses a
mix of a few ontologies of different scales, that forms a global solution
to the problem.
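
A small Python sketch of that claim: how much of a message a receiver can reuse depends on which ontologies it shares with the sender. The terms ical:dtstart and geo:longitude come from the message above; the `carpool:` and `rmv:` prefixes, the helper names, and the two receivers are hypothetical, and prefixes stand in for full namespace URIs.

```python
def prefix(term: str) -> str:
    """Return the ontology prefix of a prefixed term such as 'geo:longitude'."""
    return term.split(":", 1)[0]


def reusable_terms(message_terms, receiver_ontologies):
    """Keep the terms whose ontology the receiving agent also uses."""
    return [t for t in message_terms if prefix(t) in receiver_ontologies]


# One message mixing global and community-specific terms.
message = [
    "ical:dtstart",            # widely shared
    "geo:longitude",           # widely shared
    "carpool:seatsOffered",    # hypothetical small-community term
    "rmv:registrationNumber",  # hypothetical registry term
]

# Two receivers that overlap with the sender to different degrees.
neighbour = {"ical", "geo", "carpool"}
stranger = {"ical", "geo"}

neighbour_gets = reusable_terms(message, neighbour)
stranger_gets = reusable_terms(message, stranger)
```

The stranger still gets the globally shared terms, and the neighbour gets more because it shares one more community with the sender.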

Tim.

>
> Can anybody enlighten me, at least by pointing to some useful links?
>

ummm   http://www.w3.org/DesignIssues/Fractal.html  to which I might  
add this explanation some time.



> Regards,
> Hans
>
> PS The above does not mean that I have no faith in the SW. On the 
> contrary, I preach the SW gospel. But I just want to understand where 
> it is moving to.
>
> [1] http://www.w3.org/2001/sw/Activity
> [2] http://en.wikipedia.org/wiki/Enterprise_Application_Integration
>
> ____________________
> OntoConsult
> Hans Teijgeler
> ISO 15926 specialist
> Netherlands
> +31-72-509 2005
> www.InfowebML.ws
> hans.teijgeler@quicknet.nl
>

Received on Monday, 5 March 2007 15:26:07 UTC