RE: Fractal communities: Was: Rich semantics and expressiveness

Dear Tim and Hans,

See below for a few observations.


Regards

Matthew West
Reference Data Architecture and Standards Manager
Shell International Petroleum Company Limited
Shell Centre, London SE1 7NA, United Kingdom

Tel: +44 20 7934 4490 Mobile: +44 7796 336538
Email: matthew.west@shell.com
http://www.shell.com
http://www.matthew-west.org.uk/

<snip>
> An agent plays a role in many different overlapping communities.
> When I tag a photo as being of my car, or I agree to use my car in
> a car pool, or when I register the car with the Registry of Motor
> Vehicles, I probably use different ontologies. There is some finite
> effort it would take to integrate the ontologies, to establish some
> OWL (or rules, etc) to link them.
> 
> - Everyone is encouraged to reuse other people's classes and  
> properties to the greatest extent they can.

MW: One of the counterbalances I find to this is that it is often
easier/cheaper to reinvent classes than find them (usually lots of 
versions) and decide if any of them really meet your needs. I know I 
see a lot of reinvention.

> - Some ontologies will already exist and be publicly shared by many,
> such as ical:dtstart, geo:longitude, etc.  This is the single global
> community.

MW: This is a pure guess, but if we take longitude as an example I
would be very surprised if there were not at least 100 publicly
available ontologies that defined longitude. To reduce this, one
of the things I think we need to do is to develop a sense of
authoritative source. We need to ask ourselves the question: who
"owns" this? What is *their* name/definition? This is something we
try to do with our own reference data. So we recognise ISO country
codes, rather than invent our own; we recognise a company's product
name/code when we buy their product, and the company's registered
name and number, rather than our abbreviation or version of it.

> - Some ontologies will be established by smaller communities of many  
> sizes.
> 
> Why do I think the structure should be, or will be, fractal?  Clearly
> there will be many more small communities, local ontologies, than  
> global ones. Why a 1/f distribution? Well, it seems to occur in many  
> systems including the web, and may be optimal for some problems.   
> That we should design for a fractal distribution of ontologies is a  
> hunch.  But it does solve the issue you raise.  Some aspects of the  
> web have been shown to be fractal already.
> 
> Here are some properties of the interconnections:
> 
> - The connections between the ontologies may be made after their  
> creation, not necessarily involving the original ontology designers.
> - There is a cost of connecting ontologies, figuring out how they  
> connect, which people will pay when and only when they need the  
> benefit of extra interoperability.
> - Sometimes when connecting ontologies, it is so awkward there is
> pressure to change the terms that one community uses to fit in better
> with the other community. Again, a finite cost to make the change,
> against a benefit of more interop.

MW: This is close to the dynamic view that I see. I see ontologies
start in isolation and then grow. Eventually, they bump into adjacent
ontologies that have also been growing (many will die of course).

MW: When enough ontologies overlap in a sufficiently annoying and
expensive way, an effort is undertaken to integrate these ontologies
to better support integration. This produces an increased centre of
gravity, and almost immediately small ontologies will spring up at 
the edges, and bigger ontologies will bump into other big ontologies.

MW: This process repeats, as far as I can see, indefinitely. I observe
that - within Shell at least - the time between integrating at one
level and integrating at the next level up is about 10 years.
> 
> > Hence the need for a universal model as a common denominator. But  
> > it is striking that the word "interconnection" was used, rather  
> > than "integration". Interconnection reminds me of EAI [2], so hub- 
> > based or point-to-point, where Semantic Web integration (as I  
> > understand it) involves a web-based distributed data base.
> 
> Yes, if web-based means an overlapping set of many ontologies in a  
> fractal distribution.
> In this fractal tangle, there will be several recurring patterns at
> different scales.
> One pattern is a local integration within (say) an enterprise, which
> starts point-to-point (problems scale as n^2) and then shifts with EAI
> to a hub-and-spoke as you say, where the effort scales as n.    Then
> the hub is converted to use RDF, and that means the hub then plugs
> into an external bus, as it connects to shared ontologies.

MW: The same kinds of things will happen with the shared ontologies
as with the enterprise ontologies (moving to a hub and spoke model
requires an integrating ontology that at least spans the shared data).
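
The scaling argument quoted above (point-to-point effort growing as n^2,
hub-and-spoke effort growing as n) can be made concrete with a small
count of mappings. This is only an illustrative sketch: it assumes one
mapping per pair of ontologies in the point-to-point case and one mapping
per ontology to the hub in the hub-and-spoke case, and the function names
are made up for the example.

```python
def point_to_point_mappings(n: int) -> int:
    """Mappings needed when each of n ontologies is linked directly
    to every other one (assumes one mapping per pair)."""
    return n * (n - 1) // 2


def hub_and_spoke_mappings(n: int) -> int:
    """Mappings needed when each of n ontologies is linked once to a
    single integrating (hub) ontology."""
    return n


if __name__ == "__main__":
    # Compare the two approaches for a few example sizes.
    for n in (2, 5, 10, 50):
        print(n, point_to_point_mappings(n), hub_and_spoke_mappings(n))
```

Under these assumptions, ten ontologies need 45 pairwise mappings but
only 10 mappings to a hub, which is the pressure towards an integrating
ontology described above.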
> 
> 
> 
> >
> > Keeping in mind that, as I wrote before in this thread, application
> > systems store a lot of implicit data (or actually don't store
> > them), the direct mapping of their data to the SW formats will
> > cause more problems than it solves. They are based on their own
> > proprietary data model, and these are unintelligible to other,
> > equally proprietary, data models.
> >
> > The thing puzzling me is how the SW community can see what I cannot
> > see, and that is how on earth you can achieve what your Activity
> > Statement says, without such a standard generic data model and  
> > derived standard reference data (taxonomy and ontology). But  
> > perhaps not many SW-ers bother about the need of universal  
> > integration, and are happily operating within their own subdomain,  
> > such as FOAF.
> 
> So the idea is that in any one message, some of the terms will be  
> from a global ontology, some from subdomains.

MW: Well if this means that we go out to the authoritative source for
reference data, rather than reinventing it, then that would be 
consistent with what I was saying above. But at the moment, the problem
I see is that just about everyone thinks they have the right to be
an authoritative source on whatever they please. This is not useful.

> The amount of data which can be reused by another agent will depend
> on how many communities they have in common, how many ontologies they
> share.
> 
> In other words, one global ontology is not a solution to the problem,

MW: But interestingly, something that was the sum of the authoritative
sources I have been talking about, would be something like a global
ontology (but not the only one of course - just a dominant one).

> and a local subdomain is not a solution either.  But if each agent
> uses a mix of a few ontologies of different scale, that forms
> a global solution to the problem.

MW: I'm not convinced about this, though I will concede that 
authoritative sources might have small or large ontologies with variation
in the size and spread of their user base. However, I am quite confident
that we will only get there if we can find a way to reduce the use
of non-authoritative sources. Of course the web is the only chance we
have of being able to share these authoritative sources effectively.
> 
> Tim.
> 
> >
> > Can anybody enlighten me, at least by pointing to some useful links?
> >
> 
> ummm   http://www.w3.org/DesignIssues/Fractal.html  to which I might  
> add this explanation some time.
> 
> 
> 
> > Regards,
> > Hans
> >
> > PS The above does not mean that I have no faith in the SW. On the  
> > contrary, I preach the SW gospel. But I just want to understand  
> > where it is moving to.
> >
> > [1] http://www.w3.org/2001/sw/Activity
> > [2] http://en.wikipedia.org/wiki/Enterprise_Application_Integration
> >
> > ____________________
> > OntoConsult
> > Hans Teijgeler
> > ISO 15926 specialist
> > Netherlands
> > +31-72-509 2005
> > www.InfowebML.ws
> > hans.teijgeler@quicknet.nl
> >
> >
> >
> 
> 
> 

Received on Wednesday, 7 March 2007 08:21:05 UTC