Re: Fractal communities: Was: Rich semantics and expressiveness from Richard Cyganiak on 2007-03-07 (semantic-web@w3.org from March 2007)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Wed, 7 Mar 2007 17:25:46 +0100
To: Golda Velez <w3@webglimpse.org>
Cc: semantic-web@w3.org
Message-Id: <90A5B1C5-E09C-4E88-8934-CCA44BC5CE99@cyganiak.de>
Golda,

On 7 Mar 2007, at 17:37, Golda Velez wrote:
> - about the ISBN: stuff.  Of course we should be able to use those
> identifiers, but yes they need a namespace otherwise someone will be
> referring to the "International Society for Butter Nuts".  That can
> be solved if the info: people or someone else offers to provide a
> stable URL referencing them.  Like, info://ns.info/2007/isbn/NNNNNNN
> If the namespace owner is trusted and says they will not reinvent the
> meaning of that URL, that's good enough.  It would be nice if w3 or
> one other central org was in the habit of providing such urls.  Note a
> web page or HTTP anything is not necessary if it's not a full
> ontology - it's just a URL to ref a concept!

If it's OK with you that the URI is not resolvable, then why not just
use urn:isbn:XXXXX? It's a standard [1].
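
To make that concrete, here is a minimal Python sketch that turns an ISBN-10 into such a URN after verifying its check digit. The function names are my own, and emitting the number without hyphens is a simplification on my part rather than something the RFC requires.

```python
def normalize_isbn10(isbn: str) -> str:
    """Strip hyphens and spaces and upper-case a trailing 'x'."""
    return isbn.replace("-", "").replace(" ", "").upper()


def isbn10_is_valid(isbn: str) -> bool:
    """Return True if the ISBN-10 weighted checksum is divisible by 11."""
    chars = normalize_isbn10(isbn)
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # 'X' stands for 10 and is only legal as the check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += (10 - position) * value  # weights run from 10 down to 1
    return total % 11 == 0


def isbn_to_urn(isbn: str) -> str:
    """Build a urn:isbn: identifier (hyphen-free form, an assumption here)."""
    if not isbn10_is_valid(isbn):
        raise ValueError(f"not a valid ISBN-10: {isbn!r}")
    return "urn:isbn:" + normalize_isbn10(isbn)


if __name__ == "__main__":
    print(isbn_to_urn("0-06-251587-X"))  # urn:isbn:006251587X
```

The result identifies the book but, as said, nothing can be fetched from it.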

Of course it would be nice if someone would provide HTTP URIs that  
actually return useful data. We tried to do this with the RDF Book  
Mashup [2], using URIs like this:

http://www4.wiwiss.fu-berlin.de/bookmashup/books/006251587X

It uses the Amazon API to retrieve some data about the book. But as  
you say, a highly trusted namespace owner would be better. I don't  
think this is W3C's job though. Setting up an HTTP server for this  
kind of stuff isn't *that* hard.
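
For illustration, a client could ask such a URI for RDF roughly like this. This is only a sketch: it assumes the server honours an "Accept: application/rdf+xml" header, the helper names are made up for this mail, and the service may not stay online forever.

```python
import urllib.request

# Base of the book URIs quoted above.
BOOK_BASE = "http://www4.wiwiss.fu-berlin.de/bookmashup/books/"


def build_rdf_request(isbn: str) -> urllib.request.Request:
    """Prepare a GET request that asks for RDF/XML instead of HTML."""
    return urllib.request.Request(
        BOOK_BASE + isbn,
        headers={"Accept": "application/rdf+xml"},
    )


def fetch_rdf(isbn: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text.

    Needs network access and a live server, so treat it as illustrative.
    """
    with urllib.request.urlopen(build_rdf_request(isbn), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    request = build_rdf_request("006251587X")
    print(request.full_url)
    print(request.get_header("Accept"))
```

Only build_rdf_request is exercised when the script runs; fetch_rdf is there to show where the actual HTTP round trip would happen.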

> - tools to go back and forth from the RDF world for us old sql
> programmers.  I have a semi-semantic web type app meant for joe user
> that has 8000+ categories and 80000+ items that can all relate to each
> other, and needs flexible and fast ways to display subsets.  So I have
> it all in mysql.  There is a table with a RELATION field with a small
> set of allowed relations.  It seems to me there should eventually be
> simple tools for porting this kind of data into and out of RDF.  Not
> everyone wants to use RDF as the back end all the time..and not all
> the features of ontologies are needed for all applications, but any
> valid semantic data should be able to be exposed for use.

Very true, and a number of applications exist for this purpose. See  
[3] for a list. (I'm one of the authors of D2RQ and D2R Server, which  
I think are fairly good at solving this kind of problem, though not  
yet as simple as I'd like. Check out [4] for a bibliography database  
of 800k+ records mapped to RDF using D2R Server.)
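
To show the basic idea without any particular tool, here is a hand-rolled Python sketch that dumps a relation table as N-Triples. It is not how D2RQ or D2R Server work. The schema (items, relations), the example.org base URI and the use of rdfs:label are all assumptions made up for this example, and SQLite stands in for MySQL so that it runs without a server.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for the exported data
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with an assumed two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE relations (
            subject_id INTEGER NOT NULL,
            relation   TEXT NOT NULL,     -- e.g. CREATED_BY, DEPENDS_ON
            object_id  INTEGER NOT NULL
        );
        INSERT INTO items VALUES (1, 'Search engine'), (2, 'Example author');
        INSERT INTO relations VALUES (1, 'CREATED_BY', 2);
    """)
    return conn


def export_ntriples(conn: sqlite3.Connection) -> list[str]:
    """Return one N-Triples line per item label and per relation row."""
    lines = []
    for item_id, name in conn.execute("SELECT id, name FROM items ORDER BY id"):
        lines.append(
            f'<{BASE}item/{item_id}> <{RDFS_LABEL}> "{escape_literal(name)}" .'
        )
    query = ("SELECT subject_id, relation, object_id FROM relations "
             "ORDER BY subject_id, relation, object_id")
    for subject_id, relation, object_id in conn.execute(query):
        # Each allowed RELATION value becomes a property URI.
        lines.append(
            f"<{BASE}item/{subject_id}> <{BASE}relation/{relation.lower()}> "
            f"<{BASE}item/{object_id}> ."
        )
    return lines


if __name__ == "__main__":
    print("\n".join(export_ntriples(build_demo_db())))
```

Because the set of allowed relations is small and fixed, building property URIs straight from the RELATION value is safe here; free-text values would need proper URI encoding first.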

Yours,
Richard

[1] http://www.ietf.org/rfc/rfc3187.txt
[2] http://www4.wiwiss.fu-berlin.de/bookmashup/
[3] http://esw.w3.org/topic/RdfAndSql
[4] http://www4.wiwiss.fu-berlin.de/dblp/


> - it seems to me some relations are pretty universal because they are
> abstract.  I'd like to see a lot of reuse in different ontologies of
> things like
> 	IS_LOCALIZATION_OF
> 	IS_PERSONALIZATION_OF
> 	DEPENDS_ON
> 	CREATED_BY
> and so on, these types of words are abstractions humans can relate to
> totally different fields, and could be used pretty universally, no?
>
> Well, as usual I've probably missed the point, but anyway once all the
> semweb stuff is simple enough for me to use then you know you have
> something happening.
>
> Last point, probably showing my lack of research - so where is the
> registration place with simple categories for all semweb related
> things?
>
> bye all
>
> --Golda
>
> ps I wrote a more in-depth (even than the above lengthy mail)
> explanation of these kinds of points at
> http://webglimpse.net/Mapping.pdf . mainly it's an argument for the
> info: people or someone trusted to make versioned, extensible concept
> urls for standard concept schemes that haven't been webified yet.
>
>
>
>
> On Wednesday 07 March 2007 01:20, matthew.west@shell.com wrote:
>> <snip>
>>> An agent plays a role in many different
>>> overlapping communities.  When I tag a photo as being of my car, or I
>>> agree to use my car in a car pool, or when I register the car with
>>> the Registry of Motor Vehicles, I probably use different
>>> ontologies.   There is some finite effort it would take to integrate
>>> the ontologies, to establish some OWL (or rules, etc) to link them.
>>>
>>> - Everyone is encouraged to reuse other people's classes and
>>> properties to the greatest extent they can.
>>
>> MW: One of the counterbalances I find to this is that it is often
>> easier/cheaper to reinvent classes than find them (usually lots of
>> versions) and decide if any of them really meet your needs. I know I
>> see a lot of reinvention.
>>
>>> - Some ontologies will already exist and be publicly shared by many,
>>> such as ical:dtstart, geo:longitude, etc.  This is the single global
>>> community.
>>
>> MW: This is a pure guess, but if we take longitude as an example I
>> would be very surprised if there were not at least 100 publicly
>> available ontologies that defined longitude. To reduce this, one
>> of the things I think we need to do is to develop a sense of
>> authoritative source. We need to ask ourselves the question: who
>> "owns" this? What is *their* name/definition? This is something we
>> try to do with our own reference data. So we recognise ISO country
>> codes, rather than invent our own, we recognise a company's product
>> name/code when we buy their product, and the company's registered
>> name and number, rather than our abbreviation or version of it.
>>
>>> - Some ontologies will be established by smaller communities of many
>>> sizes.
>>>
>>> Why do I think the structure should be, and will be, fractal?  Clearly
>>> there will be many more small communities, local ontologies, than
>>> global ones. Why a 1/f distribution? Well, it seems to occur in many
>>> systems including the web, and may be optimal for some problems.
>>> That we should design for a fractal distribution of ontologies is a
>>> hunch.  But it does solve the issue you raise.  Some aspects of the
>>> web have been shown to be fractal already.
>>>
>>> Here are some properties of the interconnections:
>>>
>>> - The connections between the ontologies may be made after their
>>> creation, not necessarily involving the original ontology designers.
>>> - There is a cost of connecting ontologies, figuring out how they
>>> connect, which people will pay when and only when they need the
>>> benefit of extra interoperability.
>>> - Sometimes when connecting ontologies, it is so awkward there is
>>> pressure to change the terms that one community uses to fit in better
>>> with the other community. Again, a finite cost to make the change,
>>> against a benefit of more interop.
>>
>> MW: This is close to the dynamic view that I see. I see ontologies
>> start in isolation and then grow. Eventually, they bump into adjacent
>> ontologies that have also been growing (many will die of course).
>>
>> MW: When enough ontologies overlap in a sufficiently annoying and
>> expensive way, an effort is undertaken to integrate these ontologies
>> to better support integration. This produces an increased centre of
>> gravity, and almost immediately small ontologies will spring up at
>> the edges, and bigger ontologies will bump into other big ontologies.
>>
>> MW: This process repeats, as far as I can see indefinitely. I observe
>> that - within Shell at least - the time between integrating at one
>> level and integrating at the next level up is about 10 years.
>>>
>>>> Hence the need for a universal model as a common denominator. But
>>>> it is striking that the word "interconnection" was used, rather
>>>> than "integration". Interconnection reminds me of EAI [2], so hub-
>>>> based or point-to-point, where Semantic Web integration (as I
>>>> understand it) involves a web-based distributed data base.
>>>
>>> Yes, if web-based means an overlapping set of many ontologies in a
>>> fractal distribution.
>>> In this fractal tangle, there will be several recurring patterns at
>>> different scales.
>>> One pattern is a local integration within (say) an enterprise, which
>>> starts point-point (problems scale as N^2) and then shifts with EAI
>>> to a hub-and-spoke as you say, where the effort scales as N.    Then
>>> the hub is converted to use RDF, and that means the hub then plugs
>>> into an external bus, as it connects to shared ontologies.
>>
>> MW: The same kinds of things will happen with the shared ontologies
>> as with the enterprise ontologies (moving to a hub and spoke model
>> requires an integrating ontology that at least spans the shared data).
>>>
>>>
>>>
>>>>
>>>> Keeping in mind that, as I wrote before in this thread, application
>>>> systems store a lot of implicit data (or actually don't store
>>>> them), the direct mapping of their data to the SW formats will
>>>> cause more problems than it solves. They are based on their own
>>>> proprietary data model, and these are unintelligible for other,
>>>> equally proprietary, data models.
>>>>
>>>> The thing puzzling me is how the SW community can see what I cannot
>>>> see, and that is how on earth you can achieve what your Activity
>>>> Statement says, without such a standard generic data model and
>>>> derived standard reference data (taxonomy and ontology). But
>>>> perhaps not many SW-ers bother about the need of universal
>>>> integration, and are happily operating within their own subdomain,
>>>> such as FOAF.
>>>
>>> So the idea is that in any one message, some of the terms will be
>>> from a global ontology, some from subdomains.
>>
>> MW: Well if this means that we go out to the authoritative source for
>> reference data, rather than reinventing it, then that would be
>> consistent with what I was saying above. But at the moment, the
>> problem I see is that just about everyone thinks they have the right
>> to be an authoritative source on whatever they please. This is not
>> useful.
>>
>>> The amount of data which can be reused by another agent will depend
>>> on how many communities they have in common, how many ontologies
>>> they share.
>>>
>>> In other words, one global ontology is not a solution to the problem,
>>
>> MW: But interestingly, something that was the sum of the
>> authoritative sources I have been talking about, would be something
>> like a global ontology (but not the only one of course - just a
>> dominant one).
>>
>>> and a local subdomain is not a solution either.  But if each agent
>>> uses a mix of a few ontologies of different scales, that forms
>>> a global solution to the problem.
>>
>> MW: I'm not convinced about this, though I will concede that
>> authoritative sources might have small or large ontologies with
>> variation in the size and spread of their user base. However, I am
>> quite confident that we will only get there if we can find a way to
>> reduce the use of non-authoritative sources. Of course the web is the
>> only chance we have of being able to share these authoritative
>> sources effectively.
>>>
>>> Tim.
>>>
>>>>
>>>> Can anybody enlighten me, at least by pointing to some useful links?
>>>>
>>>
>>> ummm   http://www.w3.org/DesignIssues/Fractal.html  to which I might
>>> add this explanation some time.
>>>
>>>
>>>
>>>> Regards,
>>>> Hans
>>>>
>>>> PS The above does not mean that I have no faith in the SW. On the
>>>> contrary, I preach the SW gospel. But I just want to understand
>>>> where it is moving to.
>>>>
>>>> [1] http://www.w3.org/2001/sw/Activity
>>>> [2] http://en.wikipedia.org/wiki/Enterprise_Application_Integration
>>>>
>>>> ____________________
>>>> OntoConsult
>>>> Hans Teijgeler
>>>> ISO 15926 specialist
>>>> Netherlands
>>>> +31-72-509 2005
>>>> www.InfowebML.ws
>>>> hans.teijgeler@quicknet.nl
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
> -- 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Golda Velez			http://goldavelez.info
> http://bTucson.com		be Tucson - share your info!
> http://Webglimpse.Net	search engine software		
> 		cell: (520) 440-1420
> "Help organize the world - index your own corner of the web!"
>
>
Received on Wednesday, 7 March 2007 16:25:56 UTC