Re: Fractal communities: Was: Rich semantics and expressiveness from Richard Cyganiak on 2007-03-09 (semantic-web@w3.org from March 2007)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Fri, 9 Mar 2007 15:49:13 +0100
To: Golda Velez <gv@btucson.com>
Cc: Golda Velez <w3@webglimpse.org>, semantic-web@w3.org
Message-Id: <ACD231DE-01D5-485A-B46F-1A02F7A1091F@cyganiak.de>
On 8 Mar 2007, at 18:23, Golda Velez wrote:
> BTW I like this definition, Richard has at
> http://sites.wiwiss.fu-berlin.de/suhl/bizer/bookmashup/index.html
> ---------------------------
> The basic requirements that data has to fulfill in order to be part of the
> Semantic Web are:
> 1) All entities of interest, such as information resources, real-world
> objects, and vocabulary terms should be identified by URI references.
> 2) URI references should be dereference-able, meaning that an application can
> look up a URI over the HTTP protocol and retrieve RDF data about the
> identified resource.
> 3) Data should be provided using the RDF/XML syntax.
> 4) Data should be interlinked with other data. Thus resource descriptions
> should contain links to related information in the form of dereference-able
> URIs within RDF statements or as rdfs:seeAlso links.
> ------------------------------
>
> Is that generally accepted as the definition?

It's not universally accepted.

The most common objections are: a) “lo-fi semantic content” such as  
RSS/Atom, tags, microformats, GRDDL and RDFa are also part of the  
Semantic Web, b) other transports than HTTP, such as P2P protocols,  
can also be part of the Semantic Web, and c) RDF itself is not  
“sufficiently semantic”, data needs to conform to an OWL-DL ontology  
to be part of the Semantic Web.

They all have a point. The definition above is really just a useful  
middle ground around which tool builders and content providers can  
gather. The Interlinking Open Data community project [1] is a quite  
active forum where this happens.
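
To make requirement 2 of the quoted definition a bit more concrete, here is a minimal Python sketch of how a client might ask for RDF about a resource. It only builds the HTTP request and does not send it. The helper name rdf_request is made up for illustration, the URI is the Book Mashup one quoted further down in this thread, and whether any particular server honours the Accept header is an assumption.

```python
from urllib.request import Request


def rdf_request(uri: str) -> Request:
    """Build (but do not send) a GET request that asks for RDF/XML."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})


req = rdf_request("http://www4.wiwiss.fu-berlin.de/bookmashup/books/006251587X")
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))

# To actually dereference the URI one would pass `req` to
# urllib.request.urlopen() and parse the response body as RDF/XML.
```

Keeping the request construction separate from the network call makes the content-negotiation part easy to inspect on its own.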

> And does it allow 'leaf' nodes
> which are linked to, but contain no links themselves?

Yes, sure--for example, you can have a resource whose properties all  
have literal values.
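
As a rough illustration of such a leaf resource, here is a small Python sketch. The URIRef and Literal classes, the is_leaf helper and all example.org URIs are hypothetical and exist only for this example; a real application would more likely use an RDF library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


def is_leaf(subject, triples):
    """True if every statement about `subject` has a literal object.

    A subject with no statements at all also counts as a leaf here.
    """
    return all(isinstance(o, Literal) for s, _p, o in triples if s == subject)


EX = "http://example.org/"  # hypothetical namespace
book = URIRef(EX + "book/1")
author = URIRef(EX + "person/1")
triples = [
    (book, URIRef(EX + "title"), Literal("Some Book")),
    (book, URIRef(EX + "creator"), author),  # link to another resource
    (author, URIRef(EX + "name"), Literal("Some Author")),
]

print(is_leaf(book, triples))    # False: the book links to its author
print(is_leaf(author, triples))  # True: only literal-valued properties
```

The author resource can be linked to from the book, yet its own description contains no outgoing links.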

Richard

[1] http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData



>
> --Golda
>
>
> On Wednesday 07 March 2007 09:25, Richard Cyganiak wrote:
>> Golda,
>>
>> On 7 Mar 2007, at 17:37, Golda Velez wrote:
>>> - about the ISBN: stuff.  Of course we should be able to use those
>>> identifiers, but yes they need a namespace otherwise someone will be
>>> referring to the "International Society for Butter Nuts".  That can be
>>> solved if the info: people or someone else offers to provide a stable URL
>>> referencing them.  Like, info://ns.info/2007/isbn/NNNNNNN
>>> If the namespace owner is trusted and says they will not reinvent the
>>> meaning of that URL, that's good enough.  It would be nice if w3 or one
>>> other central org was in the habit of providing such urls.  Note a web
>>> page or HTTP anything is not necessary if it's not a full ontology - it's
>>> just a URL to ref a concept!
>>
>> If it's OK with you that the URI is not resolvable, then why not just
>> use urn:isbn:XXXXX? It's a standard [1].
>>
>> Of course it would be nice if someone would provide HTTP URIs that
>> actually return useful data. We tried to do this with the RDF Book
>> Mashup [2], using URIs like this:
>>
>> http://www4.wiwiss.fu-berlin.de/bookmashup/books/006251587X
>>
>> It uses the Amazon API to retrieve some data about the book. But as
>> you say, a highly trusted namespace owner would be better. I don't
>> think this is W3C's job though. Setting up an HTTP server for this
>> kind of stuff isn't *that* hard.
>>
>>> - tools to go back and forth from the RDF world for us old sql
>>> programmers.  I have a semi-semantic web type app meant for joe user that
>>> has 8000+ categories and 80000+ items that can all relate to each other,
>>> and needs flexible and fast ways to display subsets.  So I have it all in
>>> mysql.  There is a table with a RELATION field with a small set of allowed
>>> relations.  It seems to me there should eventually be simple tools for
>>> porting this kind of data into and out of RDF.  Not everyone wants to use
>>> RDF as the back end all the time, and not all the features of ontologies
>>> are needed for all applications, but any valid semantic data should be
>>> able to be exposed for use.
>>
>> Very true, and a number of applications exist for this purpose. See
>> [3] for a list. (I'm one of the authors of D2RQ and D2R Server, which
>> I think are fairly good at solving this kind of problem, though not
>> yet as simple as I'd like. Check out [4] for a bibliography database
>> of 800k+ records mapped to RDF using D2R Server.)
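
As a rough idea of what such a port involves, here is a minimal hand-rolled Python sketch that turns rows of a relation table into N-Triples lines. It is not how D2RQ or D2R Server work. The in-memory sqlite3 database stands in for MySQL, and the table name, column names and example.org namespace are all hypothetical.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for items and relations


def rows_to_ntriples(conn):
    """Map each (item, relation, target) row to one N-Triples line."""
    query = "SELECT item, relation, target FROM links ORDER BY item, rowid"
    lines = []
    for item, relation, target in conn.execute(query):
        lines.append(
            f"<{BASE}item/{item}> <{BASE}rel/{relation.lower()}> "
            f"<{BASE}item/{target}> ."
        )
    return lines


# Tiny stand-in for the application's relation table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (item INTEGER, relation TEXT, target INTEGER)")
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?)",
    [(1, "DEPENDS_ON", 2), (2, "CREATED_BY", 3)],
)

ntriples = rows_to_ntriples(conn)
print("\n".join(ntriples))
```

Because the set of allowed relations is small, each RELATION value can simply become one property URI.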
>>
>> Yours,
>> Richard
>>
>> [1] http://www.ietf.org/rfc/rfc3187.txt
>> [2] http://www4.wiwiss.fu-berlin.de/bookmashup/
>> [3] http://esw.w3.org/topic/RdfAndSql
>> [4] http://www4.wiwiss.fu-berlin.de/dblp/
>>
>>
>>> - it seems to me some relations are pretty universal because they are
>>> abstract.  I'd like to see a lot of reuse in different ontologies of
>>> things like
>>> 	IS_LOCALIZATION_OF
>>> 	IS_PERSONALIZATION_OF
>>> 	DEPENDS_ON
>>> 	CREATED_BY
>>> and so on, these types of words are abstractions humans can relate to
>>> totally different fields, and could be used pretty universally, no?
>>>
>>> Well, as usual I've probably missed the point, but anyway once all the
>>> semweb stuff is simple enough for me to use then you know you have
>>> something happening.
>>>
>>> Last point, probably showing my lack of research - so where is the
>>> registration place with simple categories for all semweb related
>>> things?
>>>
>>> bye all
>>>
>>> --Golda
>>>
>>> ps I wrote a more in-depth (even than the above lengthy mail)
>>> explanation of these kinds of points at http://webglimpse.net/Mapping.pdf .
>>> mainly it's an argument for the info: people or someone trusted to make
>>> versioned, extensible concept urls for standard concept schemes that
>>> haven't been webified yet.
>>>
>>>
>>>
>>>
>>> On Wednesday 07 March 2007 01:20, matthew.west@shell.com wrote:
>>>> <snip>
>>>>> An agent plays a role in many different
>>>>> overlapping communities.  When I tag a photo as being of my car, or I
>>>>> agree to use my car in a car pool, or when I register the car with
>>>>> the Registry of Motor Vehicles, I probably use different
>>>>> ontologies.   There is some finite effort it would take to integrate
>>>>> the ontologies, to establish some OWL (or rules, etc) to link them.
>>>>>
>>>>> - Everyone is encouraged to reuse other people's classes and
>>>>> properties to the greatest extent they can.
>>>>
>>>> MW: One of the counterbalances I find to this is that it is often
>>>> easier/cheaper to reinvent classes than find them (usually lots of
>>>> versions) and decide if any of them really meet your needs. I know I
>>>> see a lot of reinvention.
>>>>
>>>>> - Some ontologies will already exist and be publicly shared by many,
>>>>> such as ical:dtstart, geo:longitude, etc.  This is the single global
>>>>> community.
>>>>
>>>> MW: This is a pure guess, but if we take longitude as an example I
>>>> would be very surprised if there were not at least 100 publicly
>>>> available ontologies that defined longitude. To reduce this, one
>>>> of the things I think we need to do is to develop a sense of
>>>> authoritative source. We need to ask ourselves the question: who
>>>> "owns" this? What is *their* name/definition? This is something we
>>>> try to do with our own reference data. So we recognise ISO country
>>>> codes, rather than invent our own, we recognise a company's product
>>>> name/code when we buy their product, and the company's registered
>>>> name and number, rather than our abbreviation or version of it.
>>>>
>>>>> - Some ontologies will be established by smaller communities of many
>>>>> sizes.
>>>>>
>>>>> Why do I think the structure should be, or will be, fractal?  Clearly
>>>>> there will be many more small communities, local ontologies, than
>>>>> global ones. Why a 1/f distribution? Well, it seems to occur in many
>>>>> systems including the web, and may be optimal for some problems.
>>>>> That we should design for a fractal distribution of ontologies is a
>>>>> hunch.  But it does solve the issue you raise.  Some aspects of the
>>>>> web have been shown to be fractal already.
>>>>>
>>>>> Here are some properties of the interconnections:
>>>>>
>>>>> - The connections between the ontologies may be made after their
>>>>> creation, not necessarily involving the original ontology designers.
>>>>> - There is a cost of connecting ontologies, figuring out how they
>>>>> connect, which people will pay when and only when they need the
>>>>> benefit of extra interoperability.
>>>>> - Sometimes when connecting ontologies, it is so awkward there is
>>>>> pressure to change the terms that one community uses to fit in better
>>>>> with the other community. Again, a finite cost to make the change,
>>>>> against a benefit of more interop.
>>>>
>>>> MW: This is close to the dynamic view that I see. I see ontologies
>>>> start in isolation and then grow. Eventually, they bump into adjacent
>>>> ontologies that have also been growing (many will die of course).
>>>>
>>>> MW: When enough ontologies overlap in a sufficiently annoying and
>>>> expensive way, an effort is undertaken to integrate these ontologies
>>>> to better support integration. This produces an increased centre of
>>>> gravity, and almost immediately small ontologies will spring up at
>>>> the edges, and bigger ontologies will bump into other big ontologies.
>>>>
>>>> MW: This process repeats, as far as I can see indefinitely. I observe
>>>> that - within Shell at least - the time between integrating at one
>>>> level and integrating at the next level up is about 10 years.
>>>>>
>>>>>> Hence the need for a universal model as a common denominator. But
>>>>>> it is striking that the word "interconnection" was used, rather
>>>>>> than "integration". Interconnection reminds me of EAI [2], so
>>>>>> hub-based or point-to-point, where Semantic Web integration (as I
>>>>>> understand it) involves a web-based distributed data base.
>>>>>
>>>>> Yes, if web-based means an overlapping set of many ontologies in a
>>>>> fractal distribution.
>>>>> In this fractal tangle, there will be several recurring patterns at
>>>>> different scales.
>>>>> One pattern is a local integration within (say) an enterprise, which
>>>>> starts point-to-point (problems scale as n^2) and then shifts with EAI
>>>>> to a hub-and-spoke as you say, where the effort scales as n.  Then
>>>>> the hub is converted to use RDF, and that means the hub then plugs
>>>>> into an external bus, as it connects to shared ontologies.
>>>>
>>>> MW: The same kinds of things will happen with the shared ontologies
>>>> as with the enterprise ontologies (moving to a hub and spoke model
>>>> requires an integrating ontology that at least spans the shared
>>>> data).
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Keeping in mind that, as I wrote before in this thread, application
>>>>>> systems store a lot of implicit data (or actually don't store
>>>>>> them), the direct mapping of their data to the SW formats will
>>>>>> cause more problems than it solves. They are based on their own
>>>>>> proprietary data model, and these are unintelligible for other,
>>>>>> equally proprietary, data models.
>>>>>>
>>>>>> The thing puzzling me is how the SW community can see what I cannot
>>>>>> see, and that is how on earth you can achieve what your Activity
>>>>>> Statement says, without such a standard generic data model and
>>>>>> derived standard reference data (taxonomy and ontology). But
>>>>>> perhaps not many SW-ers bother about the need of universal
>>>>>> integration, and are happily operating within their own subdomain,
>>>>>> such as FOAF.
>>>>>
>>>>> So the idea is that in any one message, some of the terms will be
>>>>> from a global ontology, some from subdomains.
>>>>
>>>> MW: Well if this means that we go out to the authoritative source for
>>>> reference data, rather than reinventing it, then that would be
>>>> consistent with what I was saying above. But at the moment, the
>>>> problem I see is that just about everyone thinks they have the right
>>>> to be an authoritative source on whatever they please. This is not
>>>> useful.
>>>>
>>>>> The amount of data which can be reused by another agent will depend
>>>>> on how many communities they have in common, how many ontologies they
>>>>> share.
>>>>>
>>>>> In other words, one global ontology is not a solution to the problem,
>>>>
>>>> MW: But interestingly, something that was the sum of the authoritative
>>>> sources I have been talking about, would be something like a global
>>>> ontology (but not the only one of course - just a dominant one).
>>>>
>>>>> and a local subdomain is not a solution either.  But if each agent
>>>>> uses a mix of a few ontologies of different scale, that forms
>>>>> a global solution to the problem.
>>>>
>>>> MW: I'm not convinced about this, though I will concede that
>>>> authoritative sources might have small or large ontologies with
>>>> variation in the size and spread of their user base. However, I am
>>>> quite confident that we will only get there if we can find a way to
>>>> reduce the use of non-authoritative sources. Of course the web is the
>>>> only chance we have of being able to share these authoritative sources
>>>> effectively.
>>>>>
>>>>> Tim.
>>>>>
>>>>>>
>>>>>> Can anybody enlighten me, at least by pointing to some useful
>>>>>> links?
>>>>>>
>>>>>
>>>>> ummm   http://www.w3.org/DesignIssues/Fractal.html  to which I might
>>>>> add this explanation some time.
>>>>>
>>>>>
>>>>>
>>>>>> Regards,
>>>>>> Hans
>>>>>>
>>>>>> PS The above does not mean that I have no faith in the SW. On the
>>>>>> contrary, I preach the SW gospel. But I just want to understand
>>>>>> where it is moving to.
>>>>>>
>>>>>> [1] http://www.w3.org/2001/sw/Activity
>>>>>> [2] http://en.wikipedia.org/wiki/Enterprise_Application_Integration
>>>>>>
>>>>>> ____________________
>>>>>> OntoConsult
>>>>>> Hans Teijgeler
>>>>>> ISO 15926 specialist
>>>>>> Netherlands
>>>>>> +31-72-509 2005
>>>>>> www.InfowebML.ws
>>>>>> hans.teijgeler@quicknet.nl
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>> -- 
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Golda Velez			http://goldavelez.info
>>> http://bTucson.com		be Tucson - share your info!
>>> http://Webglimpse.Net	search engine software		
>>> 		cell: (520) 440-1420
>>> "Help organize the world - index your own corner of the web!"
>>>
>>>
>>
>
> -- 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Golda Velez		http://goldavelez.info
> Webglimpse.Net		http://webglimpse.net
> Internet WorkShop	http://iwhome.com
> 	cell: (520) 440-1420
> "Help organize the world - index your own corner of the web!"
>
Received on Friday, 9 March 2007 14:49:24 UTC