Re: A discussion: Is semantic web an old fashioned idea? Is it a bubble, unworthy, or an interesting research area - Post your comments

Dan Brickley wrote:

> What we are trying to do is provide a basis for arbitrary
> application data formats to share common structures (eg. the bits that
> represent Documents might use Dublin Core, the bits that represent
> Persons might use FOAF, the bits that represent calendars/schedules
> might use rdf-ical, ...), alongside their own, application-specific data
> structures. 

Fair enough!
We have to agree on the subject matter to be debated.
I still think this (much) more modest goal is beyond the current technology,
and I will try to explain why. I will also try to clarify
exactly WHICH claims I contest; correct me if I am wrong again in guessing about your purposes.

>Not a complicated idea. 

The "idea" (as a goal) is not complicated, yes, 
the underlying problems ARE complicated, tremendously,
and I don't speak of nitty-gritty implementation details
which are horrendously time consuming (silly syntax and the like) but amenable to brute force solutions, whether cpu cycles or developpers toil.

I mean *conceptual* difficulties which are happily overlooked.

> This is very different from a
> fairytale effort to enable abitrary pairs of applications to
> "meaningfully interconnect" in the stronger sense of working
> seamlessly together. The Semantic Web project is an effort
> to remove _arbitrary_ and wasteful barriers between applications.
> We make it possible for them to share some structure, and have partial
> understanding of data created elsewhere.
>
> If you think W3C materials on the SW somehow give the impression we are
> trying to build thinking machines, flying saucers, never-empty coffee
> cups or software that automatically works perfectly with all other
> software, please cite the URLs so we can refine our materials.

TBL seminal paper:
http://www.w3.org/DesignIssues/Semantic.html

TBL> "If an engine of the future combines a reasoning engine with a search engine, it may be able to get the best of both worlds,
   > and actually be able to construct proofs in a certain number of cases of very real impact.
   > It will be able to reach out to indexes which contain very complete lists of all occurrences of a given term, and then use logic to weed out all but those which can be of use in solving the given problem."

To me, such a quote qualifies as "fairytale".
Like it or not, this is some form of "elementary" thinking machine which is
not realizable IN THE CONTEXT YOU CLAIM TO BUILD IT.

To deserve the name of "logic" a rule system must ensure consistency, never leading to a contradiction.
This is extremely difficult to attain even in a closed and controlled environment;
there is absolutely NO chance of getting it when the "rules" of inference come
from all over the place (integrity constraints from the various data sources used in a single application).
And as for checking the consistency of the whole bunch of rules on the fly, forget it: it is not a matter of "just a few (say two) steps of inference" (TBL quote),
it is usually intractable, if not outright undecidable.
Cycorp has been trying a similar game for about 20 years,
with, I would say, more "serious" concerns about the underlying formal problems, and they are still failing.

So, what will happen when conflicting rules from different sources are used?
Garbage as a result, but we are already familiar with this, it's just called a BUG!
Do you see any difference?
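To make this concrete, here is a minimal sketch, in Python, of such a blind merge of rules. The rules and the little forward-chaining loop are entirely made up for the illustration and come from no W3C specification:

# Source A: an intranet policy base (hypothetical).
rules_a = [
    ({"is_employee"}, "has_building_access"),        # employee => access
]

# Source B: a security policy base, written with a different notion of "employee".
rules_b = [
    ({"is_contractor"}, "is_employee"),               # contractors count as employees here
    ({"is_contractor"}, "NOT has_building_access"),   # ...but contractors must not get access
]

def forward_chain(facts, rules):
    """Apply rules until no new facts appear; no consistency checking at all."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

derived = forward_chain({"is_contractor"}, rules_a + rules_b)
print(derived)
# Both "has_building_access" and "NOT has_building_access" are derived:
# the merged rule set is inconsistent, and the naive engine returns either one.

Each rule base is perfectly sensible on its own; the contradiction appears only after the merge, and nothing in the merging step is there to catch it.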

Then, there is the ontology requirement.

Great idea, yes, this IS the proper way to describe the structure and meaning of a SINGLE, closed, stable environment, plus a mandatory set of "hooks" to the inference rules:
the term occurrences in the rules ARE entries into the ontology.
Unfortunately, no two ontologies, even for the same application domain, are likely to be compatible or mergeable.
Although the "main concepts" are, of course, strongly related, the "fringe" cases are nearly always NOT amenable to a common definition or "understanding".

Ontology reconciliation is an open problem.
If not, please show me evidence of actual results or of an imminent breakthrough.
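For illustration only, here is a tiny, invented example of the "fringe case" problem in Python; neither "ontology" below comes from any real vocabulary:

# Ontology 1: a library catalogue. "Document" means a published, citable work.
onto_library = {
    "Document": {"required": ["title", "publisher", "publication_date"],
                 "subclass_of": "Work"},
}

# Ontology 2: a corporate intranet. "Document" means any file on a share,
# drafts and meeting notes included, with no notion of publication at all.
onto_intranet = {
    "Document": {"required": ["filename", "owner", "last_modified"],
                 "subclass_of": "FileSystemObject"},
}

def naive_merge(a, b):
    """Merge two ontologies naively (the later source silently wins) and
    report the terms whose definitions disagree."""
    merged = {**a, **b}
    conflicts = {t: (a[t], b[t]) for t in set(a) & set(b) if a[t] != b[t]}
    return merged, conflicts

merged, conflicts = naive_merge(onto_library, onto_intranet)
print(conflicts)   # "Document" is defined incompatibly by the two sources

The "main concept" Document looks shared, but the two definitions are not mergeable: different required properties, different superclasses, different intended meaning. Which one should an application trust? That is exactly the part no mechanical merge can decide.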

Also, ontologies are not static (just like any other kind of live data):
upgrades need to be applied ("Service pack", does that ring a bell?) and
stored historical data needs to be kept compatible.
Any experience available on ontology upgrading?
I bet this will be much, much worse than the current software mess;
this promises to be part of the (aggravated) problem rather than a solution.
Some automation would be in order, and this is an *interesting research problem*...
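To illustrate the kind of mess a "service pack" on an ontology creates, here is a small hypothetical sketch in Python (the schemas and the record are invented): version 2 splits a property that version 1 treated as atomic, and every record stored under version 1 fails validation against version 2 until someone writes, and then maintains, a migration.

# Hypothetical ontology upgrade: v1 stores "address" as one string,
# v2 splits it into "street" and "city". Historical data was written under v1.
schema_v1 = {"Person": ["name", "address"]}
schema_v2 = {"Person": ["name", "street", "city"]}

historical_record = {"type": "Person", "name": "Alice",
                     "address": "12 Main St, Springfield"}

def validate(record, schema):
    """Check that a record carries exactly the properties its class requires."""
    return set(schema[record["type"]]) == set(record) - {"type"}

print(validate(historical_record, schema_v1))   # True  - fine under the old ontology
print(validate(historical_record, schema_v2))   # False - "obsolete" the day v2 ships

def migrate_v1_to_v2(record):
    """A hand-written migration: brittle, lossy when the address does not
    split cleanly, and one more artifact that now has to be maintained."""
    street, _, city = record["address"].partition(",")
    out = {k: v for k, v in record.items() if k != "address"}
    out["street"], out["city"] = street.strip(), city.strip()
    return out

print(validate(migrate_v1_to_v2(historical_record), schema_v2))   # True, for this record

Multiply that by every class touched by every upgrade, across every data source an application pulls from, and the maintenance burden is the "aggravated problem" I mean.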

Going closer to the "nitty-gritty" is the METADATA.

Another great idea, except for the following:

EXPLICIT metadata brings trouble for these reasons:
- It is computed in advance of its use and, since it depends on the intended use, *some* metadata may simply be obsolete by the time its use is attempted. Another maintenance problem.
- It has to be stored somewhere, increasing both storage requirements and bandwidth requirements on transmission.

EMBEDDED metadata is the absolute scourge:
- It has to be "stuffed in" at some point AND maintained too.
- It enforces (in nearly every case) a rigid tree-like hierarchy which is unlikely to fit the real need for structure description.
- It freezes, casts in concrete, the ontology inadequacies, making it more difficult to evolve.
- Last but not least, it imposes a SINGLE "master" structuring of the data, leaving no opportunity for alternate views and meanings of a data set (see the sketch after this list).
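Here is a small, invented sketch in Python of that last point (no real format implied): once the metadata is embedded in one tree, that tree IS the structure, and every other view has to be rebuilt by crawling it; the same information kept as flat, external statements carries no master hierarchy at all.

# Embedded form: the author -> document nesting is frozen into the data itself.
embedded = {"Alice": [{"title": "Notes", "topic": "RDF"}],
            "Bob":   [{"title": "Memo",  "topic": "RDF"},
                      {"title": "Plan",  "topic": "Budget"}]}

# A topic-centric view only exists if someone crawls the tree and rebuilds it by hand.
by_topic = {}
for author, docs in embedded.items():
    for doc in docs:
        by_topic.setdefault(doc["topic"], []).append((doc["title"], author))

# External form: flat (subject, property, value) statements with no master tree.
statements = [("Notes", "author", "Alice"), ("Notes", "topic", "RDF"),
              ("Memo",  "author", "Bob"),   ("Memo",  "topic", "RDF"),
              ("Plan",  "author", "Bob"),   ("Plan",  "topic", "Budget")]

def group_by(stmts, prop):
    """Group subjects by the value of any property: author, topic, whatever."""
    view = {}
    for subj, p, val in stmts:
        if p == prop:
            view.setdefault(val, []).append(subj)
    return view

print(by_topic)
print(group_by(statements, "topic"))    # same information, no hierarchy imposed
print(group_by(statements, "author"))   # an alternate view costs nothing extra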

To summarize:

- The purported "advances" of Semantic Web technologies are likely to bring more problems than solutions WITHIN THE CONTEXT OF USE THEY ARE SOLD FOR.

- Semantic Web technologies ARE useful and cost effective in some limited contexts of high-value applications.

- Most of the announced features are in fact still in the research stage and not likely to reach maturity anytime soon.

For those who did not bother to read
http://www.kmentor.com/socio-tech-info/archives/000330.html
here is a quote from Mentor Cana:

MC> Further, the article states:
> 
> "The Semantic Web is based on two fundamental concepts: 1) The description of the meaning of the content in the Web, and 2) The automatic manipulation of these meanings."
> 
> As far as 1) is concerned, the description itself is just another data (or information), i.e. metadata (or metainformation). In any case, the proper software tools have to be built to 'understand' the metadata/metainformation.
> 
> As far as 2) is concerned and the manipulation of meanings, this is a bit skeptical because to the machines, as I've tried to explain elsewhere here and here, those descriptions are just data it can manipulate and not meanings.

A little decency would suggest that the word "Semantic" be scrapped;
John Sowa is absolutely right about that:

JS> But what
  > bothers me about the semantic web is that people have
  > been pumping so much hot air into this balloon that
  > it will inevitably explode and end up tarnishing any
  > technology that is remotely associated with the word "semantics".

Comments and rebuttals?

-- Jean-Luc Delatre

P.S. There are indeed research directions which deserve the
epithet "Semantic"
http://lsa.colorado.edu/papers/plato/plato.annote.html
Unfortunately they don't mesh with logic and are good only at search.
Could be enough, Microsoft and Google are pretty aware of that ;-)
But this is another story.
-----------------------------------------------------------------------
"For every complex problem there is an answer
 that is clear, simple, and wrong." -- H L Mencken
-----------------------------------------------------------------------
 http://perso.club-internet.fr/jld/  -- GSM: +33 6 11 24 06 29

Received on Friday, 18 June 2004 06:00:46 UTC