Myths of the Semantic Web - Popular Misconceptions for Why it Won't Work

SWEO Group-

I am at the airport now after participating in a panel talk at the InfoWorld
SOA Executive Forum (actually filling in for Susie who had prior
commitments) and there was some discussion that I thought relevant to our
efforts here.

Of the many hesitations and criticisms raised about the Semantic Web vision,
the two most prominent ones were essentially the following:

(1) The Semantic Web requires people to (re)tag everything, and
(2) The Semantic Web is a top-down (central ontology) approach that eschews
the way people really work.

Of course these are misconceptions of the gravest kind, and also persistent
untruths that have lingered for many years.  I personally would
consider SWEO a success if these memes can somehow be reversed.

The broader theme replayed here, among other sources in popular media, was
that Web 2.0 is by the people, for the people - whereas Semantic Web is by
the academics and not really useful for much at all.

My approach at this panel was to build upon an example raised by the
moderator with GIS (geographic information systems) as the topic area. We
were in the midst of discussing how the semantic web can disambiguate the
term "location," when the moderator assumed that "lat" and "long" are
universally accepted attributes of location.

I used an example developed last year by Xavier Lopez (Oracle), John
Goodwin (UK Ordnance Survey), and myself. Consider the concept
"EmergencyEvacuationCenter" (EEC). The semantic web languages allow us to
specify this concept declaratively as the intersection of multiple
attributes, perhaps including "SquareFootage," "FacilityTypes,"
"Elevation," "ProximityToFloodBarriers," etc. Since we can do this
declaratively, we absolutely do not need to tag "Building" data in multiple
databases directly as being (or not being) an "EmergencyEvacuationCenter."
Instead, different user communities may define the attributes of an EEC in
their own way (e.g., as policies) and declaratively retrieve the Buildings
that fit their own definition of EEC. Thus, neither "exhaustive tagging"
nor a "shared definition" of what an "EmergencyEvacuationCenter" is need
be required.
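To make the idea concrete, here is a minimal sketch in plain Python. It is
only an analogy: ordinary predicate functions stand in for what would really
be declarative class definitions in a semantic web language. The building
records, thresholds, and community names are all hypothetical; only the
attribute names come from the example above.

```python
# Hypothetical building records, as they might sit in separate databases.
# Note that no record carries an "is an EEC" tag.
BUILDINGS = [
    {"name": "Lincoln High School", "square_footage": 120000,
     "facility_type": "School", "elevation_m": 35, "km_to_flood_barrier": 4.0},
    {"name": "Riverside Gym", "square_footage": 40000,
     "facility_type": "Gymnasium", "elevation_m": 3, "km_to_flood_barrier": 0.2},
    {"name": "Hilltop Arena", "square_footage": 250000,
     "facility_type": "Arena", "elevation_m": 60, "km_to_flood_barrier": 9.0},
]


def county_eec(building):
    """One community's (assumed) policy: large and on high ground."""
    return building["square_footage"] >= 100000 and building["elevation_m"] >= 30


def relief_agency_eec(building):
    """Another community's (assumed) policy: schools and gyms of moderate size."""
    return (building["facility_type"] in {"School", "Gymnasium"}
            and building["square_footage"] >= 30000)


def find_eecs(definition):
    """Retrieve the buildings that satisfy a given definition of EEC."""
    return [b["name"] for b in BUILDINGS if definition(b)]


print(find_eecs(county_eec))         # ['Lincoln High School', 'Hilltop Arena']
print(find_eecs(relief_agency_eec))  # ['Lincoln High School', 'Riverside Gym']
```

The same untagged data answers both questions, and each community can
change its definition without touching the data or the other's definition.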

In this way, we can support a continuously evolving set of multiple, equally
valid "truths" about the data. Decidedly different from Web 2.0.

Another issue raised with the Semantic Web, and a more accurate one IMHO,
has to do with archiving decades' worth of data. When the scope of data
analysis needs to span years, decades, and centuries, both the scale and
provenance capability of a SemWeb infrastructure can be brought into
question.  I don't have any pat answers to dispel this concern, but maybe
some of you here do.  Please share.

One possible, albeit limited, approach is to consider the metaphor of a
"Jukebox" versus an iPod.  When we listen to music on an iPod, all our songs
are there (e.g., on the hard drive) and available instantaneously. In the
days of the Jukebox, however, the machine had to mechanically fetch an album
and put it under the needle to play. Similarly, I think there is a way to
handle vast amounts of SemWeb data in this manner by storing the XML
serialization separately from the place where we instantiate the graph. In
other words, we could use straight text search (such as Google) to find sets
of historic models, which are then individually loaded and instantiated
within a graph database, ABox, or whatever for runtime queries using the
inference expressivity available at that moment.
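A rough sketch of the Jukebox idea, assuming a toy XML format and an
in-memory set of triples in place of real RDF/XML and a real graph store.
The archive contents, file names, and element names are all made up for
illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical archive: each historic model stays "on the shelf" as text.
ARCHIVE = {
    "flood-2004.xml": (
        '<model><triple s="BuildingA" p="elevation" o="12"/>'
        '<triple s="BuildingA" p="type" o="School"/></model>'
    ),
    "flood-2005.xml": '<model><triple s="BuildingB" p="elevation" o="3"/></model>',
    "traffic-2005.xml": '<model><triple s="Road7" p="lanes" o="4"/></model>',
}


def find_models(keyword):
    """Plain text search over the serialized models; nothing is parsed yet."""
    return sorted(name for name, text in ARCHIVE.items() if keyword in text)


def load_graph(names):
    """Parse only the selected models and instantiate them as a set of triples."""
    graph = set()
    for name in names:
        root = ET.fromstring(ARCHIVE[name])
        for t in root.iter("triple"):
            graph.add((t.get("s"), t.get("p"), t.get("o")))
    return graph


def query(graph, predicate):
    """Runtime query against whatever is currently loaded."""
    return sorted((s, o) for s, p, o in graph if p == predicate)


selected = find_models("elevation")   # fetch the right "albums"
graph = load_graph(selected)          # put them "under the needle"
print(selected)                       # ['flood-2004.xml', 'flood-2005.xml']
print(query(graph, "elevation"))      # [('BuildingA', '12'), ('BuildingB', '3')]
```

The traffic model is never parsed, so the cost of instantiation follows the
query rather than the size of the archive.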

Of course this won't work where graph edges need to be materialized from
property relations that span an entire and complete set of historic models -
but since the current state of the art prevents the loading of hundreds or
thousands of billions of triples/individuals, we must find some viable, if
partial, workarounds for the community. In the past I've heard this notion
discussed as "waxing the floor" - when we wax our floors routinely, we only
wax the 10% that gets the highest traffic. Similarly, we need not
instantiate (materialize) entire graphs at once; instead we instantiate only
the sets that are most probably relevant to the query at hand.
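One way to picture "waxing the floor" is as a selection step that ranks the
archived models against the query and loads only the best matches within a
size budget. The sketch below assumes a crude term-count relevance score and
made-up model summaries and triple counts; a real system would need a far
better notion of relevance.

```python
# Hypothetical model summaries: name -> (descriptive text, triple count).
MODELS = {
    "evac-2005": ("building elevation flood barrier shelter", 400),
    "evac-1995": ("building elevation flood", 300),
    "roads-2005": ("road lanes traffic", 500),
    "census-2000": ("population building", 600),
}


def relevance(text, query_terms):
    """Crude score: how often the query terms occur in the model's text."""
    words = text.split()
    return sum(words.count(term) for term in query_terms)


def select_models(query_terms, triple_budget):
    """Pick the most relevant models that fit within the triple budget."""
    ranked = sorted(
        MODELS.items(),
        key=lambda item: (-relevance(item[1][0], query_terms), item[0]),
    )
    chosen, used = [], 0
    for name, (text, size) in ranked:
        if relevance(text, query_terms) == 0:
            break  # everything after this point is irrelevant to the query
        if used + size <= triple_budget:
            chosen.append(name)
            used += size
    return chosen, used


print(select_models(["flood", "building"], 800))
# (['evac-1995', 'evac-2005'], 700)
```

Only the selected models would then be instantiated in the graph; the rest
stay in the archive until a query makes them relevant.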

Shifting gears.

Paul's link & comments
[http://lists.w3.org/Archives/Public/public-sweo-ig/2006Oct/0027.html] as
well as mine [above] regarding the criticisms leveled at the semantic web
should give us pause, and then perhaps prompt us to take inventory of the
public's current misperceptions.  Once we can say that we've addressed
current misunderstandings, with exemplar use cases perhaps, then we can move
along to suggest even more broad-based and visionary value. It seems to me
that we should first start by getting people past the intellectual hurdle
of accepting that the Semantic Web is more than "research fun" by pointing
out where their assumptions are incorrect.

I tried to address this, albeit briefly, in a blog post last year
[http://alwayson.goingon.com/permalink/post/5629] where I discussed the
popular myths of the Semantic Web.  One thing that seems as true today
as it was then is that Clay Shirky's simplistic critique of the Semantic
Web [http://www.shirky.com/writings/semantic_syllogism.html] is still being
cited by many as their reason for why this won't succeed.

There are plenty of "Myths" out there, such as:

-	Semantic Web makes you tag everything again
-	Semantic Web requires a single global ontology
-	Semantic Web won't scale enough to be useful
-	Semantic Web is too complex for people to ever understand
-	Semantic Web is only about trivial syllogisms
-	Semantic Web is not substantively better than XML
-	Semantic Web is for academia
-	Semantic Web is top down, whereas Web 2.0 is bottom up (thus better)

There are powerful examples for why each of these is untrue - does this
group feel it would be worthwhile to collectively deliver a message about
these popularisms?

-Jeff-



-----Original Message-----
Hi to Jeff and all those who introduced themselves recently.

 

Thought you might be interested to see this from TechCrunch UK
http://uk.techcrunch.com/2006/10/30/tagging-microformats-and-rss-beat-the-semantic-web/

 

Paul

Received on Wednesday, 8 November 2006 13:08:43 UTC