Re: A vision about a Semantic Web Search&Reasoning Engine from Danny Ayers on 2005-11-25 (semantic-web@w3.org from November 2005)

From: Danny Ayers <danny.ayers@gmail.com>
Date: Fri, 25 Nov 2005 13:37:33 +0100
To: Bjoern Hoehne <semantic-web@lists.unreach.net>
Cc: semantic-web@w3.org
Message-ID: <1f2ed5cd0511250437y73a6754aka28d10e197bce53a@mail.gmail.com>

On 11/23/05, Bjoern Hoehne <semantic-web@lists.unreach.net> wrote:

> So before we try to figure out how to integrate the individual components
> into one large architecture and before we try to crawl the Semantic Web I
> want to ask this list one simple question: Is this possible?

I don't see why not. Simple matter of programming ;-)

Ok, limiting the search terms to 2 keywords is the kind of thing that
should help with the generally hard task of understanding natural
language queries, there may be other little tricks too. I saw a great
demo at SWAP2004 where queries were built from natural language in
sequence, each step restricting the area of the solution space, *and*
the possible remaining query space. (Looks like the site hosting the
SWAP2004 material is still having trouble - anyone remember/got a link
for this paper?).

The other hard parts I suspect will be scale and performance (fairly
safe prediction ;-). Making the system have a distributed architecture
from the ground up is probably a good idea, if possible.
Loosely-coupled components, and all that. One Web-friendly approach
might be for the service to rewrite the search terms as SPARQL
queries, push those on to both local and remote stores (hosted by
whoever - maybe Swoogle would be a good candidate, and how about
individual PiggyBanks..?) and then use the reasoning engine to make
something cohesive from the results.  Keying in to other data stores
should also reduce the amount of work you have to do ;-)
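
To make the rewrite-and-fan-out idea a bit more concrete, here is a
rough Python sketch. Everything in it is an assumption for
illustration: the function names, the choice of matching on rdfs:label
with a regex FILTER, and the modelling of each store as a plain
callable that takes a SPARQL string and returns rows. A real service
would talk to the stores over HTTP and hand the merged rows to the
reasoner rather than just de-duplicating them.

```python
from concurrent.futures import ThreadPoolExecutor


def keywords_to_sparql(keywords, limit=20):
    """Build a naive SPARQL query: resources whose rdfs:label matches every keyword."""
    if not 1 <= len(keywords) <= 2:
        raise ValueError("expected one or two keywords")
    filters = []
    for kw in keywords:
        # Escape backslashes and quotes so the keyword stays inside the string literal.
        # Note: the keyword is still interpreted as a regex pattern by the store.
        escaped = kw.replace("\\", "\\\\").replace('"', '\\"')
        filters.append(f'  FILTER regex(str(?label), "{escaped}", "i")')
    return "\n".join([
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT DISTINCT ?resource ?label WHERE {",
        "  ?resource rdfs:label ?label .",
        *filters,
        "}",
        f"LIMIT {limit}",
    ])


def federated_search(keywords, stores):
    """Send the same query to every store in parallel and merge rows by resource URI.

    `stores` is a list of callables (hypothetical stand-ins for local and
    remote SPARQL endpoints), each returning a list of dicts with a
    "resource" key.
    """
    query = keywords_to_sparql(keywords)
    with ThreadPoolExecutor(max_workers=max(1, len(stores))) as pool:
        batches = list(pool.map(lambda store: store(query), stores))
    merged = {}
    for batch in batches:
        for row in batch:
            # First store to report a resource wins; later duplicates are dropped.
            merged.setdefault(row["resource"], row)
    return list(merged.values())
```

Calling federated_search(["semantic", "web"], [local, remote]) with two
such callables returns one row per distinct resource, in the order the
stores were listed.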

Making a service endpoint available (e.g. SPARQL-based) for machine
access would increase the utility of the system massively, put it
properly on the Semantic Web (it's a shame Google Base haven't twigged
that yet).

Another avenue to dealing with the scale/performance issues may be to
support delayed results. i.e. I go query the system, it tells me what
it knows *now*, but also sets async queries in action with remote
systems. When they've got more results they pass these back to the
centre (maybe via callback URIs). Revisiting the query site
seconds/months later and asking the same question again (or following
a link provided by the system) reveals the additional info.
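
A minimal sketch of the delayed-results pattern, in Python. The class
and method names are made up for this example, the remote systems are
again plain callables, and background threads stand in for whatever
transport (callback URIs or otherwise) a real deployment would use.

```python
import threading


class DelayedResultsService:
    """Answer with what is known now; let remote answers arrive later."""

    def __init__(self, local_store, remote_stores):
        self.local_store = local_store      # callable: query -> list of results
        self.remote_stores = remote_stores  # list of such callables
        self._results = {}                  # query -> results gathered so far
        self._pending = {}                  # query -> background threads
        self._lock = threading.Lock()

    def ask(self, query):
        """Return current results; on first sight of a query, start remote lookups."""
        with self._lock:
            first_time = query not in self._results
            if first_time:
                self._results[query] = list(self.local_store(query))
                threads = [
                    threading.Thread(target=self._fetch, args=(query, remote))
                    for remote in self.remote_stores
                ]
                self._pending[query] = threads
        if first_time:
            for thread in threads:
                thread.start()
        with self._lock:
            return list(self._results[query])

    def _fetch(self, query, remote):
        # Runs in the background; reports back through the callback.
        self.callback(query, remote(query))

    def callback(self, query, new_results):
        """Entry point for remote systems passing results back to the centre."""
        with self._lock:
            known = self._results.setdefault(query, [])
            for result in new_results:
                if result not in known:
                    known.append(result)

    def wait(self, query):
        """Block until all remote lookups for a query are done (handy for testing)."""
        for thread in self._pending.get(query, []):
            thread.join()
```

The first ask() returns only the local results; asking the same
question again after the remote systems have called back returns the
local results plus whatever arrived in the meantime.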

If you're looking at keyword-based search, then there's probably a
lot of potential in mining data from tagging (folksonomy) sites like
del.icio.us, Technorati, Nature's academic tagging setup (whose name
I've forgotten) etc, integrated into RDF/OWL using maybe a Tag
Ontology (e.g. [1]) plus SKOS.
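
As a rough illustration of the SKOS half of that, the Python sketch
below turns one tagged bookmark into N-Triples. It is only a sketch
under stated assumptions: the http://example.org/tag/ base URI and the
function name are hypothetical, dc:subject is just one plausible way
to link the bookmark to the concept, and the Tag Ontology terms from
[1] are left out entirely.

```python
from urllib.parse import quote

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"
TAG_BASE = "http://example.org/tag/"  # hypothetical namespace for minted concepts


def tagged_bookmark_to_ntriples(url, tags):
    """Emit N-Triples lines treating each tag on a bookmark as a skos:Concept."""
    lines = []
    for tag in tags:
        # Mint a concept URI from the lower-cased, percent-encoded tag.
        concept = TAG_BASE + quote(tag.lower(), safe="")
        # Escape backslashes and quotes for the N-Triples string literal.
        label = tag.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f"<{concept}> <{RDF_TYPE}> <{SKOS}Concept> .")
        lines.append(f'<{concept}> <{SKOS}prefLabel> "{label}" .')
        lines.append(f"<{url}> <{DC_SUBJECT}> <{concept}> .")
    return lines
```

Each tag yields three triples: the concept's type, its label, and the
link from the bookmarked URL to the concept.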

Just as an example of cunning use of existing services, check [2].

Cheers,
Danny.

[1] http://www.holygoat.co.uk/projects/tags/
[2] http://www.hackdiary.com/archives/000070.html

--

http://dannyayers.com

Received on Friday, 25 November 2005 12:37:55 UTC