- From: Danny Ayers <danny.ayers@gmail.com>
- Date: Fri, 25 Nov 2005 13:37:33 +0100
- To: Bjoern Hoehne <semantic-web@lists.unreach.net>
- Cc: semantic-web@w3.org
On 11/23/05, Bjoern Hoehne <semantic-web@lists.unreach.net> wrote:

> So before we try to figure out how to integrate the individual components
> into one large architecture and before we try to crawl the Semantic Web I
> want to ask this list one simple question: Is this possible?

I don't see why not. Simple matter of programming ;-)

Ok, limiting the search terms to 2 keywords is the kind of thing that should help with the generally hard task of understanding natural language queries; there may be other little tricks too. I saw a great demo at SWAP2004 where queries were built from natural language in sequence, each step restricting the area of the solution space, *and* the possible remaining query space. (Looks like the site hosting the SWAP2004 material is still having trouble - anyone remember/got a link for this paper?)

The other hard parts I suspect will be scale and performance (fairly safe prediction ;-). Making the system have a distributed architecture from the ground up is probably a good idea, if possible. Loosely-coupled components, and all that.

One Web-friendly approach might be for the service to rewrite the search terms as SPARQL queries, push those on to both local and remote stores (hosted by whoever - maybe Swoogle would be a good candidate, and how about individual PiggyBanks..?) and then use the reasoning engine to make something cohesive from the results. Keying in to other data stores should also reduce the amount of work you have to do ;-)

Making a service endpoint available (e.g. SPARQL-based) for machine access would increase the utility of the system massively, and put it properly on the Semantic Web (it's a shame Google Base haven't twigged that yet).

Another avenue to dealing with the scale/performance issues may be to support delayed results, i.e. I query the system, it tells me what it knows *now*, but also sets async queries in action with remote systems.
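Just to make the query-rewriting idea concrete, here's a rough sketch in Python - the keyword-to-SPARQL mapping is naive (regex over rdfs:label) and the endpoint URLs are placeholders, not real services:

```python
# Rough sketch: rewrite two keywords as a SPARQL query and fan it
# out to several stores. Endpoint URLs below are just placeholders.
import urllib.parse

def keywords_to_sparql(kw1, kw2):
    """Build a naive SPARQL query matching resources whose labels
    mention both keywords (case-insensitive regex match)."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?s ?label WHERE {\n"
        "  ?s rdfs:label ?label .\n"
        '  FILTER (regex(?label, "%s", "i") && regex(?label, "%s", "i"))\n'
        "} LIMIT 50" % (kw1, kw2)
    )

def endpoint_urls(query, endpoints):
    """Encode the query for each store's HTTP GET interface."""
    q = urllib.parse.quote(query)
    return [e + "?query=" + q for e in endpoints]

query = keywords_to_sparql("jazz", "trumpet")
urls = endpoint_urls(query, [
    "http://localhost:2020/sparql",       # local store
    "http://swoogle.example.org/sparql",  # imaginary remote store
])
print(query)
```

The merging/reasoning step over the returned bindings is where the real work lives, of course.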
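The delayed-results idea, as a toy in-memory sketch (all the class and method names here are invented for illustration; a real version would hand out callback URIs over HTTP):

```python
# Toy sketch of delayed results: answer with what we know *now*,
# register pending remote queries, and fold new results in when a
# remote store calls back. Names are invented for illustration.

class DelayedResults:
    def __init__(self):
        self.known = {}    # query -> set of results already held
        self.pending = {}  # callback id -> query still in flight

    def ask(self, query, remotes):
        """Return current results and kick off async remote queries."""
        for i, remote in enumerate(remotes):
            self.pending["cb-%s-%d" % (query, i)] = query
        return sorted(self.known.get(query, set()))

    def callback(self, cb_id, results):
        """A remote store posts results back via its callback id."""
        query = self.pending.pop(cb_id)
        self.known.setdefault(query, set()).update(results)

svc = DelayedResults()
first = svc.ask("jazz trumpet", ["storeA", "storeB"])  # nothing yet
svc.callback("cb-jazz trumpet-0", {"Miles Davis"})
later = svc.ask("jazz trumpet", [])                    # now has data
```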
When they've got more results they pass these back to the centre (maybe via callback URIs). Revisiting the query site seconds/months later and asking the same question again (or following a link provided by the system) reveals the additional info.

If you're looking at keyword-based search, then there's probably a lot of potential in mining data from tagging (folksonomy) sites like del.icio.us, Technorati, Nature's academic tagging setup (whose name I've forgotten) etc., integrated into RDF/OWL using maybe a Tag Ontology (e.g. [1]) plus SKOS. Just as an example of cunning use of existing services, check [2].

Cheers,
Danny.

[1] http://www.holygoat.co.uk/projects/tags/
[2] http://www.hackdiary.com/archives/000070.html

--
http://dannyayers.com
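P.S. For the tag-mining angle, a very rough sketch of turning one tagging into triples - the namespace URIs and property choices below are my assumptions about how the Tag Ontology [1] and SKOS might be combined, not a tested mapping:

```python
# Very rough sketch: represent a del.icio.us-style tagging as RDF
# triples (subject, predicate, object tuples), giving the tag a
# SKOS label. Namespaces and property choices are assumptions.
TAGS = "http://www.holygoat.co.uk/owl/redwood/0.1/tags/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF  = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def tagging_to_triples(page_uri, tag_name):
    """Emit triples linking a tagged page to its tag resource."""
    tag_uri = TAGS + tag_name
    return [
        (page_uri, TAGS + "taggedWithTag", tag_uri),
        (tag_uri, RDF + "type", TAGS + "Tag"),
        (tag_uri, SKOS + "prefLabel", tag_name),
    ]

triples = tagging_to_triples("http://example.org/some-page", "semweb")
for t in triples:
    print(t)
```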
Received on Friday, 25 November 2005 12:37:55 UTC