[Fwd: [DBGROUP ML] Seminar Monday 12:45]

Interesting ...

Forwarded message 1

  • From: Stefano Ceri <ceri@elet.polimi.it>
  • Date: Fri, 08 May 2009 15:08:32 +0200
  • Subject: [DBGROUP ML] Seminar Monday 12:45
  • To: <dbgroup@elet.polimi.it>
  • Message-ID: <list-23690887@elet.polimi.it>
12:45, Seminar Room

Cristian Duda, ETH Zurich

AJAX Crawl: Making AJAX Applications Searchable

Abstract:

Current search engines such as Google and Yahoo! are prevalent for searching the Web.
Search on dynamic client-side Web pages is, however, either nonexistent or far from
perfect, and is not addressed by existing work, for example on the Deep Web. This is a
real impediment, since AJAX and Rich Internet Applications are already very common on
the Web. AJAX applications are composed of states that can be seen by the user but not
by the search engine, and that the user changes through client-side events. Current
search engines either ignore AJAX applications or produce false negatives. The reason is
that crawling client-side code is a difficult problem that cannot be solved naively by
invoking user events. The challenges are: lack of caching, duplicate state detection,
very granular events, reducing the number of AJAX calls, and infinite event invocation.
This paper sets the stage for this new search challenge and proposes a solution: it
shows how an AJAX Web application can be crawled at the granularity of application
states. A model of AJAX Web sites is presented. An AJAX Crawler and optimizations for
caching and duplicate elimination are defined, and finally, the gain in search result
quality and the corresponding performance price are evaluated on YouTube, a real AJAX
application.
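
To make the idea of crawling "at the granularity of application states" concrete, here is a minimal sketch in Python. It is not the crawler from the talk: the toy application (TOY_APP), the function names, and the choice of a SHA-1 content hash for duplicate detection are all assumptions made for illustration, and no real browser or AJAX call is involved.

```python
import hashlib
from collections import deque

# Hypothetical toy application (an assumption for illustration): each key is
# the visible content of a state, each value maps an event name to the
# content that results from firing that event.
TOY_APP = {
    "video page, comments 1-10": {"next": "video page, comments 11-20"},
    "video page, comments 11-20": {
        "next": "video page, comments 21-30",
        "prev": "video page, comments 1-10",
    },
    "video page, comments 21-30": {"prev": "video page, comments 11-20"},
}


def state_id(content: str) -> str:
    """Identify a state by a hash of its content, so duplicates collapse."""
    return hashlib.sha1(content.encode("utf-8")).hexdigest()


def crawl(initial: str, app: dict, max_states: int = 100):
    """Breadth-first crawl over states, skipping states already seen."""
    seen = {state_id(initial): initial}  # state id -> content
    transitions = []                     # (source id, event, target id)
    queue = deque([initial])
    while queue:
        content = queue.popleft()
        src = state_id(content)
        # Fire every event available in this state (simulated by a lookup).
        for event, target in sorted(app.get(content, {}).items()):
            dst = state_id(target)
            transitions.append((src, event, dst))
            # Duplicate elimination: only enqueue content not seen before.
            # max_states bounds the crawl so event loops cannot run forever.
            if dst not in seen and len(seen) < max_states:
                seen[dst] = target
                queue.append(target)
    return seen, transitions
```

Calling crawl("video page, comments 1-10", TOY_APP) visits each of the three comment pages once, even though the "prev" events lead back to states that were already reached.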

Biography:
Cristian Duda is a recent PhD graduate from ETH Zurich (Swiss Federal Institute of
Technology). His research interests lie between information retrieval and databases. His
research targets searching application data on the desktop and in the enterprise world,
as well as searching dynamic Web applications (AJAX, Rich Internet Applications), which
are incorrectly searched by current search engines. More generally, Web technologies
such as Web Services, XML, and Web Application Frameworks are permanent sources of
inspiration in his research.



Prof. Stefano Ceri
Dipartimento di Elettronica e Informazione
Piazza L. da Vinci 32 - 20133 Milano
http://home.dei.polimi.it/ceri/
Tel. +39-02-23993532
Fax. +39-02-23993411



Received on Friday, 8 May 2009 13:50:44 UTC