
Re: [ANN] DBpedia Spotlight v0.5 Released (Text Annotation with DBpedia)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 29 Sep 2011 14:31:43 -0400
Message-ID: <4E84B98F.1000300@openlinksw.com>
To: public-lod@w3.org

On 9/29/11 12:52 PM, Pablo Mendes wrote:
> Hi all,
> We are happy to announce the release of DBpedia Spotlight v0.5 - 
> Shedding Light on the Web of Documents.
> DBpedia Spotlight is a tool for annotating mentions of DBpedia 
> entities and concepts in text, providing a solution for linking 
> unstructured information sources to the Linked Open Data cloud through 
> DBpedia. The DBpedia Spotlight architecture is composed of the 
> following modules:
>     * Web application, a demonstration client (HTML/Javascript UI) 
> that allows users to enter/paste text into a Web browser and visualize 
> the resulting annotated text.
>     * Web Service, a RESTful Web API that exposes the functionality of 
> annotating and/or disambiguating resources in text. The service 
> returns XML, JSON or XHTML+RDFa.
>     * Annotation Java / Scala API, exposing the underlying logic that 
> performs the annotation/disambiguation.
>     * Indexing Java / Scala API, executing the data processing 
> necessary to enable the annotation/disambiguation algorithms used.
> In this release we have provided many enhancements to the Web Service 
> and the installation process, as well as to the spotting, candidate 
> selection, disambiguation and annotation stages. More details on the 
> enhancements are provided below.
> The new version is deployed at:
> * http://spotlight.dbpedia.org/dev/demo/ (Demonstration Web Interface)
> * http://spotlight.dbpedia.org/dev/rest/ (Web Service)
> Instructions on how to use the Web Service are available at: 
> http://spotlight.dbpedia.org
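The Web Service described above can be called from any HTTP client. Here is a minimal Python sketch of building such a request. The /annotate method name, the text and confidence parameters, and the Accept values are assumptions not stated in the announcement, so check the instructions at http://spotlight.dbpedia.org before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL of the "dev" Web Service named in the announcement.
BASE_URL = "http://spotlight.dbpedia.org/dev/rest"

# Assumed mapping from the three announced output formats to Accept values.
ACCEPT = {
    "xml": "text/xml",
    "json": "application/json",
    "xhtml+rdfa": "application/xhtml+xml",
}


def build_annotate_request(text, fmt="json", confidence=None):
    """Build (but do not send) a GET request to the assumed /annotate method."""
    params = {"text": text}
    if confidence is not None:
        params["confidence"] = confidence  # assumed optional parameter
    url = f"{BASE_URL}/annotate?{urlencode(params)}"
    return Request(url, headers={"Accept": ACCEPT[fmt]})
```

Passing the returned object to urllib.request.urlopen() would send it. The function only builds the request, so it can be inspected without touching the server.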
> We invite your comments on the new version before we deploy it on our 
> production server. We will keep it on the "dev" server until October 
> 6th, when we will finally make the switch to the production server at 
> http://spotlight.dbpedia.org/demo/ and http://spotlight.dbpedia.org/rest/
> If you are a user of DBpedia Spotlight, please join 
> dbp-spotlight-users@lists.sourceforge.net for announcements and other 
> discussions.
> Changelog
> Changes since last public release (v0.1):
> * Uses DBpedia 3.7 resources, including types from DBpedia Ontology, 
> Freebase and Schema.org.
> * New Web API method /rest/candidates provides a ranked list of 
> candidates for each surface form. This will allow the use of DBpedia 
> Spotlight in semi-automatic annotation (e.g. of blog posts), where 
> users can "fix" a mistake made by our system by choosing another 
> candidate from the suggestions provided by the service.
> * New disambiguation implementations, including a two-step 
> disambiguator with simpler context scoring that provides up to 200x 
> faster annotation with modest accuracy loss (~7%) in our preliminary 
> tests.
> * SpotSelector classes allow one to discard non-entities early in the 
> process to improve time performance and conformance with annotation 
> policies (e.g. do not annotate common words).
> * jQuery plugin for DBpedia Spotlight allows one to annotate a Web 
> page with one line of JavaScript code: $('div').annotate();
> * Cross Origin Resource Sharing (CORS) is now enabled by default on 
> the Web API, allowing JavaScript code on your page to call our service 
> without the need for proxies.
> * Enhanced candidate selection stage (with approximate matching) 
> improves coverage of candidate URIs for surface forms with small 
> variations in spelling.
> * Debian packaging allows one to install DBpedia Spotlight via the 
> package manager in many Linux distros.
> * Easier installation: fully mavenized process, auto-generated jars, 
> more configuration parameters accessible via property files.
> * Better modularization: dependence on the DBpedia Extraction 
> Framework was moved to module "index". Users that only want to run the 
> service can now ignore that dependence.
> * Web API description provided via Web Application Description 
> Language (WADL). It allows you to create clients automatically via 
> IDEs such as Eclipse, NetBeans, etc.
> * Downloads: full index, compressed index, spotter dictionaries with 
> different thresholds, etc. available from 
> http://spotlight.dbpedia.org/download
> * Removed restriction on the number of characters. Beware that short 
> texts will yield lower-quality annotations since they normally provide 
> less context for disambiguation.
> * Accepts POST requests in addition to GET. This allows longer text. 
> Unless explicitly specified, long texts automatically use the 
> Document-centric (faster) disambiguation algorithm.
> * A bookmarklet allows users to select text in any Web page using 
> their good old browser and call DBpedia Spotlight directly from there 
> in order to obtain annotated text.
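The semi-automatic annotation use case mentioned for /rest/candidates in the changelog above can be illustrated with a small Python sketch. It assumes the ranked candidates have already been parsed into a plain dictionary. That shape, the function name and the example URIs are hypothetical and are not the service's actual response format.

```python
def pick_annotations(candidates, overrides=None):
    """Choose one URI per surface form from ranked candidate lists.

    candidates: dict mapping surface form -> list of URIs, best first
                (a hypothetical shape, not the service's response format).
    overrides:  dict mapping surface form -> list index chosen by the user.
    """
    overrides = overrides or {}
    chosen = {}
    for surface_form, ranked in candidates.items():
        if not ranked:
            continue  # nothing to link for this surface form
        index = overrides.get(surface_form, 0)  # default: top-ranked candidate
        chosen[surface_form] = ranked[index]
    return chosen


# Example data: the system's first choice, then the user's correction.
candidates = {
    "Washington": [
        "http://dbpedia.org/resource/Washington,_D.C.",
        "http://dbpedia.org/resource/George_Washington",
    ],
}
automatic = pick_annotations(candidates)
corrected = pick_annotations(candidates, {"Washington": 1})
```

Keeping the full ranked list on the client side is what lets a user "fix" an annotation without another round trip to the service.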
> Acknowledgements
> Many thanks to the growing community of DBpedia Spotlight users for 
> your feedback and energetic support. We would like to especially thank:
> * Jo Daiber for his great work on better spotters, additional types, 
> cuter interfaces and many other improvements to the tool;
> * Paul Houle for the extensive feedback on the system, great 
> suggestions for improvement and patches;
> * Scott White for the invaluable discussions on the architecture and 
> other advice;
> * Rob DiCiuccio for his real-world use case description and PHP client 
> implementation;
> * Giuseppe Rizzo for his friendly push for releasing the /candidates 
> API and feedback on the API's design;
> * Thomas Steiner and Rainer Simon for opening the Known Uses list 
> (http://dbpedia.org/spotlight/knownuses), and Rob DiCiuccio, A. 
> Elizabeth Cano et al., Ali Khalili, Raphaël Troncy and Giuseppe Rizzo 
> for letting us know of their uses of DBpedia Spotlight.
> With this release we also have the pleasure of welcoming Jo Daiber as 
> a committer. We are looking forward to continuing this fruitful 
> collaboration.
> This release of DBpedia Spotlight was supported by The European 
> Commission through the project LOD2 – Creating Knowledge out of Linked 
> Data (http://lod2.eu/).
> DBpedia Spotlight's source code is provided under the terms of the 
> Apache License, Version 2.0. Part of the code uses LingPipe under the 
> Royalty Free License. The source code can be downloaded from:
> http://sourceforge.net/projects/dbp-spotlight
> A paper describing DBpedia Spotlight was published at I-SEMANTICS 2011:
> Pablo N. Mendes, Max Jakob, Andrés García-Silva and Christian Bizer. 
> DBpedia Spotlight: Shedding Light on the Web of Documents. In the 
> Proceedings of the 7th International Conference on Semantic Systems 
> (I-Semantics). Graz, Austria, 7–9 September 2011.
> Happy annotating!
> Cheers,
> Pablo, Max, Jo, Chris

More data, more smarts, more for machines to learn from, etc.: the 
increasingly virtuous cycle of Linked Open Data!

Imagine trying to pull this off if there weren't an existing Linked Data 
Cloud (of subjectively good or bad quality) :-)



Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen

Received on Thursday, 29 September 2011 18:32:08 UTC
