- From: Nathan <nathan@webr3.org>
- Date: Tue, 02 Feb 2010 16:36:23 +0000
- CC: Davide Palmisano <davide@asemantics.com>, Matthias Samwald <samwald@gmx.at>, public-lod@w3.org
Nathan wrote:
> Davide Palmisano wrote:
>> On Tue, Feb 2, 2010 at 3:39 PM, Matthias Samwald <samwald@gmx.at> wrote:
>>> Davide wrote:
>>>> BTW: and what about http://www.alchemyapi.com ? have you tried it?
>>> AlchemyAPI does not seem to return DBpedia / Wikipedia identifiers (?)
>> yes, read here http://www.alchemyapi.com/api/entity/textc.html you
>> need to specify a parameter to enable this feature. I'm using this
>> tool with proficiency.
>>
>
> Whilst I do like alchemy, I've found you can extract much, much more
> information, of a much higher standard by combining OpenCalais and
> Zemanta in the process outlined in a previous mail.
>
> To illustrate I'll quickly hook in with alchemy again and post a few
> results for comparison shortly.
>

For a quick comparison I've run two documents through both Alchemy and the OpenCalais/Zemanta/lookup combination system to see how they compare. Note that with the Alchemy results I've also included the non-linked-data terms, so you can see why I've not used it in my own system.

=================================================================
TEST 1:
source document: http://webr3.org/__play/optimal/webr3.html

Alchemy Results
=================================================================
Linked Data:
http://dbpedia.org/resource/England : England
http://dbpedia.org/resource/Google : Google

Generic Terms:
FieldTerminology : web 3.0
Company : wikipedia
City : London
FieldTerminology : URIs
Technology : HTML5
StateOrCounty : DC
City : Dublin
FieldTerminology : Web Developers
FieldTerminology : HTML

Notes:
Both "DC" and "Dublin" are incorrect, as the document mentions "Dublin Core".

Combined OpenCalais / Zemanta + dbpedia lookup system:
=================================================================
Linked Data:
http://dbpedia.org/resource/Linked_Data : Linked Open Data, LOD
http://dbpedia.org/resource/RDFa : RDFa
http://dbpedia.org/resource/Semantic_Web : Semantic Web
http://dbpedia.org/resource/HTML : HTML4
http://dbpedia.org/resource/Dublin_Core : Dublin Core
http://dbpedia.org/resource/Resource_Description_Framework : RDF
http://dbpedia.org/resource/Web_page : web pages
http://dbpedia.org/resource/HTML5 : HTML5
http://dbpedia.org/resource/London : London
http://dbpedia.org/resource/Web_design : web designer
http://dbpedia.org/resource/United_Kingdom : United Kingdom
http://dbpedia.org/resource/Web_search_engine : search engine
http://dbpedia.org/resource/Web_developer : Web developer
http://dbpedia.org/resource/Joe_Bloggs : Joe Blogs
http://dbpedia.org/resource/XHTML : XHTML
http://dbpedia.org/resource/Web_2.0 : Web 2.0
http://dbpedia.org/resource/Open_Data : Open Data
http://dbpedia.org/resource/Web_standards : Web standards
http://dbpedia.org/resource/FOAF_%28software%29 : FOAF
http://dbpedia.org/resource/Computing : Computing
http://dbpedia.org/resource/World_Wide_Web : World Wide Web

=================================================================
TEST 2:
source document: http://news.bbc.co.uk/1/hi/world/asia-pacific/8492608.stm

Alchemy Results
=================================================================
Linked Data:
http://dbpedia.org/resource/People's_Republic_of_China : China
http://dbpedia.org/resource/United_States : United States
http://dbpedia.org/resource/Republic_of_China : Taiwan
http://dbpedia.org/resource/Beijing : Beijing
http://dbpedia.org/resource/Communist_Party_of_China : Chinese Communist Party
http://dbpedia.org/resource/Washington,_D.C. : Washington DC
http://dbpedia.org/resource/White_House : White House
http://dbpedia.org/resource/Barack_Obama : Barack Obama
http://dbpedia.org/resource/Google : Google
http://dbpedia.org/resource/Ministry_of_Foreign_Affairs_(People's_Republic_of_China) : Chinese Foreign Ministry
http://dbpedia.org/resource/Boeing : Boeing
http://dbpedia.org/resource/Iran : Iran
http://dbpedia.org/resource/Tehran : Tehran

Generic Terms:
Person : Dalai Lama
Person : Mr Zhu
Country : Tibet
City : Washington
Person : Mr Obama
Person : Zhu Weiqun
Technology : aerospace
Person : Obama
Company : BBC
GeographicFeature : Himalayan
Person : Ma Zhaoxu
Person : Paul Reynolds
Person : Kasur Lodi Gyarit

Combined OpenCalais / Zemanta + dbpedia lookup system:
=================================================================
Linked Data:
http://dbpedia.org/resource/Communist_Party_of_China : Chinese Communist Party
http://dbpedia.org/resource/Barack_Obama : Barack Obama
http://dbpedia.org/resource/United_States : United States
http://dbpedia.org/resource/People%27s_Republic_of_China : China
http://dbpedia.org/resource/Washington%2C_D.C. : DC, Washington DC
http://dbpedia.org/resource/China : Sino
http://dbpedia.org/resource/President_of_the_United_States : US President
http://dbpedia.org/resource/Dalai_Lama : Dalai Lama
http://dbpedia.org/resource/Republic_of_China : Taiwan
http://dbpedia.org/resource/Arms_industry : arms sales
http://dbpedia.org/resource/Official : Official
http://dbpedia.org/resource/Ma_Zhaoxu : Ma Zhaoxu
http://dbpedia.org/resource/Paul_Reynolds : Paul Reynolds
http://dbpedia.org/resource/Internet_censorship : Internet censorship
http://dbpedia.org/resource/Tibet : Tibet
http://dbpedia.org/resource/United_Front_Work_Department : United Front Work Department
http://dbpedia.org/resource/Web_search_engine : search engine
http://dbpedia.org/resource/Zhu_Weiqun : Zhu Weiqun
http://dbpedia.org/resource/Communist_party : Communist party
http://dbpedia.org/resource/Himalayan : Himalayan
http://dbpedia.org/resource/Boeing : Boeing
http://dbpedia.org/resource/Washington : Washington
http://dbpedia.org/resource/Ministry_of_Foreign_Affairs_of_the_People%27s_Republic_of_China : Chinese Foreign Ministry
http://dbpedia.org/resource/Beijing : Beijing
http://dbpedia.org/resource/Spiritual_leader : Spiritual leader
http://dbpedia.org/resource/Correspondent : Correspondent
http://dbpedia.org/resource/Tehran : Tehran
http://dbpedia.org/resource/BBC : BBC
http://dbpedia.org/resource/President : President
http://dbpedia.org/resource/Spokesman : Spokesman
http://dbpedia.org/resource/White_House : White House
http://dbpedia.org/resource/GBP_%28disambiguation%29 : GBP
http://dbpedia.org/resource/United_States_dollar : USD
http://dbpedia.org/resource/Islamic_Republic_of_Iran : Islamic Republic of Iran
http://dbpedia.org/resource/Dalai_Lama_Renaissance : Dalai Lama Renaissance
http://dbpedia.org/resource/14th_Dalai_Lama : 14th Dalai Lama
http://dbpedia.org/resource/Lhasa : Lhasa
http://dbpedia.org/resource/Central_Tibetan_Administration : Politics of Tibet
http://dbpedia.org/resource/Buddhism : Buddhism

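
The two result sets write some of the same DBpedia URIs differently (for example People's_Republic_of_China versus People%27s_Republic_of_China), so a side-by-side count needs the URIs normalized first. What follows is only a rough Python sketch of such a comparison, not code from either system: the function names normalize and compare are made up for illustration, and the two lists hold just a handful of the TEST 2 URIs rather than the full results.

```python
from urllib.parse import unquote


def normalize(uri: str) -> str:
    """Decode percent-escapes so that e.g. %27 and ' compare equal."""
    return unquote(uri.strip())


def compare(a, b):
    """Return (shared, only_a, only_b) as sets of normalized URIs."""
    na = {normalize(u) for u in a}
    nb = {normalize(u) for u in b}
    return na & nb, na - nb, nb - na


# A small sample of the TEST 2 "Linked Data" URIs (not the full lists).
alchemy = [
    "http://dbpedia.org/resource/People's_Republic_of_China",
    "http://dbpedia.org/resource/Washington,_D.C.",
    "http://dbpedia.org/resource/Google",
    "http://dbpedia.org/resource/Boeing",
]
combined = [
    "http://dbpedia.org/resource/People%27s_Republic_of_China",
    "http://dbpedia.org/resource/Washington%2C_D.C.",
    "http://dbpedia.org/resource/Boeing",
    "http://dbpedia.org/resource/Dalai_Lama",
]

shared, only_alchemy, only_combined = compare(alchemy, combined)
print("shared:", sorted(shared))
print("alchemy only:", sorted(only_alchemy))
print("combined only:", sorted(only_combined))
```

Decoding the escapes only reconciles spelling differences of the same URI. Where the two systems picked genuinely different resource names, as with the two Chinese Foreign Ministry URIs, the entries still count as distinct.
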
=================================================================

I'm quite sure the results speak for themselves, and I'm glad that so much useful information can already be extracted from text.

It is worth noting that the combined system takes between 4 and 7 seconds (with cache), so it's definitely lacking in that respect!

Regards,

Nathan
Received on Tuesday, 2 February 2010 16:37:13 UTC