- From: Margaret Warren <mm@zeroexp.com>
- Date: Fri, 17 Jan 2014 16:59:41 -0500
- To: <public-lod@w3.org>
-----Original Message-----
From: Margaret Warren [mailto:mm@zeroexp.com]
Sent: Friday, January 17, 2014 4:59 PM
To: 'Hugh Glaser'
Subject: RE: General tuning for Dbpedia Spotlight

Hi Hugh,

You can try out various word combinations at: http://www.imagesnippets.com

If you register an account, all you have to do is select one of the sample images (so you don't have to upload any), go to the 'Description' tab, type in any text and push the auto-entity extraction button. We return matching entities from DBpedia based on how Michael Brunnbauer has it configured; we use a combination of DBpedia Spotlight and TextRazor.

Bottom line: feel free to type in lots of word combinations. They don't have to match the image, and you don't have to 'do' anything with the responses when they come back; just type in more text and try again.

Ultimately, if you want to create triples in the triple-editor window, you can. You are not restricted to our properties: just type in any properly formatted URI (or add the namespace with the namespace button). Once you create the triples, you can copy and paste them out of the 'View HTML/RDFa' button.

We don't have a way for users to tweak parameters here, but you can certainly tweak certain word combinations and see what is returned. I think we have a per-day limit right now with TextRazor before I need to pay for it, but we haven't come close to reaching that limit yet.

Best,
Margaret Warren

-----Original Message-----
From: hugh.glaser@seme4.com [mailto:hugh.glaser@seme4.com] On Behalf Of Hugh Glaser
Sent: Friday, January 17, 2014 11:26 AM
To: public-lod community
Subject: Re: General tuning for Dbpedia Spotlight

Thank you for the responses, both on- and off-list.

So I see perhaps I should recast my question, with maybe wider scope.

I have a load of abstract-style text fragments: perhaps 100 words each, on a wide variety of topics, although there is a bit of a technical bent. I want to be able to do linkage between them and to other things, based around our lovely Linked Data world. That is, have lots of triples something like

    :docIDn :some-pred :conceptURI

It would be a bonus to know which words in the text triggered the generation of the triple. Of course, the system doesn't actually have to generate the triples; I can build them if I get sufficiently sensible output, including the sort of HTML output that Spotlight produces.

And because it goes automatically to users, I need quite high precision, even if recall suffers (I think that is the terminology). Oh, and ideally free, although not necessarily. My current preference is for DBpedia or Freebase URIs, but WordNet is probably OK too.

I think there must be people who have done this (a lot). Or at least there should be. There are certainly quite a lot of systems that can do it, some playing more or less well with Linked Data URIs.

I think my problem (apart from laziness) is that the systems I look at seem to want me to care about what they do, or at least engage with tuning and things, which means I need some understanding of what they do, which I don't have (and I probably don't care either :-) ).

So, does anyone (else) feel they can point me at a system for doing this that I can just use out of the box (possibly having been told some parameters to use)?

Of course, maybe I am just asking too much of the technology at the moment, but I can hope!

Best
Hugh

--
Hugh Glaser
20 Portchester Rise
Eastleigh SO50 4QS
Mobile: +44 75 9533 4155, Home: +44 23 8061 5652
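
For anyone wanting to picture the `:docIDn :some-pred :conceptURI` output Hugh describes, here is a minimal Python sketch, not anything taken from imagesnippets.com or from Spotlight itself. It assumes the annotator returns JSON shaped roughly like DBpedia Spotlight's (a "Resources" list with "@URI", "@surfaceForm" and "@similarityScore" keys); those key names, the `mentions` predicate, the `min_score` cutoff and the sample data are all assumptions made for illustration. It keeps only high-scoring annotations, which trades recall for precision, and records which words triggered each triple.

```python
# Sketch only: turn entity-linking annotations into N-Triples lines.
# The input shape imitates DBpedia Spotlight's JSON output; the key names
# ("Resources", "@URI", "@surfaceForm", "@similarityScore") are assumptions.

MENTIONS = "http://example.org/vocab/mentions"  # hypothetical predicate


def annotations_to_triples(doc_uri, response, min_score=0.9):
    """Return (triple, surface_form) pairs for annotations scoring >= min_score."""
    results = []
    seen = set()  # emit each concept URI at most once per document
    for res in response.get("Resources", []):
        score = float(res.get("@similarityScore", 0.0))
        uri = res["@URI"]
        if score < min_score or uri in seen:
            continue  # favour precision: drop weak or repeated matches
        seen.add(uri)
        triple = "<{}> <{}> <{}> .".format(doc_uri, MENTIONS, uri)
        results.append((triple, res.get("@surfaceForm", "")))
    return results


# Invented sample data standing in for a real annotator response.
sample = {
    "Resources": [
        {"@URI": "http://dbpedia.org/resource/Linked_data",
         "@surfaceForm": "Linked Data", "@similarityScore": "0.99"},
        {"@URI": "http://dbpedia.org/resource/Bent",
         "@surfaceForm": "bent", "@similarityScore": "0.41"},
    ]
}

for triple, words in annotations_to_triples("http://example.org/doc/1", sample):
    print(triple, "# triggered by:", words)
```

Raising `min_score` is the one knob in this sketch: higher values give fewer but safer links, which matches the "high precision, even if recall suffers" requirement.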
Received on Friday, 17 January 2014 22:00:06 UTC