- From: William Waites <ww@styx.org>
- Date: Mon, 20 Sep 2010 14:44:04 +0100
- To: Antoine Isaac <aisaac@few.vu.nl>
- CC: Toby Inkster <tai@g5n.co.uk>, Ian Davis <lists@iandavis.com>, semantic-web@w3.org, public-lod@w3.org, Jacco van Ossenbruggen <Jacco.van.Ossenbruggen@cwi.nl>, Mark van Assem <mark@cs.vu.nl>
- Message-ID: <4C976524.7000107@styx.org>
On 10-09-20 12:45, Antoine Isaac wrote:

> Very interesting! I'm curious though: what's the application scenario
> that made you create this version?

(Hopefully this is closely enough related that my reply below isn't a non-sequitur.)

I worked on a toy NLP bot that might expose some "real" uses for representing natural language in RDF [0]. The basic premise was to allow users to describe bibliographic data (works and authors and such) in simple natural language sentences and have it output RDF (FRBR-esque) [1]. (This was motivated partly by the fact that I am terrible at user interface design and had a very hard time making a web interface that allowed users to enter data with anything other than a very simple structure.)

One vocabulary that I missed while doing this is something to represent parts of speech and grammatical syntax in natural language. I invented something ad hoc, but it might be useful to have a more completely thought-out way to do this. You can see some examples in the first link.

> How do you make the distinction between the two situations--I mean,
> based on which elements in the Wordnet data?

The approach that I took -- and keep in mind this was a toy; I have doubts about the scalability of doing things this way -- was to (1) parse the natural language sentence into an annotated syntax tree as an intermediate form (represented in RDF) and then (2) run specially crafted N3 inference rules over it to generate the desired output. The inference rules encode the semantic relationships between concepts existing in (or across) sentences. I mostly worked with inference rules that hinged on the main verb of the sentence (which also happens to be the top of the syntax tree). (A rough sketch of what I mean is appended after my signature.)

In principle, a complete enough set of such inference rules (most likely restricted to a particular domain of discourse; a truly general set would be very hard, if it is possible at all) would resolve the ambiguity. In the case that makes sense there would be useful entailments; in the case that doesn't, there wouldn't. I saw this kind of resolution of syntactic ambiguity happen a couple of times. Resolution of homonyms might work similarly.

I'm not so sure that creating a class hierarchy based on orthographical accident makes sense. Where the words do have a common conceptual root, certainly. But in the "crack" example I don't think so. They are (probably) completely different concepts that just happen to be denoted by the same string. I might be wrong, but I don't think that Wordnet contains enough information to make this choice.

Cheers,
-w

[0] http://blog.okfn.org/2010/08/09/cataloguing-bibliographic-data-with-natural-language-and-rdf/
[1] http://pastebin.ca/1913826

--
William Waites <ww@styx.org>
Mob: +44 789 798 9965
Fax: +44 131 464 4948
CD70 0498 8AE4 36EA 1CD7 281C 427A 3F36 2130 E9F5
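P.S. For the curious, here is a minimal sketch of the kind of intermediate form and rule I mean, in cwm-style N3. The nl: vocabulary and the example terms are made up for this message (the ad hoc vocabulary actually used in [0] is different); this illustrates the shape of the approach, not the implementation itself.

    @prefix dct:  <http://purl.org/dc/terms/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix frbr: <http://purl.org/vocab/frbr/core#> .
    @prefix nl:   <http://example.org/nl#> .   # hypothetical parts-of-speech vocabulary
    @prefix :     <http://example.org/data#> .

    # Step (1): the sentence "Dante wrote the Inferno", parsed into an
    # annotated syntax tree with the main verb at the top.
    :s1 a nl:Sentence ;
        nl:mainVerb :v1 .
    :v1 a nl:Verb ;
        nl:lemma   "write" ;
        nl:subject :np1 ;
        nl:object  :np2 .
    :np1 a nl:NounPhrase ; nl:head "Dante" .
    :np2 a nl:NounPhrase ; nl:head "Inferno" .

    # Step (2): an inference rule hinged on the main verb.  A sentence
    # whose main verb has lemma "write" entails a FRBR-ish description:
    # the object names a Work, the subject names its creator.
    { ?s a nl:Sentence ; nl:mainVerb ?v .
      ?v nl:lemma   "write" ;
         nl:subject [ nl:head ?who ] ;
         nl:object  [ nl:head ?what ] . }
    =>
    { [ a frbr:Work ;
        rdfs:label  ?what ;
        dct:creator [ a frbr:Person ; rdfs:label ?who ] ] } .

Running a reasoner such as cwm with --think over this would produce the entailed frbr:Work description; a parse that matches no rule simply yields no entailments, which is the disambiguation behaviour I described above.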
Received on Monday, 20 September 2010 13:45:55 UTC