
NIF-based NLP web services

From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
Date: Tue, 27 Oct 2015 09:59:02 +0100
To: public-ld4lt@w3.org, "public-bpmlod@w3.org" <public-bpmlod@w3.org>, Benjamin Siemoneit <benjamin.siemoneit@yahoo.de>
Message-ID: <562F3CD6.60307@cit-ec.uni-bielefeld.de>

Dear LD4LT and BPMLOD communities,

the LIDER project has been developing guidelines for the implementation 
of NLP web services as RESTful NIF-based services that receive their 
input and return their output in the Natural Language Processing 
Interchange Format (NIF).

As a proof of concept, we have implemented the Stanford POS Tagger and 
the Stanford Parser as NIF-based services.

Assume you have the following example.ttl file describing a sentence in 
NIF (this document "describes" the sentence "This is a sample sentence"):

@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25>
    a nif:Context , nif:RFC5147String , nif:Sentence ;
    nif:isString "This is a sample sentence"^^xsd:string .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4>
    a nif:RFC5147String , nif:Word ;
    nif:anchorOf "This"^^xsd:string ;
    nif:beginIndex "0"^^xsd:int ;
    nif:endIndex "4"^^xsd:int ;
    nif:nextWord <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
    nif:sentence <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
    nif:referenceContext <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7>
    a nif:RFC5147String , nif:Word ;
    nif:anchorOf "is"^^xsd:string ;
    nif:beginIndex "5"^^xsd:int ;
    nif:endIndex "7"^^xsd:int ;
    nif:nextWord <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ;
    nif:previousWord <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4> ;
    nif:sentence <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
    nif:referenceContext <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9>
    a nif:RFC5147String , nif:Word ;
    nif:anchorOf "a"^^xsd:string ;
    nif:beginIndex "8"^^xsd:int ;
    nif:endIndex "9"^^xsd:int ;
    nif:nextWord <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ;
    nif:previousWord <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
    nif:sentence <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
    nif:referenceContext <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16>
    a nif:RFC5147String , nif:Word ;
    nif:anchorOf "sample"^^xsd:string ;
    nif:beginIndex "10"^^xsd:int ;
    nif:endIndex "16"^^xsd:int ;
    nif:nextWord <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=17,25> ;
    nif:previousWord <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ;
    nif:sentence <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
    nif:referenceContext <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=17,25>
    a nif:RFC5147String , nif:Word ;
    nif:anchorOf "sentence"^^xsd:string ;
    nif:beginIndex "17"^^xsd:int ;
    nif:endIndex "25"^^xsd:int ;
    nif:previousWord <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ;
    nif:sentence <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
    nif:referenceContext <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .
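
To make the offsets concrete: each #char=b,e fragment addresses the
characters from position b (inclusive) to position e (exclusive) of the
nif:isString value, matching nif:beginIndex and nif:endIndex. The
following minimal shell sketch checks this for the example sentence. The
helper name anchor_of is ours and not part of NIF, and the use of cut
assumes plain ASCII text.

```shell
# Sentence described by example.ttl (the nif:isString of the nif:Context).
text="This is a sample sentence"

# anchor_of BEGIN END: print the substring addressed by "#char=BEGIN,END".
# NIF offsets start at 0 and END is exclusive, whereas cut counts from 1
# and includes the last index, hence BEGIN + 1 on the left-hand side.
anchor_of() {
  printf '%s\n' "$text" | cut -c "$(( $1 + 1 ))-$2"
}

anchor_of 0 4     # the word typed as <...#char=0,4>
anchor_of 17 25   # the word typed as <...#char=17,25>
```

The printed substrings should agree with the nif:anchorOf values of the
corresponding nif:Word resources.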


Using standard HTTP requests, one can invoke the Stanford POS Tagger 
with curl:

curl -G \
  http://sc-lider.techfak.uni-bielefeld.de/NifStanfordPOSTaggerWebService/NifStanfordPOSTagger \
  -d v=true --data-urlencode i="$(<example.ttl)"

The result is a NIF-based Turtle file with the POS tags added (via the 
property nif:posTag).

Selecting "v=false" will return only the additional triples:

curl -G \
  http://sc-lider.techfak.uni-bielefeld.de/NifStanfordPOSTaggerWebService/NifStanfordPOSTagger \
  -d v=false --data-urlencode i="$(<example.ttl)"

Assuming that the result is stored in a file input.ttl, we can now 
invoke the Stanford parser as follows:

curl -G \
  http://sc-lider.techfak.uni-bielefeld.de/NifStanfordParserWebService/NifStanfordParser \
  -d v=true --data-urlencode i="$(<input.ttl)"

As a result we would receive a NIF-based Turtle file in which the 
dependency relations have been added (via the property nif:dependency).
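
The two steps above can be wrapped in a small shell sketch. The helper
names nif_call and tag_then_parse are hypothetical and not part of the
services. The -f, -s and -S options are ordinary curl flags added here
so that an HTTP error stops the pipeline instead of being passed on as
input. We also assume that the parser needs the full tagged document,
hence v=true in the first step.

```shell
# Host taken from the curl commands in this message.
BASE=http://sc-lider.techfak.uni-bielefeld.de

# nif_call SERVICE_PATH V FILE: send FILE as parameter i to a service
# and print the response. V is passed through as the v parameter.
nif_call() {
  curl -sS -f -G "$BASE/$1" -d "v=$2" --data-urlencode "i=$(cat "$3")"
}

# tag_then_parse FILE: POS-tag FILE into input.ttl, then parse input.ttl.
# The second call only runs if the first one succeeded.
tag_then_parse() {
  nif_call NifStanfordPOSTaggerWebService/NifStanfordPOSTagger true "$1" > input.ttl &&
  nif_call NifStanfordParserWebService/NifStanfordParser true input.ttl
}
```

For example, "tag_then_parse example.ttl > parsed.ttl" would leave the
tagged document in input.ttl and the parsed one in parsed.ttl (parsed.ttl
is just an illustrative file name).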

This shows nicely how an original document can be enriched incrementally 
and monotonically: each service only adds annotations and never removes 
existing ones.

Finally, it is easy to chain the services using standard HTTP and curl 
as follows:

curl -G \
  http://sc-lider.techfak.uni-bielefeld.de/NifStanfordPOSTaggerWebService/NifStanfordPOSTagger \
  -d v=false --data-urlencode i="$(<example.ttl)"
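
One way such a chain might look, without an intermediate file, is
sketched below. It is a sketch under assumptions, not a definitive
recipe: the function name nif_chain is hypothetical, and we assume the
tagger is called with v=true so that the parser receives the complete
document rather than only the added triples.

```shell
# Service URLs taken from the curl commands in this message.
BASE=http://sc-lider.techfak.uni-bielefeld.de
TAGGER=$BASE/NifStanfordPOSTaggerWebService/NifStanfordPOSTagger
PARSER=$BASE/NifStanfordParserWebService/NifStanfordParser

# nif_chain FILE: POS-tag FILE, then pass the tagger's response directly
# to the parser as its i parameter. Stops if the tagger call fails.
nif_chain() {
  tagged=$(curl -sS -f -G "$TAGGER" -d v=true --data-urlencode "i=$(cat "$1")") || return
  curl -sS -f -G "$PARSER" -d v=true --data-urlencode "i=$tagged"
}
```

Calling "nif_chain example.ttl" would then print the parser's response
for the tagged document.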

The guidelines for the implementation of such services can be found here:

http://bpmlod.github.io/report/NIF-based-NLP-WebServices/index.html

We welcome feedback on these guidelines!

Kind regards,

Philipp.

-- 
Prof. Dr. Philipp Cimiano
AG Semantic Computing
Exzellenzcluster für Cognitive Interaction Technology (CITEC)
Universität Bielefeld

Tel: +49 521 106 12249
Fax: +49 521 106 6560
Mail: cimiano@cit-ec.uni-bielefeld.de

Office CITEC-2.307
Universitätsstr. 21-25
33615 Bielefeld, NRW
Germany
Received on Tuesday, 27 October 2015 08:59:34 UTC
