RE: NIF-based NLP web services

Hello, Philipp!

There is an H2020 project, FREME, that makes active use of NIF. We use NIF as the communication format among our e-services to ensure interoperability. Felix is the project coordinator and can provide you with detailed information.

With best wishes,
Tatjana

From: Philipp Cimiano [mailto:cimiano@cit-ec.uni-bielefeld.de]
Sent: Tuesday, 27 October 2015 10:59
To: public-ld4lt@w3.org; public-bpmlod@w3.org; Benjamin Siemoneit <benjamin.siemoneit@yahoo.de>
Subject: NIF-based NLP web services

Dear LD4LT and BPMLOD communities,

The LIDER project has been developing guidelines for implementing NLP web services as RESTful, NIF-based services, that is, services that receive their input and return their output in the Natural Language Processing Interchange Format (NIF).

As a proof of concept, we have implemented the Stanford POS Tagger and the Stanford Parser as NIF-based services.

Assume you have the following example.ttl file describing a sentence in NIF (this document "describes" the sentence "This is a sample sentence"):



@prefix nif:   <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25>
    a             nif:Context , nif:RFC5147String , nif:Sentence ;
    nif:isString  "This is a sample sentence"^^xsd:string .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4>
    a                     nif:RFC5147String , nif:Word ;
    nif:anchorOf          "This"^^xsd:string ;
    nif:beginIndex        "0"^^xsd:int ;
    nif:endIndex          "4"^^xsd:int ;
    nif:nextWord          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
    nif:sentence          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
    nif:referenceContext  <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7>
    a                     nif:RFC5147String , nif:Word ;
    nif:anchorOf          "is"^^xsd:string ;
    nif:beginIndex        "5"^^xsd:int ;
    nif:endIndex          "7"^^xsd:int ;
    nif:nextWord          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ;
    nif:previousWord      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4> ;
    nif:sentence          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
    nif:referenceContext  <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9>
    a                     nif:RFC5147String , nif:Word ;
    nif:anchorOf          "a"^^xsd:string ;
    nif:beginIndex        "8"^^xsd:int ;
    nif:endIndex          "9"^^xsd:int ;
    nif:nextWord          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ;
    nif:previousWord      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
    nif:sentence          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
    nif:referenceContext  <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16>
    a                     nif:RFC5147String , nif:Word ;
    nif:anchorOf          "sample"^^xsd:string ;
    nif:beginIndex        "10"^^xsd:int ;
    nif:endIndex          "16"^^xsd:int ;
    nif:nextWord          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=17,25> ;
    nif:previousWord      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ;
    nif:sentence          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
    nif:referenceContext  <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=17,25>
    a                     nif:RFC5147String , nif:Word ;
    nif:anchorOf          "sentence"^^xsd:string ;
    nif:beginIndex        "17"^^xsd:int ;
    nif:endIndex          "25"^^xsd:int ;
    nif:previousWord      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ;
    nif:sentence          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
    nif:referenceContext  <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .
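
The fragment identifiers in this example use character offsets with a zero-based begin and an exclusive end, so char=0,4 addresses "This" and char=0,25 the whole sentence. A minimal sketch for checking such offsets against the sentence, assuming plain ASCII text (some cut implementations count bytes rather than characters); the function name anchor_of is made up for this illustration and is not part of NIF or of the services.

```shell
#!/bin/sh
# The sentence from nif:isString in the example above.
text='This is a sample sentence'

# anchor_of BEGIN END
# Print the substring addressed by a zero-based BEGIN and an exclusive END.
# cut -c is one-based and inclusive, hence BEGIN+1 as the start column.
anchor_of() {
  printf '%s\n' "$text" | cut -c "$(( $1 + 1 ))-$2"
}

anchor_of 0 4     # This
anchor_of 5 7     # is
anchor_of 8 9     # a
anchor_of 10 16   # sample
anchor_of 17 25   # sentence
echo "${#text}"   # 25, the end offset of the nif:Context
```

Each printed word should equal the nif:anchorOf value of the resource with the same offsets.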

Using standard HTTP requests, one can invoke the Stanford POS Tagger with curl:

curl -G http://sc-lider.techfak.uni-bielefeld.de/NifStanfordPOSTaggerWebService/NifStanfordPOSTagger -d v=true --data-urlencode i="$(<example.ttl)"

The result is a NIF-based Turtle file with the POS tags added (via the property nif:posTag).

Setting "v=false" returns only the additional triples:

curl -G http://sc-lider.techfak.uni-bielefeld.de/NifStanfordPOSTaggerWebService/NifStanfordPOSTagger -d v=false --data-urlencode i="$(<example.ttl)"

Assuming that the result is stored in a file input.ttl, we can now invoke the Stanford Parser as follows:

curl -G http://sc-lider.techfak.uni-bielefeld.de/NifStanfordParserWebService/NifStanfordParser -d v=true --data-urlencode i="$(<input.ttl)"

The result is a NIF-based Turtle file to which the dependency relations have been added (via the property nif:dependency).

This shows nicely how one can enrich an original document incrementally and monotonically with additional annotations.
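
The two calls can also be wrapped in a small script that keeps the intermediate result in a file and stops if the first service fails. This is only a sketch under assumptions: it presumes both endpoints are reachable and accept the "i" and "v" parameters as in the commands in this mail, and the names nif_call, enrich, tagged.ttl and parsed.ttl are invented for the illustration. It uses curl's "name@file" form of --data-urlencode, which reads the value from a file and therefore does not depend on the bash-specific "$(<file)" syntax.

```shell
#!/bin/sh
# Endpoints as given in this mail.
BASE=http://sc-lider.techfak.uni-bielefeld.de
TAGGER=$BASE/NifStanfordPOSTaggerWebService/NifStanfordPOSTagger
PARSER=$BASE/NifStanfordParserWebService/NifStanfordParser

# nif_call ENDPOINT INFILE OUTFILE  (hypothetical helper)
# Send INFILE as the URL-encoded "i" parameter of a GET request and
# write the response to OUTFILE. --fail makes curl exit with a
# non-zero status on HTTP errors instead of saving an error page.
nif_call() {
  curl --fail --silent --show-error -G "$1" \
       -d v=true --data-urlencode "i@$2" -o "$3"
}

# enrich INFILE  (hypothetical helper)
# Tag first, then parse the tagged document; the parser is only
# called if the tagger call succeeded.
enrich() {
  nif_call "$TAGGER" "$1" tagged.ttl &&
  nif_call "$PARSER" tagged.ttl parsed.ttl
}
```

Calling "enrich example.ttl" would then leave the POS-tagged document in tagged.ttl and the parsed document in parsed.ttl.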

Finally, it is easy to chain the services using standard HTTP and curl as follows:

curl -G http://sc-lider.techfak.uni-bielefeld.de/NifStanfordParserWebService/NifStanfordParser -d v=true --data-urlencode i="$(curl -G http://sc-lider.techfak.uni-bielefeld.de/NifStanfordPOSTaggerWebService/NifStanfordPOSTagger -d v=true --data-urlencode i="$(<example.ttl)")"

The guidelines for the implementation of such services can be found here:

http://bpmlod.github.io/report/NIF-based-NLP-WebServices/index.html


We welcome feedback on these guidelines!

Kind regards,

Philipp.


--

Prof. Dr. Philipp Cimiano

AG Semantic Computing

Exzellenzcluster für Cognitive Interaction Technology (CITEC)

Universität Bielefeld



Tel: +49 521 106 12249

Fax: +49 521 106 6560

Mail: cimiano@cit-ec.uni-bielefeld.de



Office CITEC-2.307

Universitätsstr. 21-25

33615 Bielefeld, NRW

Germany

Received on Tuesday, 27 October 2015 13:04:40 UTC