- From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
- Date: Wed, 11 Nov 2015 21:30:49 +0100
- To: public-ld4lt@w3.org, "public-bpmlod@w3.org" <public-bpmlod@w3.org>
- Message-ID: <5643A579.7090402@cit-ec.uni-bielefeld.de>
Dear all,

the LIDER project has developed guidelines describing how to discover and exploit language resources published as Linked Open Data. The guidelines are summarized here:

https://www.w3.org/community/bpmlod/wiki/LLD_Exploitation

As a motivating example, imagine a company developing sentiment analysis and opinion mining software that has a working system for English and wants to port it to German as well. The company wants to find a corpus that is annotated at the sentiment level and extract a first seed lexicon of German subjective expressions with their polarity (positive, negative, neutral).

How could Linked Data support them in finding and exploiting a German sentiment lexicon easily? According to our guidelines, they would perform the following steps:

* Search and discovery: The company would enter the query "sentiment corpus German" into LingHub and reach the following page: http://linghub.lider-project.eu/search/?property=&query=sentiment+corpus+german. They would get two results. Clicking, for instance, on the usage review dataset, they would reach the following page: http://linghub.lider-project.eu/datahub/usage-review-corpus#Nedfa753871df4052a5e6074d9389e901

* Licensing: They would check the license http://opendatacommons.org/licenses/by/1.0/ and see that it is compatible with their purposes.

* Distribution: The company would understand from the metadata page of the usage review dataset that a download is available at http://data.lider-project.eu/usage/usage.nt.gz and that a SPARQL endpoint is available at http://data.lider-project.eu/usage/sparql

* Extraction: Using the following SPARQL query

    SELECT ?string ?polarity WHERE {
      ?phrase <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#anchorOf> ?string ;
              <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#lang> <http://www.lexvo.org/page/iso639-3/deu> ;
              <http://www.gsi.dit.upm.es/ontologies/marl/ns#hasPolarity> ?polarity .
    }

  they could easily extract a list of German subjective phrases together with their polarity:

  http://data.lider-project.eu/usage/sparql?query=SELECT+%3Fstring+%3Fpolarity+WHERE+{%3Fphrase+%3Chttp%3A%2F%2Fpersistence.uni-leipzig.org%2Fnlp2rdf%2Fontologies%2Fnif-core%23anchorOf%3E+%3Fstring+%3B%0D%0A%3Chttp%3A%2F%2Fpersistence.uni-leipzig.org%2Fnlp2rdf%2Fontologies%2Fnif-core%23lang%3E+%3Chttp%3A%2F%2Fwww.lexvo.org%2Fpage%2Fiso639-3%2Fdeu%3E+%3B%0D%0A%3Chttp%3A%2F%2Fwww.gsi.dit.upm.es%2Fontologies%2Fmarl%2Fns%23hasPolarity%3E+%3Fpolarity+.}

The company could then easily integrate these results into their workflow. Most importantly, they would accomplish this using only open, non-proprietary technologies and web standards, building on linked data principles.

Any feedback on the guideline document is more than welcome!

Kind regards,

Philipp Cimiano

--
Prof. Dr. Philipp Cimiano
AG Semantic Computing
Exzellenzcluster für Cognitive Interaction Technology (CITEC)
Universität Bielefeld

Tel: +49 521 106 12249
Fax: +49 521 106 6560
Mail: cimiano@cit-ec.uni-bielefeld.de

Office CITEC-2.307
Universitätsstr. 21-25
33615 Bielefeld, NRW
Germany
Received on Wednesday, 11 November 2015 20:31:21 UTC
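
As a rough sketch of the extraction step described in the message, the query could also be sent to the endpoint from a short Python script rather than through a hand-encoded URL. This assumes the endpoint at http://data.lider-project.eu/usage/sparql is still reachable and accepts the standard `query` parameter and the JSON results format of the SPARQL protocol, which the message itself does not confirm. The helper names `build_query_url`, `parse_bindings`, and `fetch_lexicon` are illustrative only and are not part of the LIDER guidelines.

```python
import json
import urllib.parse
import urllib.request

# Endpoint and query are taken from the message; availability is not guaranteed.
ENDPOINT = "http://data.lider-project.eu/usage/sparql"

QUERY = """
SELECT ?string ?polarity WHERE {
  ?phrase <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#anchorOf> ?string ;
          <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#lang> <http://www.lexvo.org/page/iso639-3/deu> ;
          <http://www.gsi.dit.upm.es/ontologies/marl/ns#hasPolarity> ?polarity .
}
"""


def build_query_url(endpoint, query):
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query.strip()})


def parse_bindings(results_json):
    """Turn a SPARQL JSON results document into (phrase, polarity) pairs."""
    data = json.loads(results_json)
    return [
        (row["string"]["value"], row["polarity"]["value"])
        for row in data["results"]["bindings"]
    ]


def fetch_lexicon(endpoint=ENDPOINT, query=QUERY):
    """Run the query over HTTP (assumes the endpoint can return JSON results)."""
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

Calling `fetch_lexicon()` would return a list of (phrase, polarity) pairs that could serve as the seed lexicon, provided the endpoint responds as assumed. Building the URL with `urlencode` avoids the manual percent-encoding seen in the long link in the message.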