Release of sar-graph 1.0

Apologies for cross-posting
Please forward this message to colleagues in the areas of interest

    RELEASE OF sar-graph 1.0

The following resource will be released on META-SHARE and is already available as a pre-release at

A sar-graph is a graph containing linguistic knowledge at the syntactic and lexical-semantic levels for a given language and target relation. A sar-graph for a target relation assembles the many linguistic patterns that are used in texts to mention this relation. The term "semantically associated relations graph" was chosen since the patterns may express the target relation either directly or by expressing a semantically associated relation. The nodes in a sar-graph are either semantic arguments of the target relation or content words (more exactly, their word senses) needed to express or recognize an instance of the target relation. The nodes are connected by two kinds of edges: syntactic dependency relations and lexical semantic relations. Accordingly, the edges are labelled either with dependency tags provided by a parser or with lexical-semantic relation tags. A definition can be found in (Uszkoreit and Xu, 2013). The individual patterns are assembled into one graph per target relation so that mentions gathered across sentences can be combined more easily, but all patterns can also be employed individually.
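
The description above can be made concrete with a minimal data-model sketch. It is not the released format or API: the class names, the relation name "marriage", the tags "nsubj", "dobj" and "synonym", and the example words are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field
from enum import Enum


class VertexKind(Enum):
    ARGUMENT = "argument"      # semantic argument of the target relation
    WORD_SENSE = "word_sense"  # content word (more exactly, its word sense)


class EdgeKind(Enum):
    DEPENDENCY = "dependency"              # syntactic dependency relation
    LEXICAL_SEMANTIC = "lexical_semantic"  # lexical semantic relation


@dataclass(frozen=True)
class Vertex:
    kind: VertexKind
    label: str


@dataclass(frozen=True)
class Edge:
    source: Vertex
    target: Vertex
    kind: EdgeKind
    label: str  # parser tag or lexical-semantic relation tag


@dataclass
class SarGraph:
    language: str
    relation: str
    edges: set = field(default_factory=set)

    def vertices(self):
        """Collect every vertex that occurs at either end of an edge."""
        return {v for e in self.edges for v in (e.source, e.target)}


# Hypothetical example content for an assumed "marriage" relation.
person_1 = Vertex(VertexKind.ARGUMENT, "Person_1")
person_2 = Vertex(VertexKind.ARGUMENT, "Person_2")
marry = Vertex(VertexKind.WORD_SENSE, "marry")
wed = Vertex(VertexKind.WORD_SENSE, "wed")

graph = SarGraph(language="en", relation="marriage")
graph.edges.add(Edge(marry, person_1, EdgeKind.DEPENDENCY, "nsubj"))
graph.edges.add(Edge(marry, person_2, EdgeKind.DEPENDENCY, "dobj"))
graph.edges.add(Edge(marry, wed, EdgeKind.LEXICAL_SEMANTIC, "synonym"))
```

Keeping the edge kind separate from the edge label lets one graph hold both parser tags and lexical-semantic tags without the two tag sets colliding.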

From Strings to Things  
SAR-Graphs: A New Type of Resource for Connecting Knowledge and Language
Hans Uszkoreit and Feiyu Xu (2013) 
In Proceedings of the 1st International Workshop on NLP and DBpedia (NLP&DBpedia), CEUR Workshop Proceedings, volume 1064, Sydney, NSW, Australia, October 2013

The current sar-graph version 1.0 contains syntactic dependency relations between content words. In future versions, we will integrate lexical semantic relations between word senses.  

In the current version, the patterns have been learned automatically by Web-DARE (Krause et al., 2012), the web-scale version of the relation extraction system DARE (Xu et al., 2007), from dependency structures obtained by parsing sentential mentions of the target relation. DARE patterns contain the content words that signal the mentioned (semantically associated) relation and the syntactic dependencies that combine these words and link them with the phrases representing the arguments of the target relation. Thus, a sar-graph is composed of syntactic dependency graphs. Their edges denote dependency relations, and each edge is labeled with the tag the parser has assigned to the dependency. Vertices come in two flavors. Vertices of the first type denote regular nodes in a dependency structure and are labeled with a word. Vertices of the second type represent the slots for the arguments of the target relation; instead of a word, they are labeled with the name of the argument, e.g. Person_1. Several dependency parsers have been employed, but the current set of sar-graphs is built from the parsing results of the MALT parser.
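
As a rough sketch of how individual dependency patterns could be combined into one graph, the following treats each pattern as a set of (head, tag, dependent) triples and unites them, so that vertices with the same label coincide. This is an assumption made for illustration, not the Web-DARE procedure; the three patterns, the argument names other than Person_1, and the dependency tags are invented and are not actual MALT parser output.

```python
from collections import defaultdict

# Names of the argument slots of an assumed target relation.
ARGUMENTS = {"Person_1", "Person_2", "Date"}

# Hypothetical patterns: sets of (head, dependency_tag, dependent) triples.
pattern_a = {("marry", "nsubj", "Person_1"), ("marry", "dobj", "Person_2")}
pattern_b = {("marry", "nsubj", "Person_1"), ("marry", "prep_in", "Date")}
pattern_c = {("wife", "poss", "Person_1"), ("wife", "appos", "Person_2")}


def assemble(patterns):
    """Union the edges of all patterns; equal labels denote the same vertex."""
    graph = defaultdict(set)  # head -> {(tag, dependent), ...}
    for pattern in patterns:
        for head, tag, dependent in pattern:
            graph[head].add((tag, dependent))
    return graph


def vertices(graph):
    """Return all vertex labels, heads as well as dependents."""
    found = set(graph)
    for targets in graph.values():
        found.update(dependent for _, dependent in targets)
    return found


sar_graph = assemble([pattern_a, pattern_b, pattern_c])
slots = {v for v in vertices(sar_graph) if v in ARGUMENTS}   # argument vertices
words = vertices(sar_graph) - slots                          # word vertices
```

In this sketch the edge shared by pattern_a and pattern_b is stored only once, which is the sense in which the assembled graph is more compact than the list of patterns.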

Applications of sar-graphs include information extraction, question answering and summarisation.
The resource might also be useful for research on paraphrases, textual entailment and syntactic variation within a language.
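
To illustrate the information extraction use case, here is a simplified sketch of matching one pattern against the dependency parse of a sentence in order to bind the argument slots. The matching strategy (brute-force search over slot bindings), the pattern, the tags and the example sentences are all assumptions for illustration; the release itself may work differently.

```python
from itertools import permutations

# Hypothetical pattern with two argument slots.
ARGUMENT_SLOTS = ("Person_1", "Person_2")
pattern = {("marry", "nsubj", "Person_1"), ("marry", "dobj", "Person_2")}

# Hypothetical parses, with lemmas as node labels.
married = {("marry", "nsubj", "Alice"), ("marry", "dobj", "Bob")}  # "Alice married Bob"
met = {("meet", "nsubj", "Alice"), ("meet", "dobj", "Bob")}        # "Alice met Bob"


def match(pattern, sentence, slots):
    """Return a slot-to-token binding under which every pattern edge
    occurs in the sentence parse, or None if no such binding exists."""
    tokens = sorted({node for head, _, dep in sentence for node in (head, dep)})
    for candidate in permutations(tokens, len(slots)):
        binding = dict(zip(slots, candidate))
        # Replace slot names by the candidate tokens; words stay unchanged.
        instantiated = {
            (binding.get(head, head), tag, binding.get(dep, dep))
            for head, tag, dep in pattern
        }
        if instantiated <= sentence:
            return binding
    return None
```

Under these assumptions, `match(pattern, married, ARGUMENT_SLOTS)` binds Person_1 to "Alice" and Person_2 to "Bob", while the parse of "Alice met Bob" yields no match because the signalling word "marry" is absent.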

Release 1.0 has the following properties:

Language: English 
Number of target relations: 25
Arity of relations: n-ary relations (2≤n≤5)
Domains of relations: biographic information, corporations, awards
Format of patterns: DARE patterns in lemon format and in a specific XML schema (DTD provided)
Format of sar-graphs: specific XML schema (DTD provided)
APIs: Java API for reading and storing patterns and sar-graphs,
 Java API for various use cases: retrieving and searching vertex and edge information of a DARE pattern or a sar-graph,
 Java API for pattern visualization

Download is available at:
More statistics are available at:
More references can be found at:
Feedback via email:

sar-graphs were conceived and defined at DFKI LT-Lab Berlin and then realized in 
a collaboration between DFKI LT-Lab and the BabelNet group at Sapienza University of Rome.

The development of sar-graphs is partially supported by 
 • the German Federal Ministry of Education and Research (BMBF) through the project Deependance (contract 01IW11003)
 • the project LUcKY, a Google Focused Research Award in the area of Natural Language Understanding. 


Feiyu Xu

Dr. Feiyu Xu

Senior Researcher
Project Leader

DFKI  Projektbüro Berlin
Alt Moabit 91c
D-10559 Berlin
Phone +49-30-23895-1812
Sek      +49-30-23895-1800
Fax      +49-30-23895-1810




Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Registered office: Trippstadter Strasse 122, D-67663 Kaiserslautern
Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Chairman)
Dr. Walter Olthoff

Chairman of the Supervisory Board:
Prof. Dr. h.c. Hans A. Aukes

District Court (Amtsgericht) Kaiserslautern, HRB 2313


Received on Tuesday, 15 July 2014 15:20:22 UTC