Call for Participation: BigPROV @EDBT 2013: Workshop on Managing and Querying Provenance Data at Scale

[Apologies for cross-posting, but please help disseminate!]

               * CALL FOR PARTICIPATION *

BIGPROV'13: International Workshop on Managing and Querying Provenance Data at Scale
     In conjunction with EDBT/ICDT 2013:
              * March 22nd, 2013, Genova, Italy *

 Confirmed Keynote speakers

- Dr. Joerg Kraenzlein, MD, PhD -- Medical Director, Big Data, iSOFT Health GmbH, a CSC company
- Dr. Grigoris Karvounarakis, LogicBlox, USA

 BigProv Research Papers

- Paraskevi Zerva, Steffen Zschaler and Simon Miles. Towards Design Support for Provenance Awareness: A Classification of Provenance
- Pinar Alper, Khalid Belhajjame, Carole A. Goble and Pinar Karagoz. Enhancing and Abstracting Scientific Workflow Provenance for
Data Publishing
- Devarshi Ghoshal and Beth Plale. Provenance from Log Files: a BigData Problem
- Alessandro Spinuso, James Cheney and Malcolm Atkinson. Provenance for seismological processing pipelines in a distributed
streaming workflow
- Flavio Costa, Victor Sousa, Daniel de Oliveira, Kary Ocana, Eduardo Ogasawara, Jonas Dias and Marta Mattoso. Capturing and
Querying Workflow Runtime Provenance with PROV: a Practical Approach   
- Edoardo Pignotti, Gary Polhill and Peter Edwards. Using Provenance to Analyse Agent-based Simulations
- Bernd Amann, Camelia Constantin, Clement Caron and Patrick Giroux. WebLab PROV: Computing fine-grained provenance links for XML

 ProvBench (Short Presentations)

- Fernando Chirigati, David Koop, Juliana Freire, Claudio Silva, VisTrails Provenance Traces for Benchmarking
- Luiz M. R. Gadelha Jr., Michael Wilde, Marta Mattoso, Ian Foster, Provenance Traces of the Swift Parallel Scripting System
- Paolo Missier and Ziyu Chen. Extracting PROV provenance traces from Wikipedia history pages
- Edoardo Pignotti, Gary Polhill, Peter Edwards, PROV-O Provenance Traces From Agent-based Social Simulation
- Ashish Gehani, Dawood Tariq, Cross-Platform Provenance
- Felipe Horta, Vitor Silva, Flavio Costa, Daniel de Oliveira, Kary Ocana, Eduardo Ogasawara, Jonas Dias, Marta Mattoso, Provenance
Traces from Chiron Parallel Workflow Engine
- Heiko Mueller, Chris Peters, Yanfeng Shu, Andrew Terhorst, Provenance in Streamflow Forecasting
- Khalid Belhajjame, Jun Zhao, Daniel Garijo, Aleix Garrido, Stian Soiland-Reyes and Pinar Alper. The Taverna and Wings Workflow

Provenance data is poised to become pervasive in key areas of information management, ranging from traditional areas of science
(e.g., life sciences, earth sciences, astronomy) to new applications enabled by the Web (e.g., social sciences, social
network analysis, quality and trust in Web publishing).

As the volume of provenance metadata increases with the volume of the underlying data whose history it describes, new challenges for
managing and querying provenance at scale emerge: provenance data is growing in both "count" and "complexity". It is growing
in count because of the very large number of provenance traces (one for each Twitter message, for example), and in complexity in the
case of provenance graphs that are generated from provenance-enabled programming environments (e.g., scientific workflow systems)
and middleware. Data-intensive science is bound to produce provenance that rates high on both counts.

At the same time, emerging standards such as PROV, the W3C recommendation for provenance modelling and Web-based access, suggest
that provenance data will increasingly be encoded using Semantic Web technology. This in turn suggests that provenance data will
soon form a natural extension of, and seamlessly blend with, the growing Linked Data Cloud. 

The new Managing and Querying Provenance Data at Scale workshop (BigProv) stems from these premises. We are interested in exploring
the system and modelling challenges associated with collecting, storing, querying, and exploiting large volumes of possibly complex
provenance data. We seek to map the state of the art, elicit new research problems, and learn about existing systems. More
specifically, the workshop scope includes the following topics:

- Automated capture of provenance at multiple layers (system, middleware, applications)
- Database models, languages, and systems for storing and querying large-scale provenance
- Provenance and Linked Open Data (LOD): seamless representation and query models
- Comparison and performance benchmarking of different data architectures and query models for provenance
- Analysis of existing graph query models and systems for provenance graphs
- Reference datasets for provenance benchmarking
- System descriptions and demonstrations of large-scale provenance and graph data
- Uniform querying over heterogeneous provenance traces
- Abstraction models for provenance and their applications to user presentation, visualization, and privacy preservation

Accepted papers are included in the ACM DL as well as in the official EDBT workshop proceedings.

 Workshop Organizers

Bertram Ludaescher, UC Davis, CA
Paolo Missier, Newcastle University, UK

Proceedings chair:  Victor Cuevas, University of New Mexico and UC Davis, USA

-----------  ~oo~  --------------
Paolo Missier
School of Computing Science, Newcastle University,  UK
PGP Public key: 0x45596549

Received on Tuesday, 29 January 2013 17:49:13 UTC