Provenance challenge


At the end of the call today we discussed provenance as a key kind of
metadata we would like to access through the metadata part of an LSID
(or ARK or whatever). I said I would circulate the call for the ongoing
Provenance Challenge; see below.

As I do this I wonder if we should have an Identity Challenge.

We would produce a couple of data sets that simulate real data sets and
some application scenarios that reflect changes in the databases,
accessing the data, moving the data set etc., and then try out LSID, ARK
and whatever identity scheme you fancy. Then we would have some concrete
comparison of capability, cost of take-on, various metadata vs data
scenarios, versioning schemes etc. Just a thought.

(now on vacation with the rest of Europe...)

Call for Participation: First Provenance Challenge

Provenance is a critical concept in scientific workflows, since it 
allows scientists to understand the origin of their results, to repeat 
their experiments, and to validate the processes that were used to 
derive data products.  During a discussion on provenance standardization 
at the International Provenance and Annotation Workshop (IPAW'06), the
community decided that it needs to understand the different
representations used for provenance, their common aspects, and
the reasons for their differences.  As a result, the community agreed that
a Provenance Challenge should be set to compare and understand existing
approaches.

Participants of the challenge are presented with an example experiment 
workflow design, using publicly available tools, and data.  They are 
free to implement this workflow as they prefer, e.g. using their own 
workflow enactment system, as batch scripts etc.  There is then a
series of queries about the provenance of the experiment results, for
which each participant shows how they would answer the queries using 
their system.

The challenge will conclude with a workshop held at Global Grid Forum 18 
(GGF/OGF-18) in Washington DC in September, where we will discuss the 
results of the challenge and compare approaches.  Details of GGF-18 are 
available at

Full details on the challenge, and how to participate, are available at:

Any questions about the challenge are welcome. Please send them to


Simon Miles, Luc Moreau, Mike Wilde, Ian Foster, Yong Zhao

Universities of Southampton and Chicago

Received on Monday, 31 July 2006 18:53:07 UTC