Re: gap analysis

Thanks Paul for this proposal for the gap analysis.
Twice you mention 'exposing' and I thought we could introduce 'querying' 
provenance too.

Also, maybe the gaps could be structured as content vs. APIs, like this:


Content:
- No common standard for expressing provenance information that captures 
processes as well as the other content dimensions.
- No guidance for how existing standards can be put together to provide 
provenance (e.g. linking to identity).

APIs (or protocols):
- No common API for obtaining/querying provenance information.
- No guidance for how application developers should go about exposing 
provenance in their web systems.
- No well-defined standard for linking provenance between sites (i.e. 
trackback but for the whole web).


I also wondered whether they should be structured according to the 
provenance dimensions (so instead of API, break this into 
Use/Management).

Luc



On 08/02/2010 12:04 PM, Paul Groth wrote:
> Hi All,
>
> As discussed at last week's telecon, I came up with some ideas about 
> the gaps necessary to realize the News Aggregator Scenario. I've put 
> these in the wiki and I append them below to help start the 
> discussion. Let me know what you think.
>
> Gap Analysis - News Aggregator
>
> For each step within the News Aggregator scenario, there are existing 
> technologies or relevant research that could solve that step. For 
> example, one can properly insert licensing information into a photo 
> using a Creative Commons license and the Extensible Metadata Platform. 
> One can track the origin of tweets either through retweets or using 
> some extraction technologies within Twitter. However, the problem is 
> that across multiple sites there is no common format and API to access 
> and understand provenance information, whether it is explicitly or 
> implicitly determined. To inquire about retweets or inquire about 
> trackbacks one needs to use different APIs and understand different 
> formats. Furthermore, there is no (widely deployed) mechanism to point 
> to provenance information on another site. For example, once a tweet 
> has been traced as far back as Twitter itself allows, there is no way 
> to follow where that tweet came from beyond Twitter.
>
> Systems largely do not document the software by which changes were 
> made to data and what those pieces of software did to data. However, 
> there are existing technologies that allow this to be done. For 
> example, in a domain specific setting, XMP allows the transformations 
> of images to be documented. More general formats such as OPM and PML 
> allow this to be expressed but are not currently widely deployed.
>
> Finally, while many sites provide for identity and there are several 
> widely deployed standards for identity (e.g. OpenID), there are no existing 
> mechanisms for tying identity to objects or provenance traces. This 
> directly ties to the attribution of objects and provenance.
>
> Summing up, there are four gaps to realizing the News Aggregator 
> scenario:
>
> - No common standard to target for exposing and expressing provenance 
> information that captures processes as well as the other content 
> dimensions.
> - No well-defined standard for linking provenance between sites (i.e. 
> trackback but for the whole web).
> - No guidance for how existing standards can be put together to 
> provide provenance (e.g. linking to identity).
> - No guidance for how application developers should go about exposing 
> provenance in their web systems.
>
>
>

-- 
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm

Received on Monday, 2 August 2010 11:35:22 UTC