Re: where does access time belong in the provenance dimension?

Hi,

I would agree with the definition of "provenance as the process that 
yielded an artifact". We've (PASOA project) used a similar definition in 
the past: "the provenance of a result is the process that led to that 
result".

I agree that retrieval is overloaded. So why don't we stick with Data 
Access under Process?

What do you think?
Paul

Olaf Hartig wrote:
> Hey Paul,
>
> On Monday 07 December 2009 08:55:31 Paul Groth wrote:
>    
>> Hi Olaf,
>>
>> So I agree with you that access time is another time. But I think it's
>> part of what I'll call the access process.
>> [...]
>> It may be a particularly important process, but it's a process nonetheless.
>> If we were to add a dimension I would therefore put it under process.
>>      
>
> Okay, I see how data access can be understood as a specific kind of process.
> On the other hand, many people seem to understand "process" as something
> during which things are created. For instance, in our wiki it says:
> "provenance as the process used to create a new artifact". Similarly, the OPM
> document defines process as an "Action or series of actions performed on or
> caused by artifacts, and resulting in new artifacts." Both notions of process
> even stress that the things created are new. This is clearly not the case
> for a data item that is retrieved from the Web during a data access
> process. Hence, putting data access under the Process dimension requires a
> broader understanding of "process". For this reason, I propose to
> adjust the wiki entry to "provenance as the process that yielded an artifact."
>
>    
>> Also, I think the name "Data Access" should perhaps be changed because we
>> already have an "Access" under the heading management.
>>      
>
> Any suggestions? The only thing that comes to my mind is "Retrieval" which
> could easily be confused with information retrieval and, thus, is not a good
> name.
>
> Greetings,
> Olaf
>
>    
>> Regards,
>> Paul
>>
>> Olaf Hartig wrote:
>>      
>>> Hey Paul,
>>>
>>> On Friday 04 December 2009 17:42:34 you wrote:
>>>        
>>>> Hi Olaf,
>>>>
>>>> It seems to me that the generation time of information is part of the
>>>> process (e.g.  b was generated from a version of x that was created at
>>>> 10:13) Thus, I think it belongs under the process dimension.
>>>>          
>>> I agree: the generation time (or creation time as I called it in the
>>> timeliness use case) belongs to the process dimension.
>>>
>>> However, the use case mentions another time: the access time. Both b and
>>> c were created by using x, and before x could be used it had to be
>>> retrieved from the Web. The use case demonstrates that information about
>>> the access time might be relevant for timeliness assessment (due to
>>> missing information about the creation time of x in the case of Carol's
>>> data creation). The question is to which of the dimensions in the Content
>>> category the access time belongs. I think it doesn't fit into any of the
>>> proposed dimensions. Instead, I suggest adding another dimension, called
>>> "Data Access", here. This dimension comprises all kinds of information
>>> about the access of data items on the Web. This includes not only access
>>> time but also, for instance, information about which server was accessed
>>> as well as the provider/operator of the server. Such information might
>>> also be relevant in other information quality assessment scenarios, not
>>> just timeliness. For instance, in the other use case discussed today - simple
>>> trustworthiness: here we have Alice providing a data publishing server.
>>> Someone may decide not to trust any data accessed from this server
>>> because he/she thinks Alice is not trustworthy and may have manipulated
>>> Bob's and Carol's data provided by her server. And again, it's not just
>>> about the access of the assessed data itself but also about the access of
>>> source data as the timeliness use case illustrates.
>>>
>>> Greetings,
>>> Olaf
>>>        
>
>    

Received on Monday, 7 December 2009 12:07:57 UTC