- From: Paul Groth <pgroth@gmail.com>
- Date: Mon, 07 Dec 2009 13:07:21 +0100
- To: Olaf Hartig <hartig@informatik.hu-berlin.de>
- CC: public-xg-prov@w3.org
- Message-ID: <4B1CEFF9.5050304@gmail.com>
Hi,

I would agree with the definition of "provenance as the process that yielded an artifact". We've (PASOA project) used a similar definition in the past: "the provenance of a result is the process that led to that result".

I agree that retrieval is overloaded. So why don't we stick with Data Access under Process?

What do you think?

Paul

Olaf Hartig wrote:
> Hey Paul,
>
> On Monday 07 December 2009 08:55:31 Paul Groth wrote:
>
>> Hi Olaf,
>>
>> So I agree with you that access time is another time. But I think it's
>> part of what I'll call the access process.
>> [...]
>> It may be a particularly important process but it's a process nonetheless.
>> If we were to add a dimension I would therefore put it under process.
>>
>
> Okay, I see how data access can be understood as a specific kind of process.
> On the other hand, many people seem to understand "process" as something
> during which things are created. For instance, in our wiki it says:
> "provenance as the process used to create a new artifact". Similarly, the OPM
> document defines process as an "Action or series of actions performed on or
> caused by artifacts, and resulting in new artifacts." Both notions of process
> even stress that the things that are created are new. This is clearly not
> the case for a data item that is retrieved from the Web during a data access
> process. Hence, putting data access under the Process dimension
> requires a broader understanding of "process". For this reason, I propose to
> adjust the wiki entry to "provenance as the process that yielded an artifact."
>
>
>> Also I think the name "Data Access" maybe should be changed because we
>> already have an "Access" under the heading management.
>>
>
> Any suggestions? The only thing that comes to my mind is "Retrieval" which
> could easily be confused with information retrieval and, thus, is not a good
> name.
>
> Greetings,
> Olaf
>
>
>> Regards,
>> Paul
>>
>> Olaf Hartig wrote:
>>
>>> Hey Paul,
>>>
>>> On Friday 04 December 2009 17:42:34 you wrote:
>>>
>>>> Hi Olaf,
>>>>
>>>> It seems to me that the generation time of information is part of the
>>>> process (e.g. b was generated from a version of x that was created at
>>>> 10:13). Thus, I think it belongs under the process dimension.
>>>>
>>> I agree: the generation time (or creation time as I called it in the
>>> timeliness use case) belongs to the process dimension.
>>>
>>> However, the use case mentions another time: the access time. Both b and
>>> c were created by using x, and before using x it had to be retrieved from
>>> the Web. The use case demonstrates that information about the access time
>>> might be relevant for timeliness assessment (due to missing information
>>> about the creation time of x in the case of Carol's data creation). The
>>> question is to which of the dimensions in the Content category the
>>> access time belongs. I think it doesn't fit in any of the proposed
>>> dimensions. Instead, I suggest adding another dimension, called "Data
>>> Access", here. This dimension comprises all kinds of information about
>>> the access of data items on the Web. This includes not only access time
>>> but, for instance, information about which server has been accessed as
>>> well as the provider/operator of the server. Such information might also be
>>> relevant in other information quality assessment scenarios, not just
>>> timeliness. For instance, in the other use case discussed today - simple
>>> trustworthiness: here we have Alice providing a data publishing server.
>>> Someone may decide not to trust any data accessed from this server
>>> because he/she thinks Alice is not trustworthy and may have manipulated
>>> Bob's and Carol's data provided by her server. And again, it's not just
>>> about the access of the assessed data itself but also about the access of
>>> source data as the timeliness use case illustrates.
>>>
>>> Greetings,
>>> Olaf
>>>
Received on Monday, 7 December 2009 12:07:57 UTC