Re: ISSUE-72: Provenance Data Category

Hi Dave,

That sounds good.
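
As an illustration of the ordering rule proposed in the quoted text below, here is a minimal Python sketch of a processor that always inserts a new its:provenanceRecord as the first child of its:provenanceRecords, so the most recently added record is listed first. It is only a sketch under stated assumptions: the helper name add_provenance_record and the attribute values are hypothetical, and the org and revOrg attribute names are used purely as placeholders.

```python
import xml.etree.ElementTree as ET

# ITS namespace; the "its" prefix is registered only for readable serialization.
ITS_NS = "http://www.w3.org/2005/11/its"
ET.register_namespace("its", ITS_NS)


def add_provenance_record(records, **attrs):
    """Hypothetical helper: insert a new record as the FIRST child,
    so document order runs from most recently added to oldest."""
    record = ET.Element(f"{{{ITS_NS}}}provenanceRecord", attrs)
    records.insert(0, record)
    return record


records = ET.Element(f"{{{ITS_NS}}}provenanceRecords")

# Added first, so it ends up last (oldest). Attribute values are made up.
add_provenance_record(records, org="example-lsp")
# Added second, so it ends up first (most recent).
add_provenance_record(records, revOrg="example-reviewer")

order = [dict(r.attrib) for r in records]
print(ET.tostring(records, encoding="unicode"))
```

Note that the list order here records only when each record was added to the document, not when the translation or revision it describes actually took place.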

Thanks

On Thu, Jan 24, 2013 at 12:41 AM, Dave Lewis <dave.lewis@cs.tcd.ie> wrote:

>  Hi Chase,
> Thanks for getting back to us on this.
>
> In relation to ordering of its:provenanceRecord I propose therefore to add
> the following sentence to the provenance section, after we introduce this
> element:
>
> "The order of its:provenanceRecord elements within an its:provenanceRecords
> element should reflect the order in which they were added to the
> document, with the most recently added one listed first."
>
> Can you signal whether you are happy with this?
>
> Then, given your comments on the time annotation issue below, I think
> I will be able to close this issue.
>
> Thanks again for this comment,
> Regards,
> Dave
>
>
> On 23/01/2013 18:17, Chase Tingley wrote:
>
> Hi Dave & Pablo,
>
>  Thanks for the responses.  Comments inline
>
> On Tue, Jan 22, 2013 at 5:39 PM, Dave Lewis <dave.lewis@cs.tcd.ie> wrote:
>
>>  Hi Chase, Kevin, all,
>> First, thanks to Pablo for his response. Some further responses inline
>> below related to timing:
>>
>> On 15/01/2013 17:33, Pablo Nieto Caride wrote:
>>
>>  Hi Felix, all,
>>
>>
>>
>>
>>
>> >ii) Similarly, does the ordering of provenance records within a
>> <provenanceRecords> element make a statement about the (temporal) order in
>> which the records were created?  If an ordering is implied, it raises
>> questions about the implied ordering in a document where provenance records
>> are declared both globally and via local markup.
>>
>>
>>
>> Certainly the spec does not talk about temporal order, but given that
>> records cannot be declared both globally and via local markup for a single
>> element, the way I see it, and to simplify things, each provenance record
>> should be older than the previous one.
>>
>>
>> I think the best we can do is offer best practice advice that the order
>> in which multiple its:provenanceRecord elements are listed in an
>> its:provenanceRecords element should reflect the order in which they were
>> added to the document, rather than the order in which the translation
>> (revision) actually happened.
>>
>> Pablo, could you confirm that you intend the oldest one to be listed
>> last?
>>
>> I don't think we can mandate that the order of the records indicates the
>> order in which the activities they describe (translation or translation
>> revision) were performed. This information may not be available to the
>> processor adding the annotation. For example, a TMS may add this annotation
>> after receiving translation revisions from two different translators, both
>> for multiple elements but without per-element timing information, so it
>> wouldn't know the order in which the actual revisions were performed.
>> Alternatively their timings may be known for different elements, but they
>> overlap in time, so there wouldn't be an obvious order for the records.
>>
>
>  I think this makes sense.  It's more important to me that the overall
> semantics be clear than that the ordering work one way or another.  Just
> the knowledge that, for example, provenance records are more like a list
> than a bag is an important detail.
>
>>
>>
>> >iii) More generally, we observe that provenance records lack a date/time
>> attribute, which makes their semantics as a form of history somewhat muddy.
>>  In practice, a single tool/agent may edit a single document multiple times
>> in succession over an arbitrary period of time.  Should these multiple
>> "sessions" be represented by a single logical provenance record?  Or is it
>> the intention of the spec that the agent add a provenance record for each
>> of these sessions in which a modification is made to the document?
>>
>>
>>
>> As I said in the previous point, any modification of the content should
>> add a new provenance record; at least that is what I had in mind.
>>
>>  The original requirements for the provenance data category were primarily
>> intended to identify and differentiate the _agents_ involved in
>> translating, or revising translations of, different parts of a document.
>> It's not clear what would be the best form of timing information: should
>> it be the period over which the agents conducted the translation
>> (revision), or the instant in time at which they completed it? As
>> indicated above, even just determining the ordering, let alone the
>> absolute timing of the activity, can be complicated, and would require
>> collection of this information to be pushed downstream to CAT tools that
>> aren't otherwise ITS aware. This might present an implementation barrier
>> if correct timing was mandated.
>>
>
>  Yes, you're right that this gets very messy when you consider
> aggregating provenance data from multiple agents that may have been
> processing in parallel.  The main point I wanted to clarify was that the
> purpose of the data category was to identify agents as opposed to
> "processing events".  I think this is enough for now.
>
>  Thanks!
>
>
>
>

Received on Friday, 25 January 2013 19:38:31 UTC