
Re: ISSUE-72: Provenance Data Category - same for locQualityIssue?

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Sat, 26 Jan 2013 12:50:46 +0000
Message-ID: <5103D126.4020505@cs.tcd.ie>
To: Chase Tingley <chase@spartansoftwareinc.com>
CC: public-multilingualweb-lt-comments@w3.org, public-multilingualweb-lt@w3.org, kevin@spartanconsultinginc.com
Thanks Chase.

A logical follow-on question for locQualityIssue implementors (as this is
the other data category with stand-off markup containing multiple elements):
should the order of locQualityIssue elements within a
locQualityIssues stand-off element reflect the order in which they were
added, in the same way?

i.e. after the definition of locQualityIssues we add the text:
"The order of its:locQualityIssue elements within an its:locQualityIssues 
element should reflect the order in which they were added to the 
document, with the most recently added one listed first."
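
To make the intended behaviour concrete, here is a minimal sketch of how a processor might follow this ordering when adding an issue. It is only an illustration under stated assumptions, not proposed spec text: the helper name add_issue, the id "lq1" and the sample issue values are hypothetical, and it uses Python's standard xml.etree.ElementTree rather than any particular ITS tool.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("its", ITS_NS)


def add_issue(issues, issue_type, comment):
    """Insert a new its:locQualityIssue as the FIRST child of `issues`.

    Hypothetical helper: prepending keeps the most recently added
    issue listed first, as the proposed sentence asks.
    """
    issue = ET.Element(
        f"{{{ITS_NS}}}locQualityIssue",
        {
            "locQualityIssueType": issue_type,
            "locQualityIssueComment": comment,
        },
    )
    issues.insert(0, issue)  # index 0 places it ahead of older issues
    return issue


# Stand-off container; the id value is made up for this example.
issues = ET.Element(f"{{{ITS_NS}}}locQualityIssues", {f"{{{XML_NS}}}id": "lq1"})

add_issue(issues, "misspelling", "first added")
add_issue(issues, "typographical", "second added")

# Document order is now newest first.
comments = [child.get("locQualityIssueComment") for child in issues]
serialized = ET.tostring(issues, encoding="unicode")
```

Note that the order here records when each issue was added to the document, not when the underlying problem was found, which the adding processor may not know.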

Phil, guys?


On 25/01/2013 19:37, Chase Tingley wrote:
> Hi Dave,
> That sounds good.
> Thanks
> On Thu, Jan 24, 2013 at 12:41 AM, Dave Lewis <dave.lewis@cs.tcd.ie 
> <mailto:dave.lewis@cs.tcd.ie>> wrote:
>     Hi Chase,
>     Thanks for getting back to us on this.
>     In relation to ordering of its:provenanceRecord I propose
>     therefore to add the following sentence to the provenance section,
>     after we introduce this element:
>     "The order of its:provenanceRecord elements within an
>     its:provenanceRecords element should reflect the order in which
>     they were added to the document, with the most recently added one
>     listed first."
>     Can you signal whether you are happy with this?
>     Then, given your comments on the time annotation issue below,
>     I think I will be able to close this issue.
>     thanks again for this comment,
>     Regards,
>     Dave
>     On 23/01/2013 18:17, Chase Tingley wrote:
>>     Hi Dave & Pablo,
>>     Thanks for the responses.  Comments inline
>>     On Tue, Jan 22, 2013 at 5:39 PM, Dave Lewis <dave.lewis@cs.tcd.ie
>>     <mailto:dave.lewis@cs.tcd.ie>> wrote:
>>         Hi Chase, Kevin, all,
>>         First thanks to Pablo for his response. Some further
>>         responses inline below related to timing:
>>         On 15/01/2013 17:33, Pablo Nieto Caride wrote:
>>>         Hi Felix, all,
>>>         >ii) Similarly, does the ordering of provenance records within a
>>>         <provenanceRecords> element make a statement about the
>>>         (temporal) order in which the records were created?  If an
>>>         ordering is implied, it raises questions about the implied
>>>         ordering in a document where provenance records are declared
>>>         both globally and via local markup.
>>>         Certainly the spec does not talk about temporal order, but
>>>         given that records cannot be declared both globally and via
>>>         local markup for a single element, the way I see it, and to
>>>         simplify things, each provenance record should be older than
>>>         the previous one.
>>         I think the best we can do is offer best practice advice that,
>>         when more than one its:provenanceRecord is listed in an
>>         its:provenanceRecords element, their order should reflect the
>>         order in which they were added to the document rather than the
>>         order in which the translation (revision) actually happened.
>>         Pablo, could you confirm that you intend the oldest one to be
>>         listed last?
>>         I don't think we can mandate that the order indicates the
>>         order in which the activity indicated in the record
>>         (translation or translation revision) was performed. This
>>         information may not be available to the processor adding the
>>         annotation. For example, a TMS may add this annotation after
>>         receiving translation revisions from two different
>>         translators, each covering multiple elements but without
>>         per-element timing information, so it wouldn't know the order
>>         in which the actual revisions were performed. Alternatively,
>>         the timings may be known for different elements but
>>         overlap in time, so there wouldn't be an obvious order for
>>         the records.
>>     I think this makes sense.  It's more important to me that the
>>     overall semantics be clear than that the ordering work one way or
>>     another.  Just the knowledge that, for example, provenance
>>     records are more like a list than a bag is an important detail.
>>>         >iii) More generally, we observe that provenance records lack a
>>>         date/time attribute, which makes their semantics as a form
>>>         of history somewhat muddy.  In practice, a single tool/agent
>>>         may edit a single document multiple times in succession over
>>>         an arbitrary period of time.  Should these multiple
>>>         "sessions" be represented by a single logical provenance
>>>         record?  Or is it the intention of the spec that the agent
>>>         add a provenance record for each of these sessions in which
>>>         a modification is made to the document?
>>>         As I said in the previous point, any modification of the
>>>         content should add a new provenance record; at least, that is
>>>         what I had in mind.
>>         The original requirements for the provenance data category
>>         were primarily intended to identify and differentiate the
>>         _agents_ involved in translating, or revising translations of,
>>         different parts of a document. It's not clear what would be
>>         the best form of timing information: should it be the period
>>         over which the agents conducted the translation (revision), or
>>         the instant in time at which they completed it? As indicated
>>         above, even just determining the ordering, let alone the
>>         absolute timing of the activity, can be complicated, and
>>         would require collection of this information to be pushed
>>         downstream to CAT tools that aren't otherwise ITS aware. This
>>         might present an implementation barrier if correct timing were
>>         mandated.
>>     Yes, you're right that this gets very messy when you consider
>>     aggregating provenance data from multiple agents that may have
>>     been processing in parallel.  The main point I wanted to clarify
>>     was that the purpose of the data category was to identify agents
>>     as opposed to "processing events".  I think this is enough for now.
>>     Thanks!
Received on Saturday, 26 January 2013 12:51:17 UTC
