
Re: ISSUE-72: Provenance Data Category

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Wed, 23 Jan 2013 01:39:44 +0000
Message-ID: <50FF3F60.3040405@cs.tcd.ie>
To: Pablo Nieto Caride <pablo.nieto@linguaserve.com>
CC: Felix Sasaki <fsasaki@w3.org>, public-multilingualweb-lt-comments@w3.org, public-multilingualweb-lt@w3.org, kevin@spartanconsultinginc.com, chase@spartanconsultinginc.com

Hi Chase, Kevin, all,

First, thanks to Pablo for his response. Some further responses on
timing are inline below:

On 15/01/2013 17:33, Pablo Nieto Caride wrote:
> Hi Felix, all,
>
> > ii) Similarly, does the ordering of provenance records within a
> > <provenanceRecords> element make a statement about the (temporal)
> > order in which the records were created?  If an ordering is implied,
> > it raises questions about the implied ordering in a document where
> > provenance records are declared both globally and via local markup.
>
> Certainly the spec does not talk about temporal order, but given that 
> records cannot be declared both globally and via local markup for a 
> single element, the way I see it, and to simplify things, each 
> provenance record should be older than the previous one.

I think the best we can do is offer best-practice advice: when more than 
one its:provenanceRecord is listed in an its:provenanceRecords element, 
their order should reflect the order in which they were added to the 
document rather than the order in which the translation (or revision) 
actually happened.

Pablo, could you confirm that you intend the oldest one to be listed last?

I don't think we can mandate that the order of the records indicates the 
order in which the activities they describe (translation or translation 
revision) were performed. This information may not be available to the 
processor adding the annotation. For example, a TMS may add this 
annotation after receiving translation revisions from two different 
translators, each covering multiple elements but without per-element 
timing information, so it wouldn't know the order in which the actual 
revisions were performed. Alternatively, the timings may be known for 
different elements but overlap in time, so there wouldn't be an obvious 
order for the records.
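
To make the overlap problem concrete, here is a minimal Python sketch. 
The Activity type and the definitely_before function are hypothetical 
names used only for illustration; nothing like them is defined in ITS. 
The function reports an order only when both intervals are fully known 
and do not overlap, and otherwise returns None.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class Activity(NamedTuple):
    """Hypothetical view of one translation or revision activity."""
    agent: str
    start: Optional[datetime]  # None when the timing was not captured
    end: Optional[datetime]


def definitely_before(a: Activity, b: Activity) -> Optional[bool]:
    """Return True if a finished before b began, False if b finished
    before a began, and None when no order can be established."""
    if None in (a.start, a.end, b.start, b.end):
        return None  # incomplete timing, e.g. a TMS without per-element data
    if a.end <= b.start:
        return True
    if b.end <= a.start:
        return False
    return None  # the two intervals overlap


t1 = Activity("translator-1", datetime(2013, 1, 14, 9), datetime(2013, 1, 14, 12))
t2 = Activity("translator-2", datetime(2013, 1, 14, 11), datetime(2013, 1, 14, 15))
t3 = Activity("translator-3", datetime(2013, 1, 15, 9), datetime(2013, 1, 15, 10))
unknown = Activity("translator-4", None, None)
```

Here t1 and t2 overlap, so neither can be said to come first, and the 
record with unknown timing cannot be ordered against anything. Only t3 
has a well-defined position after both t1 and t2.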

> > iii) More generally, we observe that provenance records lack a
> > date/time attribute, which makes their semantics as a form of history
> > somewhat muddy.  In practice, a single tool/agent may edit a single
> > document multiple times in succession over an arbitrary period of
> > time.  Should these multiple "sessions" be represented by a single
> > logical provenance record?  Or is it the intention of the spec that
> > the agent add a provenance record for each of these sessions in which
> > a modification is made to the document?
>
> As I said in the previous point any modification of the content should 
> add a new provenance record, at least is what I had in mind.

The original requirements for the provenance data category were primarily 
intended to identify and differentiate the _agents_ involved in 
translating, or revising translations of, different parts of a document. 
It's not clear what the best form of timing information would be: should 
it be the period over which the agents conducted the translation (or 
revision), or the instant in time at which they completed it? As 
indicated above, even just determining the ordering, let alone the 
absolute timing of the activity, can be complicated, and would require 
collection of this information to be pushed downstream to CAT tools that 
aren't otherwise ITS aware. This might present an implementation barrier 
if correct timing were mandated.

You are right that the term 'provenance' typically implies some timing 
information. However, many of these issues have already been addressed 
by another W3C WG that has produced a dedicated provenance logging 
model, see: http://www.w3.org/2011/prov/wiki/Main_Page

This model accommodates period and instant timing information in a form 
that can be formally reasoned over. It also accommodates situations where 
knowledge of the timing of activities is incomplete. Rather than attempt 
to replicate all these features in ITS, we instead include the option to 
point to an external provenance data record (provRef) and recommend that 
implementors use the W3C provenance specification for such external 
records when collection of timing information is required. We will 
recommend use of this provenance spec for external records as best 
practice at a later date.
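
As a rough illustration of how a processor might combine the two ideas 
(records listed in the order they were added, with timing left to an 
external record), here is a minimal Python sketch. The helper 
add_provenance_record, the example.com URLs and the agent names are all 
made up for illustration. The sketch also assumes the most recently 
added record goes first, which is still to be confirmed, and it assumes 
unprefixed attribute names on its:provenanceRecord.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
ET.register_namespace("its", ITS_NS)


def add_provenance_record(records: ET.Element, **attrs: str) -> ET.Element:
    """Hypothetical helper: add one its:provenanceRecord to an
    its:provenanceRecords element.

    Assumption: the newest record is listed first, so the list order
    reflects when records were added, not when the work was done.
    """
    record = ET.Element(f"{{{ITS_NS}}}provenanceRecord", attrs)
    records.insert(0, record)
    return record


records = ET.Element(f"{{{ITS_NS}}}provenanceRecords")

# First annotation: identifies the translating organisation. Any timing
# detail would live in the external record that provRef points to.
add_provenance_record(
    records,
    org="Example Translations Ltd",
    provRef="http://example.com/prov/translation-001",
)

# Second annotation, added later: identifies the reviser.
add_provenance_record(
    records,
    revPerson="Jane Reviser",
    provRef="http://example.com/prov/revision-002",
)

xml_text = ET.tostring(records, encoding="unicode")
```

The ITS markup itself carries no date or time here. A consumer that 
needs timing would dereference each provRef and read it from the 
external record.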

Chase, Kevin, does this address your comment?

Dave Lewis

Received on Wednesday, 23 January 2013 01:40:28 UTC
