Re: [ACTION-79]Consider consolidation of status-related data categories and process trigger

Hi Pedro,
Sorry, I hadn't yet filled in the details of how I thought this might work 
for cache status, which would simply be:

  * The original content is not saved in the cache (i.e., it is new or
    has been updated): (re)translation is needed

The source document or element would have the attributes:

ready-to-process  = cache-source
ready-at = <the time at which it would be ready to cache>

  * The translated content is not saved in the cache (i.e., it has not
    been previously translated or has expired): translation is needed

The translation document or element would have the attributes:

ready-to-process = cache-target
ready-at = <the time at which it would be ready to cache>

  * Neither the original nor the translated page are saved in the cache:
    both need to be cached

You could either have both of the above or, in cases where the source and 
target are in the same file, use:

ready-to-process = cache-source-and-target
ready-at = <the time at which it would be ready to cache>

Note that there is also a 'revised' flag that could be used if helpful.

So, if I understand this correctly, I think the readiness attributes would 
provide equivalent metadata. However, if you think this is a distinct 
use case, i.e. implementers of cacheStatus would be interested 
specifically and only in that functionality and would be unlikely to 
also implement a more general readiness data category, then we should 
definitely consider a separate data category.
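
To make the mapping above concrete, here is a minimal sketch in Python. The attribute names (ready-to-process, ready-at) and the three cache-* values are the ones listed in this mail. The function name, the boolean inputs, the same_file switch and the ISO 8601 timestamp format are assumptions made purely for illustration and are not part of the requirements document.

```python
from datetime import datetime, timezone


def readiness_attributes(source_cached, target_cached, ready_at, same_file=False):
    """Express a cache status as a list of readiness attribute sets.

    Hypothetical helper: only the attribute names and the cache-* values
    are taken from the proposal; everything else is assumed.
    """
    stamp = ready_at.isoformat()  # timestamp format is an assumption

    def attrs(process):
        return {"ready-to-process": process, "ready-at": stamp}

    if not source_cached and not target_cached:
        # Neither is cached: one combined value if both live in the same
        # file, otherwise one attribute set per document.
        if same_file:
            return [attrs("cache-source-and-target")]
        return [attrs("cache-source"), attrs("cache-target")]
    if not source_cached:
        return [attrs("cache-source")]
    if not target_cached:
        return [attrs("cache-target")]
    return []  # both are cached, so nothing needs to be flagged


when = datetime(2012, 5, 8, 1, 0, tzinfo=timezone.utc)
print(readiness_attributes(False, True, when))
```

If the readiness attributes really are equivalent, each of the three cacheStatus cases should be recoverable from the returned values alone, which is the point in question here.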

cheers,
Dave


On 07/05/2012 18:32, Pedro L. Díez Orzas wrote:
>
> Hi Dave,
>
> I will look at it very carefully as soon as I can, since these are 
> really major changes, but a priori I do not understand why we should 
> consolidate and remove cacheStatus, since for me this is 
> completely different metadata from processTrigger, processStatus or 
> any other "status", and it answers completely different requirements.
>
> As I explained in the notes and definition of cacheStatus, this 
> metadata is not for the localization chain or any other localisation 
> process, but for real-time translation systems and their caching 
> needs. In this respect I would put it back as it was (if you want, it 
> can be called just "cache", without "status"). Sorry for any confusion 
> I may have caused about it.
>
> Best,
>
> Pedro
>
> Pedro L. Díez Orzas
> Presidente Ejecutivo/CEO
> Linguaserve Internacionalización de Servicios, S.A.
> Tel.: +34 91 761 64 60
> Fax: +34 91 542 89 28
> E-mail: pedro.diez@linguaserve.com
> www.linguaserve.com
>
>
> ------------------------------------------------------------------------
>
> From: David Lewis [mailto:dave.lewis@cs.tcd.ie]
> Sent: Monday, 7 May 2012 14:51
> To: public-multilingualweb-lt@w3.org
> Subject: Re: [ACTION-79]Consider consolidation of status-related data 
> categories and process trigger
>
> Hi Pedro, Guys,
> Following the previous discussion on the proposal for consolidation 
> around these data categories I have now made the following changes to 
> the requirements document.
>
> Pedro, as discussed on Friday's call, could you and any other 
> interested parties examine these changes and flag any issues on 
> this thread?
>
> 1) I have updated processTrigger and changed its name to 'readiness', as 
> previously discussed:
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#readiness
>
> 2) I have moved the need for a process model to a new requirement to 
> reflect its relevance to several of the other data categories, 
> including readiness, progress-indicator and provenance, and its need 
> for further careful consideration:
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Process_Model
>
> 3) As part of this consolidation I have removed the data categories of:
> processTrigger, cacheStatus, legalStatus, processState, 
> proofreadingState and revision state
>
> 4) I've updated the data category tables and the related interests 
> accordingly
>
> 5) I've highlighted issues (in bold below) to consider about the 
> following properties of the removed processTrigger that are no longer 
> present (as recorded in the notes for the readiness data category)
>
>   * /contentType/, values: MIME or custom values - This indicates the
>     format or type of the content, so that the right filter or
>     normalization rules, and the subsequent processes, can be applied.
>     For example, to express HTML we could use:
>     "contentType: text/html" - *consider consolidation with formatType
>     or languageResource*
>
> >> I do not agree, unless formatType really refers to the computer format 
> and not, as it does now, to the format or service for which the content is 
> produced (e.g., subtitles, spoken text)
>
>   * /sourceLang/ -- value: standard ISO 639 value - this value
>     indicates the source language for the current translation
>     requested. It is different from the sourceLanguage (provenance)
>     data category: sourceLanguage indicates the language of the
>     original source text, whereas sourceLang indicates the current
>     source language to be used for the translation, which can differ
>     from the original source - *this should be considered as an
>     attribute for provenance*
>   * /contentResultSource/ -- value: yes / no. Indicates whether
>     the localisation chain needs to give back the original - *is this
>     necessary as an attribute here or as a separate attribute?*
>   * /contentResultTarget/ -- value: monolingual, multilingual;
>     indicates if the resulting translation, in the cases of several
>     target languages, should be delivered in several monolingual
>     content files or in a single multilingual content file *this would
>     require a more general purpose return file indicator*
>   * /pivotLang/ - value: standard ISO value. Indicates the
>     intermediate language in case one is needed. Two examples: 1)
>     Going from a source language to two language variants (e.g. into
>     Brazilian and European Portuguese), it is more cost-effective to go
>     to one first (this first variant being a "pivot" language) and to
>     revise later into the second variant; 2) Going from one language to
>     another via an intermediate language (e.g. from Maltese into
>     English and from English into Irish, because no direct
>     Maltese-into-Irish translation is available). - *consider
>     consolidation with source language, i.e. it is an attribute of
>     the source language*
>
>
> Regards,
> Dave
>
> On 04/05/2012 01:46, David Lewis wrote:
>
> Hi Moritz, guys,
> I added this progress-indicator data category to the requirements:
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#progress-indicator
>
> Regards,
> Dave
>
> On 28/04/2012 22:11, David Lewis wrote:
>
> Hi Moritz,
> I moved this onto this separate thread related to the relevant 
> consolidation action.
>
> I think there are two different data categories here.
>
> What you describe is a progress indicator. This would be a common 
> feature of a lot of CMS-based and crowdsourced translation tools. It 
> would be measured as the number of segments (or perhaps words) of a 
> document (or a group of documents representing a job) that have been 
> processed, as a proportion of the total that need to be processed.
>
> The other, which is what the current text for 'process state' 
> (http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#process_state) 
> specifies, is an indication of  which point in a process sequence has 
> currently been reached. As discussed, this could be covered by the 
> processTrigger/readiness data category we are discussing.
>
> Moritz, does this distinction match with your view here? If so then we 
> could introduce a new 'progress-indicator' data category requirement, 
> and then continue discussing the consolidation of 'process state' with 
> processTrigger/readiness.
>
> thanks,
> Dave
>
>
> On 27/04/2012 18:40, Moritz Hellwig wrote:
>
> Hello,
>
> I might make this a separate thread, but since we are already talking 
> about processState here...
>
> There were quite a lot of requests from our editorial team to have 
> something like
>
> processIndicator
>
> Values integer, 0 to 100
>
> Zero would be "LSP process not begun"-ish, 100 would be "Completed".
>
> There are - from our point of view - considerable advantages:
>
> A) we can show a process progress indicator (in whichever visual 
> representation) that does not require an understanding of what the 
> actual process phase is on the MT side.
>
> B) the indicator can be agnostic to the number of processes / stages 
> on the side of the LSP. If you run a hundred separate processes or 
> feedback loops: fine by me.
>
> This would be beneficial for e.g. content creators who are unfamiliar 
> with the language technology, its processes and so on. Also, it would 
> allow us to build dashboards and generate reports, e.g. to show and 
> sort by progression and keep better track of multilingual projects.
>
> Any thoughts?
>
> Cheers,
>
> Moritz
>
>
>
> On 27.04.2012, at 01:14, "David Lewis" <dave.lewis@cs.tcd.ie> wrote:
>
>> Pedro,
>> Yes, the redundancy of process state is one outcome of what I'm 
>> proposing here.
>>
>> The key difference is that the proposal is that the data category 
>> indicates the next process that should be performed, rather than 
>> indicating the current process in operation. The motivation is that 
>> the readiness to undergo a new process step is more useful to a 
>> document in a CMS than knowing the current state that is operating 
>> on it.
>>
>> Complementary to this, provenance indicates that a process is 
>> completed, and associated with this records useful information needed 
>> to monitor correct or efficient process operation, perhaps as needed 
>> to monitor a service level agreement.
>>
>> Neither process trigger nor provenance, however, actually aims to control 
>> process flow. This is a complex topic, which is therefore probably out 
>> of scope.
>>
>> What we do need however, is a way of defining  the values to use for 
>> referencing processes, i.e. from both the 'request-process' and the 
>> process reference in provenance. For this we may want both a default 
>> set in the standard, and a way of unambiguously defining these for a 
>> particular business case. The key thing in any one case of 
>> interoperability is that the interoperating implementations exchange 
>> and understand the _same set_ of process values.
>>
>> Let's keep the discussion going on the list,
>> Dave
>>
>> On 26/04/2012 15:29, Pedro L. Díez Orzas wrote:
>>
>> Hi David,
>>
>> I need to consider this more carefully.
>>
>> But what I see is that *process state* is perhaps redundant 
>> with proofreading state or revision state, since these can be values 
>> of process state: proofread, revised, reviewed, translated, localized...
>>
>> Best,
>>
>> Pedro
>>
>> ------------------------------------------------------------------------
>>
>> From: David Lewis [mailto:dave.lewis@cs.tcd.ie]
>> Sent: Thursday, 26 April 2012 1:52
>> To: public-multilingualweb-lt@w3.org
>> Subject: Re: [all] Discussion on proposed metadata categories: 
>> approvalStatus
>>
>> Hi Moritz,
>> I think you make a very good general point here. It may be a bit too 
>> open-ended to specify data categories that hardwire the completion of 
>> a specific step. We would run into the same issues with 
>> defining the different process values that we discussed around process 
>> trigger. Also, it's not clear to me that all the status flags suggested 
>> for current steps, e.g. legal approval, really need to be separated 
>> from other steps.
>>
>> I think therefore we could generalise this as part of the process 
>> trigger data category as you suggest. This could allow us to 
>> consolidate *approvalStatus*, *cacheStatus*, *legalStatus*, 
>> *proofReading state* and *revision state* (and delegate the 
>> definition of these steps to data values rather than individual data 
>> categories). We can address *cacheStatus*, and at the same time 
>> generalise it to processes other than just translation, by including 
>> the time stamp and a revision flag.
>>
>> Also, I think the priority data category should be included here, as 
>> translation could consist of many different processes in combination, 
>> so its semantics depend on which one is meant. At the same time we may 
>> also be interested in defining priorities even for non-translation 
>> activities, such as review.
>>
>> *requested-process* (which has the name of the next process requested)
>>
>> *process-ref* (which may allow us to point to an external set of 
>> process definitions used for requested-process if the default value 
>> set is not used)
>>
>> *ready-at* (defines the time the content is ready for the process; it 
>> could be some time in the past, or some time in the future - this 
>> supports part of the cacheStatus function)
>>
>> *revised* (yes/no - indicates whether this is a different version of 
>> content that was previously marked as ready for the declared process)
>>
>> *priority* (I think for now we should keep this simple and just have 
>> values high/low )
>>
>> *complete-by* (provides a target date-time for completing the process)
>>
>> Any thoughts on this suggestion? Pedro, Ryan, Moritz, Des, I think 
>> this impacts on data categories you have an interest in.
>>
>> Also, DavidF, Pedro, Ryan, do you think this makes *process state* 
>> redundant? As a status flag, are we more interested in what process to 
>> do next, rather than which one is finished? At the same time the 
>> provenance data category could tell us which processes have already 
>> finished operating on the content.
>>
>> cheers,
>> Dave
>>
>>
>> On 24/04/2012 11:11, Moritz Hellwig wrote:
>>
>> to identify publication process metadata which might also be relevant 
>> for the LSP. I ran into a couple of questions though.
>>
>> I'll use approvalStatus as an example (from the requirements document):
>>
>> >> approvalStatus
>>
>> >> Information about the status of the content in a formal approval 
>> workflow
>>
>> >> Indicates whether the content has been approved for release
>>
>> >> Possible values:
>>
>> >>>> yes
>>
>> >>>> no
>>
>> Approval can have many values which are rarely only "release yes|no" 
>> and they can be client/application-specific. However, none of these 
>> statuses seem to be relevant to the LSP, as they only precede or 
>> succeed the LSP's processes.
>>

Received on Tuesday, 8 May 2012 01:00:41 UTC