[ACTION-79]Consider consolidation of status-related data categories and process trigger from David Lewis on 2012-04-28 (public-multilingualweb-lt@w3.org from April 2012)

From: David Lewis <dave.lewis@cs.tcd.ie>
Date: Sat, 28 Apr 2012 22:11:56 +0100
To: public-multilingualweb-lt@w3.org
Message-ID: <4F9C5D1C.7070000@cs.tcd.ie>
Hi Moritz,
I've moved this to a separate thread for the relevant 
consolidation action.

I think there are two different data categories here.

What you describe is a progress indicator. This would be a common 
feature on a lot of CMS-based and crowdsourced translation tools. It 
would be measured as the number of segments (or perhaps words) of a 
document (or a group of documents representing a job) that have been 
processed, as a proportion of the total that need to be processed.
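
As a rough illustration only, the proportion could be computed as in the 
sketch below. The function name is hypothetical, the 0 to 100 integer 
range is taken from Moritz's suggestion quoted further down, and rounding 
down (so that 100 is only reported when everything is processed) is my 
assumption rather than anything agreed.

```python
def progress_percent(processed: int, total: int) -> int:
    """Return progress as an integer from 0 to 100.

    `processed` and `total` count segments (or words) in a document
    or in a group of documents representing a job.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= processed <= total:
        raise ValueError("processed must be between 0 and total")
    # Integer division rounds down, so 100 means fully processed
    # (an assumption, not part of any agreed definition).
    return (processed * 100) // total
```

For example, a job with 10 of 40 segments processed would report 25.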

The other, which is what the current text for 'process state' 
(http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#process_state) 
specifies, is an indication of which point in a process sequence has 
currently been reached. As discussed, this could be covered by the 
processTrigger/readiness data category we are discussing.

Moritz, does this distinction match with your view here? If so then we 
could introduce a new 'progress-indicator' data category requirement, 
and then continue discussing the consolidation of 'process state' with 
processTrigger/readiness.
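
For reference, the processTrigger/readiness fields proposed earlier in 
this thread (requested-process, process-ref, ready-at, revised, priority, 
complete-by) could be modelled roughly as below. This is only a sketch: 
the class name, the Python types, the default priority of "low" and the 
rule that a missing ready-at means "ready now" are all my assumptions and 
not part of the proposal.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ProcessTrigger:
    """Hypothetical model of the proposed process trigger fields."""

    requested_process: str                  # requested-process: next process requested
    process_ref: Optional[str] = None       # process-ref: external set of process definitions
    ready_at: Optional[datetime] = None     # ready-at: when content is ready for the process
    revised: bool = False                   # revised: yes/no
    priority: str = "low"                   # priority: high/low (default is an assumption)
    complete_by: Optional[datetime] = None  # complete-by: target date-time

    def __post_init__(self) -> None:
        # The thread suggests keeping priority to just high/low for now.
        if self.priority not in ("high", "low"):
            raise ValueError("priority must be 'high' or 'low'")

    def is_ready(self, now: datetime) -> bool:
        # Assumption: no ready-at value means the content is ready immediately.
        return self.ready_at is None or self.ready_at <= now
```

Note that nothing here records progress through a process, which is why 
I see the progress indicator as a separate data category.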

thanks,
Dave


On 27/04/2012 18:40, Moritz Hellwig wrote:
> Hello,
>
> I might make this a separate thread, but since we are already talking 
> about processState here...
>
> There were quite a lot of requests from our editorial team to have 
> something like
>
> processIndicator
> Values integer, 0 to 100
>
> Zero would be "LSP process not begun"-ish, 100 would be "Completed".
>
> There are - from our point of view - considerable advantages:
> A) we can show a process progress indicator (in whichever visual 
> representation) that does not require an understanding of what the 
> actual process phase is on the MT side.
> B) the indicator can be agnostic to the number of processes / stages 
> on the side of the LSP. If you run a hundred separate processes or 
> feedback loops: fine by me.
>
> This would be beneficial for e.g. content creators who are unfamiliar 
> with the language technology, its processes and so on. Also, it would 
> allow us to build dashboards and generate reports e.g. to show and 
> sort by progression & keep better track of multilingual projects.
>
> Any thoughts?
>
> Cheers,
> Moritz
>
> On 27.04.2012, at 01:14, "David Lewis" <dave.lewis@cs.tcd.ie> wrote:
>
>> Pedro,
>> Yes, the redundancy of process state is one outcome of what I'm 
>> proposing here.
>>
>> The key difference is that the proposal is that the data category 
>> indicates the next process that should be performed, rather than 
>> indicating the current process in operation. The motivation is that 
>> the readiness to undergo a new process step is more useful to a 
>> document in a CMS, than knowing the current state that is operating 
>> on it.
>>
>> Complementary to this, provenance indicates that a process is 
>> completed, and associated with this records useful information needed 
>> to monitor correct or efficient process operation, perhaps as needed 
>> to monitor a service level agreement.
>>
>> Neither process trigger nor provenance, however, actually aims to 
>> control process flow. This is a complex topic which is therefore 
>> probably out of scope.
>>
>> What we do need, however, is a way of defining the values to use for 
>> referencing processes, i.e. from both the 'request-process' and the 
>> process reference in provenance. For this we may want both a default 
>> set in the standard, and a way of unambiguously defining these for a 
>> particular business case. The key thing in any one case of 
>> interoperability is that the interoperating implementations exchange 
>> and understand the _same set_ of process values.
>>
>> Let's keep the discussion going on the list,
>> Dave
>>
>> On 26/04/2012 15:29, Pedro L. Díez Orzas wrote:
>>>
>>> Hi David,
>>>
>>> I need to consider this more carefully.
>>>
>>> But what I see is that *process state* is perhaps redundant 
>>> with: proofreading state or revision state, since these can be values 
>>> of process state: proofread, revised, reviewed, translated, localized…
>>>
>>> Best,
>>>
>>> Pedro
>>>
>>> ------------------------------------------------------------------------
>>>
>>> *From:* David Lewis [mailto:dave.lewis@cs.tcd.ie]
>>> *Sent:* Thursday, 26 April 2012 1:52
>>> *To:* public-multilingualweb-lt@w3.org
>>> *Subject:* Re: [all] Discussion on proposed metadata categories: 
>>> approvalStatus
>>>
>>> Hi Moritz,
>>> I think you make a very good general point here. It may be a bit too 
>>> open ended to specify data categories that hardwire the completion 
>>> of a specific step. We would run into the same issues we have with 
>>> defining the different process values as we discussed around process 
>>> trigger. Also, it's not clear to me that all the status flags 
>>> suggested for current steps, e.g. legal approval, really need to be 
>>> separated from other steps.
>>>
>>> I think therefore we could generalise this as part of the process 
>>> trigger data category as you suggest. This could allow us to 
>>> consolidate *approvalStatus*, *cacheStatus*, *legalStatus*, 
>>> *proofReading state* and *revision state* (and delegate the 
>>> definition of these steps to data values rather than individual data 
>>> categories). We can address *cacheStatus*, and at the same time 
>>> generalise it to other processes than just translation, by including 
>>> the time stamp and a revision flag.
>>>
>>> Also, I think the priority data category should be included here, as 
>>> translation could consist of many different processes in 
>>> combination, so its semantics depend on which one is meant. At the 
>>> same time we may also be interested in defining priorities even for 
>>> non-translation activities, such as review.
>>>
>>> *requested-process* (which has the name of the next process requested)
>>>
>>> *process-ref* (which may allow us to point to an external set of 
>>> process definitions used for processRequested if the default value 
>>> set is not used)
>>>
>>> *ready-at* (defines the time the content is ready for the process; 
>>> it could be some time in the past, or some time in the future. This 
>>> supports part of the cacheStatus function)
>>>
>>> *revised* (yes/no: indicates whether this is a different version of 
>>> content that was previously marked as ready for the declared process)
>>>
>>> *priority* (I think for now we should keep this simple and just have 
>>> values high/low )
>>>
>>> *complete-by* (provides a target date-time for completing the process)
>>>
>>> Any thoughts on this suggestion? Pedro, Ryan, Moritz, Des, I think 
>>> this impacts on data categories you have an interest in.
>>>
>>> Also, DavidF, Pedro, Ryan, do you think this makes *process state* 
>>> redundant? As a status flag, are we more interested in what process 
>>> to do next, rather than which one is finished? At the same time the 
>>> provenance data category could tell us which processes have already 
>>> finished operating on the content.
>>>
>>> cheers,
>>> Dave
>>>
>>>
>>> On 24/04/2012 11:11, Moritz Hellwig wrote:
>>>
>>> to identify publication process metadata which might also be 
>>> relevant for the LSP. I ran into a couple of questions though.
>>>
>>> I’ll use approvalStatus as an example (from the requirements document):
>>>
>>> >> approvalStatus
>>>
>>> >> Information about the status of the content in a formal approval 
>>> workflow
>>>
>>> >> Indicates whether the content has been approved for release
>>>
>>> >> Possible values:
>>>
>>> >>>> yes
>>>
>>> >>>> no
>>>
>>> Approval can have many values which are rarely only “release yes|no” 
>>> and they can be client/application-specific. However, none of these 
>>> statuses seem to be relevant to the LSP, as they only precede or 
>>> succeed the LSP’s processes.
>>>
>>
Received on Saturday, 28 April 2012 21:12:32 UTC