- From: Felix Sasaki <fsasaki@w3.org>
- Date: Wed, 9 May 2012 08:46:40 +0200
- To: Pedro L. Díez Orzas <pedro.diez@linguaserve.com>
- Cc: David Lewis <dave.lewis@cs.tcd.ie>, public-multilingualweb-lt@w3.org
- Message-ID: <CAL58czq-BM97HN8CcN7mRNH8hQ_ncJyvu-9OBZfWtJO0zfr=tA@mail.gmail.com>
Pedro, all,

I am wondering if this discussion could benefit from input from an HTTP expert. I have the feeling that the existing HTTP headers might be sufficient to realize this requirement. Do you mind if I bring Yves Lafon (http://www.w3.org/People/all#ylafon) into the loop?

Felix

2012/5/8 Pedro L. Díez Orzas <pedro.diez@linguaserve.com>

> Dear Dave,
>
> First of all, thank you for the consolidation task, which is hard, complex
> and "risky business" :-).
>
> I would like to distinguish between cacheStatus and the rest.
>
> About this specific case of cache status, I probably now understand the
> confusion. In your mail in the thread "Re: targetPointer Requirement
> update" (08/05/2012 13:49), you mention "ii) a realtime translation
> workflow, where content is put on a cache (I prefer perhaps a term like
> 'staging server' to avoid confusion with 'web cache')". However, the
> data category cacheStatus is not intended for content that is staged or
> hidden on the client's server, but for the source content, the translated
> content, or both on the side of the real-time translation server.
> Actually, I did not consider the staging server in this, and it should
> probably be handled in the way you suggest in your mail. The confusion
> was certainly my fault, since I described the cases as:
>
> - The original content is not saved in the cache (i.e., it is new or
>   has been updated): (re)translation is needed
> - The translated content is not saved in the cache (i.e., it has not
>   been previously translated or has expired): translation is needed
> - Neither the original nor the translated page is saved in the cache:
>   both need to be cached
>
> This refers not to the client side or the CMS, but to the Real Time
> Translation System (RTTS), which actually generates the web cache.
> For example, the timestamp value is not set by the client, as in
> ready-at = <the time at which it would be ready to cache>, but by the
> RTTS when it does the caching. In that respect, the client indicates in
> the final HTML web page the values and whether a page or a part of a page
> needs to be cached or not, and whether this applies to the source, the
> target or both:
>
> - cached - values: yes, no
> - scope - values: source, target, both
> - timestamp - date and time
>
> In this scenario, the source pages (or parts of pages) are always
> translated in real time, and the translated pages (or parts) can be added
> to the cache to speed up future accesses. Some pages, however, not only
> do not need to be cached but must not be cached (for example, pages in
> private areas, or transactional pages of an e-commerce process or a
> bank…).
>
> I cannot say with 100% certainty whether "implementors who would
> implement the cacheStatus are specifically only interested in that
> functionality and would be unlikely to also implement a more general
> readiness data category", but even if it is 50% I would keep it separate,
> in the same way as others in the "Internationalization" section. It is
> really multilingualWebCache metadata in the pages for the end user's
> navigation.
>
> I hope this helps, and I will try to answer the rest before Thursday's
> meeting.
>
> Best,
> Pedro
>
> ------------------------------
>
> From: David Lewis [mailto:dave.lewis@cs.tcd.ie]
> Sent: Tuesday, 8 May 2012 3:00
> To: "Pedro L. Díez Orzas"
> CC: public-multilingualweb-lt@w3.org
> Subject: Re: [ACTION-79]Consider consolidation of status-related data
> categories and process trigger
>
> Hi Pedro,
> Sorry, I didn't yet fill in the details of how I thought this might work
> for cache status, which would simply be:
>
> - The original content is not saved in the cache (i.e., it is new or
>   has been updated): (re)translation is needed
>
> the source document or element would have attributes:
>
> ready-to-process = cache-source
> ready-at = <the time at which it would be ready to cache>
>
> - The translated content is not saved in the cache (i.e., it has not
>   been previously translated or has expired): translation is needed
>
> the translation document or element would have attributes:
>
> ready-to-process = cache-target
> ready-at = <the time at which it would be ready to cache>
>
> - Neither the original nor the translated page is saved in the cache:
>   both need to be cached
>
> you could either have both the above, or in cases where the source and
> target are in the same file use:
>
> ready-to-process = cache-source-and-target
> ready-at = <the time at which it would be ready to cache>
>
> Note, there is a revised flag there that could also be used if useful.
>
> So, if I understand this right, I think the readiness attributes would
> provide equivalent metadata. However, if you think this is a distinct use
> case, i.e. implementors who would implement the cacheStatus are
> specifically only interested in that functionality and would be unlikely
> to also implement a more general readiness data category, then we should
> definitely be considering a separate data category.
>
> cheers,
> Dave
>
> On 07/05/2012 18:32, Pedro L. Díez Orzas wrote:
>
> Hi Dave,
>
> I will look at it very carefully as soon as I can, since these are really
> major changes, but a priori I do not understand why cacheStatus should be
> consolidated and removed, since for me this is completely different
> metadata from processTrigger, processStatus or other "status" categories,
> and it answers completely different requirements.
>
> As I explained in the notes and definition of cacheStatus, this metadata
> is not for the localization chain or any other localisation process, but
> for real-time translation systems and their caching needs. In this
> respect I would put it back as it was (if you want, it can be called just
> "cache", without "status"), and sorry for any confusion I may have caused
> about it.
>
> Best,
> Pedro
>
> __________________________________
>
> Pedro L. Díez Orzas
> Executive President/CEO
> Linguaserve Internacionalización de Servicios, S.A.
> Tel.: +34 91 761 64 60
> Fax: +34 91 542 89 28
> E-mail: pedro.diez@linguaserve.com
> www.linguaserve.com
>
> "According to the provisions set forth in articles 21 and 22 of Law
> 34/2002 of July 11 regarding Information Society and eCommerce
> Services, we will store and use your personal data with the sole purpose of
> marketing the products and services offered by LINGUASERVE
> INTERNACIONALIZACIÓN DE SERVICIOS, S.A. If you do not wish your personal
> data to be stored and handled, or you do not wish to receive further
> information regarding products and services offered by our company, please
> e-mail us at clients@linguaserve.com. Your request will be processed
> immediately."
>
> __________________________________
>
> ------------------------------
>
> From: David Lewis [mailto:dave.lewis@cs.tcd.ie]
> Sent: Monday, 7 May 2012 14:51
> To: public-multilingualweb-lt@w3.org
> Subject: Re: [ACTION-79]Consider consolidation of status-related data
> categories and process trigger
>
> Hi Pedro, Guys,
> Following the previous discussion on the proposal for consolidation around
> these data categories I have now made the following changes to the
> requirements document.
>
> Pedro, as discussed on Friday's call, could you and any other interested
> parties examine these changes and flag any issues on this thread.
>
> 1) I have updated processTrigger and changed its name to 'readiness' as
> previously discussed:
>
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#readiness
>
> 2) I have moved the need for a process model to a new requirement to
> reflect its relevance to several of the other data categories, including
> readiness, progress-indicator and provenance, and its need for further
> careful consideration:
>
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Process_Model
>
> 3) As part of this consolidation I have removed the data categories
> processTrigger, cacheStatus, legalStatus, processState, proofreadingState
> and revision state.
>
> 4) I've updated the data category tables and the related interests
> accordingly.
>
> 5) I've highlighted issues (in bold below) to consider about the following
> properties of the removed processTrigger that are no longer present (as
> recorded in the notes for the readiness data category):
>
> - *contentType*, values: MIME or custom values - This indicates the
>   format or the type of the content, in order to apply the right filter
>   or normalization rules and the subsequent processes. For example, to
>   express HTML we could use "contentType: text/html". *consider
>   consolidation with formatType or languageResource*
>
> >> I do not agree, unless formatType really refers to the computer format
> >> and not, as now, to the format or service for which the content is
> >> produced (e.g., subtitles, spoken text)
>
> - *sourceLang* - value: standard ISO 639 value - this value indicates
>   the source language for the current translation requested. It is
>   different from the sourceLanguage (provenance) data category, since
>   that indicates the language of the original source text, whereas
>   sourceLang indicates the current source language to be used for the
>   translation, which can be different from the original source - *this
>   should be considered as an attribute for provenance*
> - *contentResultSource* - value: yes / no. Indicates whether the
>   localisation chain needs to give back the original - *is this
>   necessary as an attribute here or as a separate attribute*
> - *contentResultTarget* - value: monolingual, multilingual; indicates
>   whether the resulting translation, in the case of several target
>   languages, should be delivered in several monolingual content files or
>   in a single multilingual content file - *this would require a more
>   general purpose return file indicator*
> - *pivotLang* - value: standard ISO value. Indicates the intermediate
>   language in case one is needed. Two examples: 1) Going from a source
>   language to two language variants (e.g. into Brazilian and European
>   Portuguese), it is more cost-effective to go to one first (this first
>   variant being a "pivot" language) and to revise later into the second
>   variant; 2) Going from one language to another via an intermediate
>   language (e.g. from Maltese into English and from English into Irish,
>   because there is no direct Maltese into Irish translation available).
>   - *consider consolidation with source language, i.e. it is an
>   attribute of the source language*
>
> Regards,
> Dave
>
> On 04/05/2012 01:46, David Lewis wrote:
>
> Hi Moritz, guys,
> I added this progress-indicator data category to the requirements:
>
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#progress-indicator
>
> Regards,
> Dave
>
> On 28/04/2012 22:11, David Lewis wrote:
>
> Hi Moritz,
> I moved this onto this separate thread related to the relevant
> consolidation action.
>
> I think there are two different data categories here.
>
> What you describe is a progress indicator. This would be a common feature
> on a lot of CMS-based and crowdsourced translation tools. It would be
> measured as the number of segments (or perhaps words) of a document (or a
> group of documents representing a job) that have been processed, as a
> proportion of the total that need to be processed.
>
> The other, which is what the current text for 'process state'
> (http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#process_state)
> specifies, is an indication of which point in a process sequence has
> currently been reached. As discussed, this could be covered by the
> processTrigger/readiness data category we are discussing.
>
> Moritz, does this distinction match your view here? If so, then we
> could introduce a new 'progress-indicator' data category requirement, and
> then continue discussing the consolidation of 'process state' with
> processTrigger/readiness.
>
> thanks,
> Dave
>
> On 27/04/2012 18:40, Moritz Hellwig wrote:
>
> Hello,
>
> I might make this a separate thread, but since we are already talking
> about processState here...
>
> There were quite a lot of requests from our editorial team to have
> something like
>
> processIndicator
> Values integer, 0 to 100
>
> Zero would be "LSP process not begun"-ish, 100 would be "Completed".
>
> There are - from our point of view - considerable advantages:
>
> A) we can show a process progress indicator (in whichever visual
> representation) that does not require an understanding of what the actual
> process phase is on the MT side.
>
> B) the indicator can be agnostic to the number of processes / stages on
> the side of the LSP.
> If you run a hundred separate processes or feedback loops: fine by me.
>
> This would be beneficial for e.g. content creators who are unfamiliar with
> the language technology, its processes and so on. Also, it would allow us
> to build dashboards and generate reports, e.g. to show and sort by
> progression and keep better track of multilingual projects.
>
> Any thoughts?
>
> Cheers,
> Moritz
>
> On 27.04.2012, at 01:14, "David Lewis" <dave.lewis@cs.tcd.ie> wrote:
>
> Pedro,
> Yes, the redundancy of process state is one outcome of what I'm proposing
> here.
>
> The key difference is that, in the proposal, the data category
> indicates the next process that should be performed, rather than indicating
> the current process in operation. The motivation is that the readiness to
> undergo a new process step is more useful to a document in a CMS than
> knowing the current process that is operating on it.
>
> Complementary to this, provenance indicates that a process is completed,
> and associated with this records useful information needed to monitor
> correct or efficient process operation, perhaps as needed to monitor a
> service level agreement.
>
> Neither process trigger nor provenance, however, actually aims to control
> process flow. This is a complex topic which therefore is probably out of
> scope.
>
> What we do need, however, is a way of defining the values to use for
> referencing processes, i.e. from both the 'request-process' and the process
> reference in provenance. For this we may want both a default set in the
> standard, and a way of unambiguously defining these for a particular
> business case. The key thing in any one case of interoperability is that
> the interoperating implementations exchange and understand the _same set_
> of process values.
>
> Let's keep the discussion going on the list,
> Dave
>
> On 26/04/2012 15:29, Pedro L. Díez Orzas wrote:
>
> Hi David,
>
> I need to consider this more carefully.
>
> But what I see is that process state is perhaps redundant with
> proofreading state or revision state, since these can be values of
> process state: proofread, revised, reviewed, translated, localized…
>
> Best,
> Pedro
>
> ------------------------------
>
> From: David Lewis [mailto:dave.lewis@cs.tcd.ie]
> Sent: Thursday, 26 April 2012 1:52
> To: public-multilingualweb-lt@w3.org
> Subject: Re: [all] Discussion on proposed metadata categories:
> approvalStatus
>
> Hi Moritz,
> I think you make a very good general point here. It may be a bit too
> open-ended to specify data categories that hardwire the completion of a
> specific step. We would run into the same issues we have with defining the
> different process values as we discussed around process trigger. Also,
> it's not clear to me that all the status flags suggested for current
> steps, e.g. legal approval, really need to be separated from other steps.
>
> I think therefore we could generalise this as part of the process trigger
> data category as you suggest. This could allow us to consolidate
> *approvalStatus*, *cacheStatus*, *legalStatus*, *proofReading state* and
> *revision state* (and delegate the definition of these steps to data
> values rather than individual data categories). We can address
> *cacheStatus*, and at the same time generalise it to other processes than
> just translation, by including the time stamp and a revision flag.
>
> Also, I think the priority data category should be included here, as
> translation could consist of many different processes in combination, so
> its semantics are dependent on which one.
> At the same time we may also be interested in defining priorities even
> for non-translation activities, such as review.
>
> *requested-process* (which has the name of the next process requested)
>
> *process-ref* (which may allow us to point to an external set of process
> definitions used for processRequested if the default value set is not used)
>
> *ready-at* (defines the time the content is ready for the process; it
> could be some time in the past, or some time in the future - this supports
> part of the cacheStatus function)
>
> *revised* (yes/no - indicates whether this is a different version of
> content that was previously marked as ready for the declared process)
>
> *priority* (I think for now we should keep this simple and just have
> values high/low)
>
> *complete-by* (provides a target date-time for completing the process)
>
> Any thoughts on this suggestion? Pedro, Ryan, Moritz, Des, I think this
> impacts on data categories you have an interest in.
>
> Also, DavidF, Pedro, Ryan, do you think this makes *process state*
> redundant? As a status flag, are we more interested in what process to do
> next, rather than which one is finished? At the same time the provenance
> data category could tell us which processes have already finished operating
> on the content.
>
> cheers,
> Dave
>
> On 24/04/2012 11:11, Moritz Hellwig wrote:
>
> to identify publication process metadata which might also be relevant for
> the LSP.
> I ran into a couple of questions though.
>
> I'll use approvalStatus as an example (from the requirements document):
>
> >> approvalStatus
> >> Information about the status of the content in a formal approval
> >> workflow
> >> Indicates whether the content has been approved for release
> >> Possible values:
> >>>> yes
> >>>> no
>
> Approval can have many values, which are rarely just "release yes|no", and
> they can be client- or application-specific. However, none of these
> statuses seem to be relevant to the LSP, as they only precede or succeed
> the LSP's processes.

--
Felix Sasaki
DFKI / W3C Fellow
Received on Wednesday, 9 May 2012 06:47:09 UTC
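
For readers following the cacheStatus discussion, here is a minimal sketch of how the three proposed values (cached, scope, timestamp) might be modelled, and how they could line up with the ready-to-process values Dave suggests. Everything named here (`CacheStatus`, `Scope`, `readiness_value`, `stamp_on_caching`) is hypothetical and not taken from the requirements document; the mapping is only one possible reading, and whether the two data categories are really equivalent is exactly what the thread leaves open.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Scope(Enum):
    """Values of the proposed 'scope' field: source, target, both."""
    SOURCE = "source"
    TARGET = "target"
    BOTH = "both"


@dataclass(frozen=True)
class CacheStatus:
    """Hypothetical in-memory form of the cacheStatus metadata on a page or part of a page."""
    cached: bool                          # 'cached' - values: yes, no
    scope: Optional[Scope] = None         # 'scope' - values: source, target, both
    timestamp: Optional[datetime] = None  # set by the RTTS when it caches, not by the client


# Assumed correspondence with the ready-to-process values from Dave's mail.
_READINESS = {
    Scope.SOURCE: "cache-source",
    Scope.TARGET: "cache-target",
    Scope.BOTH: "cache-source-and-target",
}


def readiness_value(status: CacheStatus) -> Optional[str]:
    """Return the ready-to-process value for a status, or None if nothing is to be cached."""
    if not status.cached or status.scope is None:
        return None
    return _READINESS[status.scope]


def stamp_on_caching(status: CacheStatus, now: Optional[datetime] = None) -> CacheStatus:
    """Return a copy with the timestamp filled in, as the RTTS would do when it caches.

    Pages marked cached=no (e.g. private or transactional pages) are returned unchanged.
    """
    if not status.cached:
        return status
    return replace(status, timestamp=now or datetime.now(timezone.utc))


if __name__ == "__main__":
    page = CacheStatus(cached=True, scope=Scope.BOTH)
    print(readiness_value(page))
    private_page = CacheStatus(cached=False)
    print(readiness_value(private_page))
```

The sketch keeps the timestamp out of the client's hands on purpose: in Pedro's description the client only declares cached and scope, and the RTTS records the time when it actually caches. A page marked cached=no has no readiness counterpart in this mapping, which is one concrete place where the two proposals may differ.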