- From: Dave Lewis <dave.lewis@cs.tcd.ie>
- Date: Thu, 10 May 2012 01:33:14 +0100
- To: "Pedro L. Díez Orzas" <pedro.diez@linguaserve.com>
- CC: 'Arle Lommel' <arle.lommel@dfki.de>, 'Felix Sasaki' <fsasaki@w3.org>, public-multilingualweb-lt@w3.org
- Message-ID: <4FAB0CCA.8090208@cs.tcd.ie>
Hi Pedro,

Yes, I think that I misunderstood your intention with this data
category - my apologies. I was thinking of it as a client-side cache,
just to help consistent version management on the client side. But if I
understand this correctly, you are talking about a cache associated with
the MT service, which is useful so that the same string from a source
document doesn't need to be translated again by the MT engine, just
looked up from the cache - is that right?

I'm of course very happy to discuss this further to make sure we address
the use case correctly.

cheers,
Dave

On 09/05/2012 10:53, Pedro L. Díez Orzas wrote:
>
> Dear Felix, Arle, Dave, all,
>
> Of course, any expert contribution is always welcome, so go ahead if
> you think it is helpful.
>
> Nevertheless, I completely agree with Arle and again, as far as we
> know, the HTTP headers provide information about caching on the
> client's server side (so they could be used for the case that Dave
> pointed out about a /staging server/ on the client side), but they do
> not provide metadata to tell an external real-time translation _and
> publication_ system whether it has to cache certain web content, or
> part of a page, after it has been translated and published in real
> time. This is content metadata, provided by the client server at
> several content levels, to be used for caching by an external system.
>
> In any case, we are already spending a lot of time on this, when this
> metadata is meant for emerging technologies that not many people use
> yet (but we are convinced they will), and that we can manage directly
> with our clients (we are actually doing it with real-life clients).
>
> I consider it more productive to go ahead with discussions about other
> data categories (like processTrigger or readiness, or others) that are
> much more widespread (in localization chains, for instance), so if you
> think this "cacheStatus" (whose name is probably not the best; it
> should rather express something like "indicator from clients of
> whether the web content has to be cached by real-time translation and
> publication systems") is not clear enough (I think I have already
> explained our position on this sufficiently), let's drop it.
>
> Best,
>
> Pedro
>
> ------------------------------------------------------------------------
>
> *From:* Arle Lommel [mailto:arle.lommel@dfki.de]
> *Sent:* Wednesday, 9 May 2012 10:28
> *To:* Felix Sasaki
> *CC:* Pedro L. Díez Orzas; David Lewis; public-multilingualweb-lt@w3.org
> *Subject:* Re: [ACTION-79]Consider consolidation of status-related data
> categories and process trigger
>
> Hi Felix,
>
> I think that with the intended scenario Pedro proposed the HTTP
> headers would not be granular enough. The cacheStatus could apply as
> far down as the segment level, although the more likely scenario is
> for it to apply at either the document level or the equivalent of the
> DITA topic level. Since a web page could potentially pull multiple
> topics into one place, the document itself would have a mix of cache
> statuses depending on the cache status of the objects it references.
> Perhaps Pedro can clarify, but even if that is the case, I don't think
> it would hurt for Yves to get involved, so I'd say to go for bringing
> him in.
>
> -Arle
>
> Felix Sasaki wrote on May 9, 2012 at 08:46:
>
> Pedro, all,
>
> I am wondering if this discussion could benefit from input of an HTTP
> expert. I have the feeling that the existing HTTP headers might be
> sufficient to realize this requirement. Do you mind if I take Yves Lafon
>
> http://www.w3.org/People/all#ylafon
>
> into the loop?
>
> Felix
>
> 2012/5/8 Pedro L. Díez Orzas <pedro.diez@linguaserve.com>
>
> Dear Dave,
>
> First of all, thank you for the consolidation task, which is hard,
> complex and "risky business" :-).
>
> I would like to distinguish between cacheStatus and the rest.
>
> About this specific case of cache status, I probably now understand
> the confusion. In your mail in the thread "Re: targetPointer
> Requirement update" (08/05/2012 13:49), you mention "/ii) a
> realtime translation workflow, where content is put on a cache (I
> prefer perhaps a term like 'staging server' to avoid confusion with
> 'web cache')/". However, the data category cacheStatus is not intended
> for content that is /staging/ or /hidden/ on the client server, but
> for the source content, the translated content, or both, on the side
> of the real-time translation server. Actually, I did not consider the
> /staging server/ in this, and it should probably be handled in the way
> you suggest in your mail. Certainly the confusion was my fault when I
> described it as:
>
> * The original content is not saved in the cache (i.e., it is new or
>   has been updated): (re)translation is needed
>
> * The translated content is not saved in the cache (i.e., it has not
>   been previously translated or has expired): translation is needed
>
> * Neither the original nor the translated page are saved in the
>   cache: both need to be cached
>
> It refers not to the client side or CMS, but to the Real Time
> Translation System (RTTS), which actually generates the web cache. For
> example, the timestamp value is not set by the client, as in ready-at
> = <the time at which it would be ready to cache>, but by the RTTS when
> it does the caching.
> In that respect, the client indicates in the final HTML web page the
> values and whether a page or a part of a page needs to be cached or
> not, and whether source, target or both:
>
> * cached - values: yes, no
> * scope - values: source, target, both
> * timestamp - date and time
>
> In this scenario, the source pages (or parts of pages) are always
> translated in real time, and the translated pages (or parts) can be
> added to the cache to speed up future accesses. Some pages, however,
> not only do not need to be cached but must not be cached (for example
> pages in private areas, or transactional pages of an e-commerce
> process or a bank...).
>
> I cannot tell 100% whether /implementors who would implement the
> cacheStatus are specifically only interested in that functionality and
> would be unlikely to also implement a more general readiness data
> category/, but even if it is 50% I would keep it separate, in the
> same way as others in the "Internationalization" section. It is really
> multilingualWebCache metadata in the pages navigated by the end user.
>
> I hope this helps, and I will try to answer the rest before Thursday's
> meeting.
>
> Best,
>
> Pedro
>
> ------------------------------------------------------------------------
>
> *From:* David Lewis [mailto:dave.lewis@cs.tcd.ie]
> *Sent:* Tuesday, 8 May 2012 3:00
> *To:* "Pedro L.
> Díez Orzas"
> *CC:* public-multilingualweb-lt@w3.org
> *Subject:* Re: [ACTION-79]Consider consolidation of status-related data
> categories and process trigger
>
> Hi Pedro,
> Sorry, I didn't yet fill in the details of how I thought this might
> work for cache status, which would simply be:
>
> * The original content is not saved in the cache (i.e., it is new or
>   has been updated): (re)translation is needed
>
>   the source document or element would have attributes:
>
>   ready-to-process = cache-source
>   ready-at = <the time at which it would be ready to cache>
>
> * The translated content is not saved in the cache (i.e., it has not
>   been previously translated or has expired): translation is needed
>
>   the translation document or element would have attributes:
>
>   ready-to-process = cache-target
>   ready-at = <the time at which it would be ready to cache>
>
> * Neither the original nor the translated page are saved in the
>   cache: both need to be cached
>
>   you could either have both the above, or in cases where the source
>   and target are in the same file use:
>
>   ready-to-process = cache-source-and-target
>   ready-at = <the time at which it would be ready to cache>
>
> Note, there is a revised flag there that could also be used if useful.
>
> So, if I understand this right I think the readiness attributes would
> provide equivalent metadata. However, if you think this is a distinct
> use case, i.e. implementors who would implement the cacheStatus are
> specifically only interested in that functionality and would be
> unlikely to also implement a more general readiness data category,
> then we should definitely be considering a separate data category.
>
> cheers,
> Dave
>
>
> On 07/05/2012 18:32, Pedro L.
> Díez Orzas wrote:
>
> Hi Dave,
>
> I will look at it very carefully as soon as I can, since these are
> really major changes, but a priori I do not understand why cacheStatus
> should be consolidated and removed, since for me this is completely
> different metadata from processTrigger, processStatus or other
> "status" categories, and it answers completely different requirements.
>
> As I explained in the notes and definition of cacheStatus, this
> metadata is not for the localization chain or any other localisation
> process, but for real-time translation systems and their caching
> needs. In this respect I would put it back as it was (if you want, it
> can be called just "cache", without "status"), and sorry for any
> confusion I may have caused about it.
>
> Best,
>
> Pedro
>
> __________________________________
>
> *Pedro L. Díez Orzas*
> *Executive President/CEO*
> *Linguaserve Internacionalización de Servicios, S.A.*
> *Tel.: +34 91 761 64 60  Fax: +34 91 542 89 28*
> *E-mail: pedro.diez@linguaserve.com*
> *www.linguaserve.com <http://www.linguaserve.com/>*
>
> "According to the provisions set forth in articles 21 and 22 of Law
> 34/2002 of July 11 regarding Information Society and eCommerce
> Services, we will store and use your personal data with the sole
> purpose of marketing the products and services offered by LINGUASERVE
> INTERNACIONALIZACIÓN DE SERVICIOS, S.A. If you do not wish your
> personal data to be stored and handled, or you do not wish to receive
> further information regarding products and services offered by our
> company, please e-mail us at clients@linguaserve.com. Your request
> will be processed immediately."
>
> ____________________________________
>
> ------------------------------------------------------------------------
>
> *From:* David Lewis [mailto:dave.lewis@cs.tcd.ie]
> *Sent:* Monday, 7 May 2012 14:51
> *To:* public-multilingualweb-lt@w3.org
> *Subject:* Re: [ACTION-79]Consider consolidation of status-related data
> categories and process trigger
>
> Hi Pedro, Guys,
> Following the previous discussion on the proposal for consolidation
> around these data categories I have now made the following changes to
> the requirements document.
>
> Pedro, as discussed on Friday's call could you and any other
> interested parties examine these changes and flag any issues on
> this thread.
>
> 1) I have updated processTrigger and changed its name to 'readiness' as
> previously discussed
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#readiness
>
> 2) I have moved the need for a process model to a new requirement to
> reflect its relevance to several of the other data categories,
> including readiness, progress-indicator and provenance, and its need
> for further careful consideration:
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Process_Model
>
> 3) As part of this consolidation I have removed the data categories of:
> processTrigger, cacheStatus, legalStatus, processState,
> proofreadingState and revision state
>
> 4) I've updated the data category tables and the related interests
> accordingly
>
> 5) I've highlighted issues (in bold below) to consider about the
> following properties of the removed processTrigger that are no longer
> present (as recorded in the notes for the readiness data category)
>
> * /contentType/, values: MIME or custom values - This indicates the
>   format or the type of the content, in order to apply the right
>   filter or normalization rules, and the subsequent processes. For
>   example, to express HTML we could use "contentType: text/html":
>   *consider consolidation with formatType or languageResource*
>
>   >> I do not agree, unless formatType really refers to the computer
>   format and not, as now, to the format or service for which the
>   content is produced (e.g., subtitles, spoken text)
>
> * /sourceLang/ -- value: standard ISO 639 value - this value
>   indicates the source language for the current translation
>   requested.
>   It is different from the sourceLanguage (provenance) data category,
>   since that indicates the language of the original source text,
>   whereas sourceLang indicates the current source language to be used
>   for the translation, which can be different from the original
>   source - *this should be considered as an attribute for provenance*
> * /contentResultSource/ -- value: yes / no. Indicates whether the
>   localisation chain needs to give back the original - *is this
>   necessary as an attribute here or as a separate attribute*
> * /contentResultTarget/ -- value: monolingual, multilingual;
>   indicates whether the resulting translation, in the case of several
>   target languages, should be delivered as several monolingual
>   content files or as a single multilingual content file - *this would
>   require a more general purpose return file indicator*
> * /pivotLang/ - value: standard ISO value. Indicates the
>   intermediate language in case one is needed. Two examples: 1)
>   Going from a source language to two language variants (e.g. into
>   Brazilian and European Portuguese), it is more cost-effective to go
>   to one first (this first variant being a "pivot" language) and to
>   revise later into the second variant; 2) Going from one language to
>   another via an intermediate language (e.g. from Maltese into
>   English and from English into Irish, because there is no direct
>   Maltese-into-Irish translation available). - *consider
>   consolidation with source language, i.e. it is an attribute of
>   the source language*
>
>
> Regards,
> Dave
>
> On 04/05/2012 01:46, David Lewis wrote:
>
> Hi Moritz, guys,
> I added this progress-indicator data category to the requirements:
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#progress-indicator
>
> Regards,
> Dave
>
> On 28/04/2012 22:11, David Lewis wrote:
>
> Hi Moritz,
> I moved this onto this separate thread related to the relevant
> consolidation action.
>
> I think there are two different data categories here.
>
> What you describe is a progress indicator. This would be a common
> feature on a lot of CMS-based and crowdsourced translation tools. It
> would be measured as the number of segments (or perhaps words) of a
> document (or a group of documents representing a job) that have been
> processed as a proportion of the total that need to be processed.
>
> The other, which is what the current text for 'process state'
> (http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#process_state)
> specifies, is an indication of which point in a process sequence has
> currently been reached. As discussed, this could be covered by the
> processTrigger/readiness data category we are discussing.
>
> Moritz, does this distinction match with your view here? If so then we
> could introduce a new 'progress-indicator' data category requirement,
> and then continue discussing the consolidation of 'process state' with
> processTrigger/readiness.
>
> thanks,
> Dave
>
>
> On 27/04/2012 18:40, Moritz Hellwig wrote:
>
> Hello,
>
> I might make this a separate thread, but since we are already talking
> about processState here...
>
> There were quite a lot of requests from our editorial team to have
> something like
>
> processIndicator
>
> Values: integer, 0 to 100
>
> Zero would be "LSP process not begun"-ish, 100 would be "Completed".
>
> There are - from our point of view - considerable advantages:
>
> A) we can show a process progress indicator (in whichever visual
> representation) that does not require an understanding of what the
> actual process phase is on the MT side.
>
> B) the indicator can be agnostic to the number of processes / stages
> on the side of the LSP. If you run a hundred separate processes or
> feedback loops: fine by me.
>
> This would be beneficial for e.g. content creators who are unfamiliar
> with the language technology, its processes and so on. Also, it would
> allow us to build dashboards and generate reports e.g.
> to show and sort by progression & keep better track of multilingual
> projects.
>
> Any thoughts?
>
> Cheers,
>
> Moritz
>
> Sent from my iPhone
>
>
> On 27.04.2012, at 01:14, "David Lewis" <dave.lewis@cs.tcd.ie> wrote:
>
>> Pedro,
>> Yes, the redundancy of process state is one outcome of what I'm
>> proposing here.
>>
>> The key difference is that the proposal is that the data category
>> indicates the next process that should be performed, rather than
>> indicating the current process in operation. The motivation is that
>> the readiness to undergo a new process step is more useful to a
>> document in a CMS than knowing the current state that is operating
>> on it.
>>
>> Complementary to this, provenance indicates that a process is
>> completed, and associated with this records useful information needed
>> to monitor correct or efficient process operation, perhaps as needed
>> to monitor a service level agreement.
>>
>> Neither process trigger nor provenance, however, actually aims to
>> control process flow. This is a complex topic which therefore is
>> probably out of scope.
>>
>> What we do need, however, is a way of defining the values to use for
>> referencing processes, i.e. from both the 'request-process' and the
>> process reference in provenance. For this we may want both a default
>> set in the standard, and a way of unambiguously defining these for a
>> particular business case. The key thing in any one case of
>> interoperability is that the interoperating implementations exchange
>> and understand the _same set_ of process values.
>>
>> Let's keep the discussion going on the list,
>> Dave
>>
>> On 26/04/2012 15:29, Pedro L. Díez Orzas wrote:
>>
>> Hi David,
>>
>> I need to consider this more carefully.
>>
>> But what I see is that *process state* is perhaps redundant with
>> proofreading state or revision state, since these can be values of
>> process state: proofread, revised, reviewed, translated, localized...
>>
>> Best,
>>
>> Pedro
>>
>> ------------------------------------------------------------------------
>>
>> *From:* David Lewis [mailto:dave.lewis@cs.tcd.ie]
>> *Sent:* Thursday, 26 April 2012 1:52
>> *To:* public-multilingualweb-lt@w3.org
>> *Subject:* Re: [all] Discussion on proposed metadata categories:
>> approvalStatus
>>
>> Hi Moritz,
>> I think you make a very good general point here. It may be a bit too
>> open ended to specify data categories that hardwire the completion of
>> a specific step. We would run into the same issues we have with
>> defining the different process values as we discussed around process
>> trigger. Also, it's not clear to me that all status flag suggestions
>> for current steps, e.g. legal approval, really need to be separated
>> from other steps.
>>
>> I think therefore we could generalise this as part of the process
>> trigger data category as you suggest. This could allow us to
>> consolidate *approvalStatus*, *cacheStatus*, *legalStatus*,
>> *proofReading state* and *revision state* (and delegate the
>> definition of these steps to data values rather than individual data
>> categories). We can address *cacheStatus*, and at the same time
>> generalise it to other processes than just translation, by including
>> the time stamp and a revision flag.
>>
>> Also, I think the priority data category should be included here, as
>> translation could consist of many different processes in combination,
>> so its semantics are dependent on which one. At the same time we may
>> also be interested in defining priorities even for non-translation
>> activities, such as review.
>>
>> *requested-process* (which has the name of the next process requested)
>>
>> *process-ref* (which may allow us to point to an external set of
>> process definitions used for processRequested if the default value
>> set is not used)
>>
>> *ready-at* (defines the time the content is ready for the process; it
>> could be some time in the past, or some time in the future - this
>> supports part of the cacheStatus function)
>>
>> *revised* (yes/no - indicates whether this is a different version of
>> content that was previously marked as ready for the declared process)
>>
>> *priority* (I think for now we should keep this simple and just have
>> values high/low)
>>
>> *complete-by* (provides a target date-time for completing the process)
>>
>> Any thoughts on this suggestion? Pedro, Ryan, Moritz, Des, I think
>> this impacts on data categories you have an interest in.
>>
>> Also, DavidF, Pedro, Ryan, do you think this makes *process state*
>> redundant? As a status flag are we more interested in what process to
>> do next, rather than which one is finished? At the same time the
>> provenance data category could tell us which processes have already
>> finished operating on the content.
>>
>> cheers,
>> Dave
>>
>>
>> On 24/04/2012 11:11, Moritz Hellwig wrote:
>>
>> to identify publication process metadata which might also be relevant
>> for the LSP. I ran into a couple of questions though.
>>
>> I'll use approvalStatus as an example (from the requirements document):
>>
>>
>> approvalStatus
>>
>>
>> Information about the status of the content in a formal approval
>> workflow
>>
>>
>> Indicates whether the content has been approved for release
>>
>>
>> Possible values:
>>
>>
>>>> yes
>>
>>
>>>> no
>>
>> Approval can have many values which are rarely only "release yes|no"
>> and they can be client/application-specific. However, none of these
>> statuses seem to be relevant to the LSP, as they only precede or
>> succeed the LSP's processes.
>>
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
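
For illustration only, the MT-side cache discussed at the top of this
message could be sketched roughly as below. This is a minimal sketch
under stated assumptions, not anything agreed in the thread: the field
names cached/scope/timestamp follow Pedro's list, while the class names,
the segment key, and the rule that a stored source must match before a
cached translation is reused are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Optional


@dataclass
class CacheDirective:
    """Client-supplied hint, modelled on the cached/scope values in the thread."""
    cached: bool           # "yes" / "no"
    scope: str = "both"    # "source", "target" or "both"


@dataclass
class CacheEntry:
    source: Optional[str]  # kept only when scope is "source" or "both"
    target: Optional[str]  # kept only when scope is "target" or "both"
    timestamp: datetime    # set by the translation system when it caches


class TranslationCache:
    """Hypothetical cache on the real-time translation system (RTTS) side."""

    def __init__(self, translate: Callable[[str], str]) -> None:
        self._translate = translate          # stand-in for the MT engine
        self._entries: Dict[str, CacheEntry] = {}

    def get_translation(self, key: str, source: str,
                        directive: CacheDirective) -> str:
        entry = self._entries.get(key)
        if directive.cached and entry is not None and entry.target is not None:
            # Assumption: reuse the cached target unless a stored source
            # shows that the content has been updated since caching.
            if entry.source is None or entry.source == source:
                return entry.target

        target = self._translate(source)
        if directive.cached:
            self._entries[key] = CacheEntry(
                source=source if directive.scope in ("source", "both") else None,
                target=target if directive.scope in ("target", "both") else None,
                timestamp=datetime.now(timezone.utc),
            )
        else:
            # Content marked cached=no (e.g. private or transactional
            # pages) is never kept.
            self._entries.pop(key, None)
        return target
```

With cached=yes and scope=both, a repeated request for an unchanged
segment is answered from the cache and the MT engine is not called
again; with cached=no, every request goes to the engine and nothing is
stored.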
Received on Thursday, 10 May 2012 00:33:50 UTC