- From: Dave Lewis <dave.lewis@cs.tcd.ie>
- Date: Wed, 04 Jul 2012 03:02:39 +0100
- To: public-multilingualweb-lt@w3.org
- Message-ID: <4FF3A43F.2040507@cs.tcd.ie>
Hi,

To follow up on this, here are specific replies to the outstanding
comments included in the readiness data category requirements:

* /contentType/ -- values: MIME or custom values. This indicates the
format or type of the content, in order to apply the right filter or
normalization rules and the subsequent processes. For example, to
express HTML we could use: "contentType: text/html"

> the document MIME type of the Dublin Core format term
> (http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=elements#format)
> should address this - no need for a new data category

* /sourceLang/ -- value: standard ISO 639 value. This value indicates
the source language for the current translation requested. It is
different from the sourceLanguage (provenance) data category: that one
indicates the language of the original source text, whereas sourceLang
indicates the current source language to be used for the translation,
which can differ from the original source - *this should be considered
as an attribute for provenance*

> I can't see a case where, by definition, this won't be the same as
> indicated by xml:lang or its:langRule, so it seems superfluous

* /contentResultSource/ -- value: yes / no. Indicates the format if the
localisation chain needs to give back the original

> a candidate transParamType rather than a readiness attribute

* /contentResultTarget/ -- value: monolingual, multilingual. Indicates
whether the resulting translation, in the case of several target
languages, should be delivered in several monolingual content files or
in a single multilingual content file

> a candidate transParamType rather than a readiness attribute

* /pivotLang/ -- value: standard ISO value. Indicates the intermediate
language in case one is needed. Two examples: 1) Going from a source
language to two language variants (e.g. into Brazilian and European
Portuguese), it is more cost-effective to translate into one first
(this first variant being the "pivot" language) and to revise it later
into the second variant; 2) Going from one language to another via an
intermediate language (e.g. from Maltese into English and from English
into Irish, because no direct Maltese-to-Irish translation is
available).

> a candidate transParamType rather than a readiness attribute

cheers,
Dave

On 04/07/2012 02:51, Dave Lewis wrote:
> Hi all,
> Prior to attempting a call for consensus on the readiness data
> category I wanted to address some of the outstanding issues listed at:
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#readiness
>
> These were touched on by Pedro in Dublin and merited further discussion.
>
> I will address each separately in another post, but my primary
> observation is that many relate to instructions specifically for
> translation to be carried out on the whole document. This is
> consistent with the overlap with ISO/TS 11669 observed in comments.
> Taking the ISO/TS 11669 related summary at: http://www.ttt.org/specs/
> these are essentially parameters to a translation job specification,
> but also work in progress and perhaps something that will be difficult
> to align with normatively in our timeframe (Arle?).
>
> Therefore, I propose we do not include such translation job parameters
> in readiness because:
>
> 1) readiness is used to signal the point in time when some content is
> ready to be processed by a named process. It is agnostic to what that
> process is; it could be e.g. named-entity recognition. So including
> _translation_ process specific parameters is a scope mismatch and
> therefore overloads the intended semantics of the readiness data category.
>
> 2) Translation process parameters may be more appropriate as separate
> data categories as they are more generally useful even when readiness
> is _not_ used.
>
> This implies a new data category for translation parameters. I don't
> see many use cases for applying these at the local level (but please
> shout if you see them). Therefore we could propose a global data
> category transParam element that contains:
>
> A) a selector indicating the part of the document to which the
> translation parameters apply (often but not always "/html/body")
>
> B) a required transParamType
>
> C) a required transParamValue
>
> Where transParamType and transParamValue are given non-normative best
> practice definitions, ideally then aligned with ISO/TS 11669 as it
> matures.
> e.g.
> <its:rules
>   xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
>   <its:transParam selector="/html/body" transParamType="/contentResultSource/" transParamValue="yes"/>
>   <its:transParam selector="/html/body/disclaimer" transParamType="/pivotLang/" transParamValue="en"/>
> </its:rules>
>
> The class of potential translation process parameter data categories
> could include:
>
> a) sourceLang, contentResultSource, contentResultTarget, pivotLanguage
> from the readiness comments
>
> b) the remaining project related data categories:
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Project_Information
>
> c) the post-editing parameters suggested in:
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0050.html
>
> cheers,
> Dave
>
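
As a rough illustration of how a consumer might read the transParam
rules proposed in the quoted message, here is a minimal Python sketch.
The element and attribute names (its:transParam, selector,
transParamType, transParamValue) and the namespace URI are taken from
the example in the message; the function name read_trans_params, the
choice to treat selector as required, and the decision to keep the type
values exactly as written (slashes included) are assumptions made for
this sketch only. It does not evaluate the selector against a document.

```python
import xml.etree.ElementTree as ET

# Namespace URI as used in the example rules of the proposal.
ITS_NS = "http://www.w3.org/2005/11/its"

# Attributes the proposal lists for a transParam rule. Treating
# "selector" as required is an assumption of this sketch.
REQUIRED_ATTRS = ("selector", "transParamType", "transParamValue")

# The example rules from the proposal, with the selector spelling
# "disclaimer" assumed for the second rule.
RULES = """\
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
  <its:transParam selector="/html/body"
                  transParamType="/contentResultSource/"
                  transParamValue="yes"/>
  <its:transParam selector="/html/body/disclaimer"
                  transParamType="/pivotLang/"
                  transParamValue="en"/>
</its:rules>
"""


def read_trans_params(rules_xml):
    """Return a list of (selector, type, value) tuples, one per
    its:transParam child of the root element (hypothetical helper)."""
    root = ET.fromstring(rules_xml)
    params = []
    for rule in root.findall(f"{{{ITS_NS}}}transParam"):
        missing = [name for name in REQUIRED_ATTRS if name not in rule.attrib]
        if missing:
            raise ValueError(
                "transParam rule is missing: " + ", ".join(missing)
            )
        params.append(tuple(rule.attrib[name] for name in REQUIRED_ATTRS))
    return params


if __name__ == "__main__":
    for selector, ptype, value in read_trans_params(RULES):
        print(selector, ptype, value)
```

Returning plain tuples keeps the sketch independent of any particular
vocabulary for transParamType, which matches the proposal's intent that
the type and value definitions remain non-normative best practice.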
Received on Wednesday, 4 July 2012 02:03:07 UTC