Re: [all] readiness and translation process parameters

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Wed, 04 Jul 2012 03:02:39 +0100
Message-ID: <4FF3A43F.2040507@cs.tcd.ie>
To: public-multilingualweb-lt@w3.org
Hi,
To follow up on this, here are specific replies to outstanding comments 
included in the readiness data category requirements:

  * /contentType/, values: MIME or custom values - This indicates the
    format or type of the content, so that the right filter or
    normalization rules, and the subsequent processes, can be applied.
    For example, to express HTML we could use:
    "contentType: text/html"

 > the document MIME type given in the Dublin Core format term
(http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=elements#format)
should address this - no need for a new data category

  * /sourceLang/ -- value: standard ISO 639 value - this value indicates
    the source language for the current translation requested. It is
    different from the sourceLanguage (provenance) Data Category, since
    that indicates the language the original source text was written
    in, whereas sourceLang indicates the current source language to be
    used for the translation, which can differ from the original
    source - *this should be considered as an attribute for provenance*

 > I can't see a case where, by definition, this won't be the same as 
indicated by xml:lang or its:langRule, so it seems superfluous

  * /contentResultSource/ -- value: yes / no. Indicates whether the
    localisation chain needs to give back the original source content

 > a candidate transParamType rather than a readiness attribute

  * /contentResultTarget/ -- value: monolingual, multilingual; indicates
    if the resulting translation, in the cases of several target
    languages, should be delivered in several monolingual content files
    or in a single multilingual content file

 > a candidate transParamType rather than a readiness attribute

  * /pivotLang/ - value: standard ISO value. Indicates the intermediate
    language, in case one is needed. Two examples: 1) Going from a
    source language to two language variants (e.g. into Brazilian and
    European Portuguese), it is more cost-effective to translate into
    one variant first (this first variant being the "pivot" language)
    and to revise later into the second variant; 2) Going from one
    language to another via an intermediate language (e.g. from Maltese
    into English and from English into Irish, because no direct
    Maltese-to-Irish translation is available).

 > a candidate transParamType rather than a readiness attribute
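To make the three "candidate transParamType" remarks above a bit more concrete, here is a minimal Python sketch of how a consumer might read such parameters from the its:transParam element proposed in my earlier mail (quoted below). Note that its:transParam is only a proposal in this thread, not part of any published ITS specification, and the attribute values, the RULES sample and the read_trans_params helper are illustrative assumptions of mine.

```python
import xml.etree.ElementTree as ET

# ITS namespace as used in the quoted example.
ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical rules fragment: its:transParam is only proposed in this
# thread, and the parameter names and values are illustrative.
RULES = """<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
  <its:transParam selector="/html/body"
      transParamType="contentResultTarget" transParamValue="multilingual"/>
  <its:transParam selector="/html/body"
      transParamType="pivotLang" transParamValue="en"/>
</its:rules>"""


def read_trans_params(xml_text):
    """Return (selector, type, value) tuples for each its:transParam.

    Raises ValueError if the type or value attribute is missing, since
    the proposal describes both as required.
    """
    root = ET.fromstring(xml_text)
    params = []
    for el in root.iter("{%s}transParam" % ITS_NS):
        ptype = el.get("transParamType")
        value = el.get("transParamValue")
        if ptype is None or value is None:
            raise ValueError("transParam needs transParamType and transParamValue")
        params.append((el.get("selector"), ptype, value))
    return params


if __name__ == "__main__":
    for selector, ptype, value in read_trans_params(RULES):
        print(f"{selector}: {ptype} = {value}")
```

The point of keeping the type and value as free-form strings is that the definitions are meant to be non-normative best practice, so a consumer would simply ignore parameter types it does not recognise.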

cheers,
Dave

On 04/07/2012 02:51, Dave Lewis wrote:
> Hi all,
> Prior to attempting a call for consensus on the readiness data 
> category I wanted to address some of the outstanding issues listed at:
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#readiness
>
> These were touched on by Pedro in Dublin and merited further discussion.
>
> I will address each separately in another post, but my primary 
> observation is that many relate to instructions specifically for 
> translation to be carried out on the whole document. This is 
> consistent with the overlap with ISO/TS 11669 observed in comments. 
> Taking the ISO/TS 11669 related summary at: http://www.ttt.org/specs/
> these are essentially parameters to a translation job specification, 
> but it is also a work in progress and perhaps something that will be 
> difficult to align with normatively in our timeframe (Arle?).
>
> Therefore, I propose we do not include such translation job parameters 
> in readiness because:
>
> 1) readiness is used to signal the point in time when some content is 
> ready to be processed by a named process. It is agnostic to what that 
> process is, it could be e.g. named-entity-recognition. So including 
> _translation_ process specific parameters is a scope mismatch and 
> therefore overloads the intended semantics of the readiness data category.
>
> 2)  Translation process parameters may be more appropriate as separate 
> data categories as they are more generally useful even when readiness 
> is _not_ used.
>
> This implies a new data category for translation parameters. I don't 
> see many use cases for applying these at the local level (but please 
> shout if you see them). Therefore we could propose a global data 
> category transParam element that contains:
>
> A) a selector indicating part of the document to which the translation 
> parameters apply (often but not always "/html/body")
>
> B) a required transParamType
>
> C) a required transParamValue
>
> Where transParamType and transParamValue are given non-normative best 
> practice definitions, ideally then aligned with ISO/TS 11669 as it 
> matures.
> e.g.
> <its:rules
>    xmlns:its="http://www.w3.org/2005/11/its"   version="2.0">
>   <its:transParam selector="/html/body" transParamType="/contentResultSource/" transParamValue="yes"/>
>   <its:transParam selector="/html/body/disclaimer" transParamType="/pivotLang/" transParamValue="en"/>
> </its:rules>
>
> The class of potential translation process parameters data categories 
> could include:
>
> a) sourceLang, contentResultSource, contentResultTarget, pivotLang 
> from the readiness comments
>
> b) the remaining project-related data categories:
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Project_Information
>
> c) the post-editing parameters suggested in:
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0050.html
>
> cheers,
> Dave
>
Received on Wednesday, 4 July 2012 02:03:07 UTC
