RE: ACTION-447: Make a batch transformation of the test suite to xliff from Yves Savourel on 2013-02-21 (public-multilingualweb-lt@w3.org from February 2013)

From: Yves Savourel <ysavourel@enlaso.com>
Date: Thu, 21 Feb 2013 05:32:58 -0700
To: "'Dave Lewis'" <dave.lewis@cs.tcd.ie>
CC: 'Mārcis Pinnis' <marcis.pinnis@Tilde.lv>, "'Multilingual Web LT Public List Public List'" <public-multilingualweb-lt@w3.org>
Message-ID: <003901ce102f$9fb7c010$df274030$@com>

Hi Dave,

> There's some indication in your discussion that localeFilter and translate 
> work in a similar way in the source to XLIFF mapping. My assumption is that 
> locale filter is specifically there to prevent extraction of annotated text 
> when generating XLIFF for a specific target language - similar to what 
> you describe.

I only mentioned that our filter currently produces the same result for translate='no' and for a localeFilter value that is not in scope: inline content gets masked as code and structural content is not extracted.


> I'm not convinced that translate works in the same way since we may want 
> to include the content in the XLIFF as 'protected' source content to 
> provide context for the human translators (it can also provide content 
> to text analytics components called from the XLIFF process).
> We therefore need to nail down in best practice how exactly translate='no' 
> content should be processed in the mapping and extracting one, e.g. 
> if the _whole_ document is annotated in this way should it be extracted 
> at all, since it's not really acting as 'context' in this way.

Actually I was thinking the reverse: it would make more sense to extract the content marked up with localeFilter and mark it with localeFilter in the XLIFF, because this would allow you to extract once for all target locales and rely on the localeFilter handling to filter out the parts not relevant for a given locale after extraction.
... but at the same time, this would mean ITS support is required in any XLIFF tool that looks at such a document, and I don't think that is really a possible use case.
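
To make the "extract once, filter later" idea concrete, here is a rough Python sketch. It is only an illustration, not how our filter or any XLIFF tool works: the Unit class and the function names are hypothetical, and the language-range matching is a simplified prefix match rather than the full extended filtering that ITS refers to.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """Hypothetical extracted unit that keeps its localeFilter annotation."""
    text: str
    locale_list: str = "*"        # comma-separated language ranges; "*" = all
    filter_type: str = "include"  # "include" or "exclude"


def matches(language_range: str, locale: str) -> bool:
    # Simplified match (assumption): wildcard, exact tag, or prefix on a
    # subtag boundary. Not a full RFC 4647 extended filtering implementation.
    r, l = language_range.strip().lower(), locale.lower()
    return r == "*" or l == r or l.startswith(r + "-")


def applies_to(unit: Unit, locale: str) -> bool:
    ranges = [r for r in unit.locale_list.split(",") if r.strip()]
    hit = any(matches(r, locale) for r in ranges)
    # "include": keep only when a range matches; "exclude": the opposite.
    return hit if unit.filter_type == "include" else not hit


def units_for(units: list[Unit], locale: str) -> list[Unit]:
    """Filter the single extracted document down to one target locale."""
    return [u for u in units if applies_to(u, locale)]


# One extraction, reused for every target locale.
extracted = [
    Unit("Common text"),
    Unit("Canada only", "en-CA, fr-CA"),
    Unit("Not for Japan", "ja", "exclude"),
]
french_canada = [u.text for u in units_for(extracted, "fr-CA")]
japan = [u.text for u in units_for(extracted, "ja-JP")]
```

The catch is the one noted above: every tool that opens the extracted document would need to run something like units_for() itself.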

For translate='no': sure, having things extracted could be useful in some cases; in others it may just make the extracted document a lot bigger for no good reason. In my opinion that is almost something users should decide. I don't think we will be able to have a single mapping for many of the data categories.

cheers,
-yves

Received on Thursday, 21 February 2013 12:33:45 UTC