
Issue-55: RE: ACTION-447: Make a batch transformation of the test suite to xliff

From: Yves Savourel <ysavourel@enlaso.com>
Date: Thu, 21 Feb 2013 05:50:02 -0700
To: "'Dave Lewis'" <dave.lewis@cs.tcd.ie>, "'Dr. David Filip'" <David.Filip@ul.ie>
CC: 'Mārcis Pinnis' <marcis.pinnis@tilde.lv>, "'Multilingual Web LT Public List Public List'" <public-multilingualweb-lt@w3.org>, "'Felix Sasaki'" <fsasaki@w3.org>
Message-ID: <003a01ce1031$f9551df0$ebff59d0$@com>

Hi Dave, all,

> Agreed - thanks for those comments and feedback from Yves and Marcis.
> We'll fix use of terminology and mrk in those xliff roundtrip files asap.

So what is the consensus for Terminology?

As far as I can tell, there seems to be some kind of agreement on:

- Always map term='yes' to <mrk mtype='term'>
- We can have other data categories in an <mrk mtype='term'>
- If a term is declared on a 'paragraph' we add an inline code in the XLIFF
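
To make the first point concrete, here is a minimal Python sketch of the term='yes' to <mrk mtype='term'> mapping. It is only an illustration, not what Okapi or any other tool does: the function name to_mrk and the sample <span> input are hypothetical, and it handles just the local its:term attribute on a single inline element, ignoring global rules and the other data categories.

```python
import xml.etree.ElementTree as ET

# ITS namespace URI; everything else below is an illustrative assumption.
ITS_NS = "http://www.w3.org/2005/11/its"

def to_mrk(elem):
    """Return an XLIFF-style <mrk mtype='term'> for an inline element
    that carries its:term='yes', or None if it is not marked as a term.

    Hypothetical helper: it copies only the element text and does not
    carry over termInfoRef or any other ITS metadata.
    """
    if elem.get(f"{{{ITS_NS}}}term") != "yes":
        return None
    mrk = ET.Element("mrk", {"mtype": "term"})
    mrk.text = elem.text
    return mrk

# Sample input (invented for this sketch).
src = ET.fromstring(
    f'<span xmlns:its="{ITS_NS}" its:term="yes">Arizona</span>'
)
result = ET.tostring(to_mrk(src), encoding="unicode")
print(result)  # <mrk mtype="term">Arizona</mrk>
```

The sketch deliberately leaves out the termInfo* values, since where to put them is one of the open questions.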

But we still have no solutions for:

- If we just put the result of termInfoPointer, termInfoRef and termInfoRefPointer in the comment attribute and let the user deal with URI vs. text, how can that be processed by an ITS-only tool? You cannot have a global rule like this:
<its:termRule selector="//mrk[@mtype='term']" term='yes'
 termInfoPointer="@comment" termInfoRefPointer="@comment"/>
The comment attribute can hold either a URI or text, but not both.

- How do we feed back <mrk mtype='term' comment='text not URI'> into the original format when this is an added term?

- Do we have <mrk mtype='x-its-term-no'>? (and can we use a more compact notation?)

- Do we extract the info if we have its:term='no' termInfoRef='something' in the original, or just mark things as 'x-its-term-no'?

- etc.
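
To illustrate the URI vs. text problem from the first open question: a tool reading the comment attribute could only guess which kind of value it holds. The Python sketch below shows one such guess. The function name looks_like_uri and the sample values are hypothetical, and the heuristic is an assumption of mine, not anything defined by ITS or XLIFF. It also cannot recognise relative URIs, which is exactly why a single attribute is hard for an ITS-only tool to process.

```python
from urllib.parse import urlparse

def looks_like_uri(comment):
    """Heuristic guess only (an assumption, not part of ITS or XLIFF).

    Treats the value as a URI if it is a same-document fragment
    reference such as '#ge1', or an absolute URI with a scheme and
    a host. Relative URIs are indistinguishable from plain text here.
    """
    value = comment.strip()
    if value.startswith("#") and " " not in value:
        return True
    parsed = urlparse(value)
    return bool(parsed.scheme and parsed.netloc)

# Sample values (invented for this sketch).
for sample in ("#ge1", "http://example.com/term/arizona", "A state in the US"):
    kind = "termInfoRef (URI)" if looks_like_uri(sample) else "termInfo (text)"
    print(f"{sample!r}: {kind}")
```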

thanks,
-yves


-----Original Message-----
From: Dave Lewis [mailto:dave.lewis@cs.tcd.ie] 
Sent: Thursday, February 21, 2013 3:48 AM
To: Dr. David Filip
Cc: Mārcis Pinnis; Yves Savourel; Multilingual Web LT Public List Public List; Felix Sasaki (fsasaki@w3.org)
Subject: Re: ACTION-447: Make a batch transformation of the test suite to xliff

Agreed - thanks for those comments and feedback from Yves and Marcis.
We'll fix use of terminology and mrk in those xliff roundtrip files asap.

Regards,
Dave



On 19/02/2013 13:27, Dr. David Filip wrote:
> Just one point for now..
> It was agreed and is documented here
> http://www.w3.org/International/multilingualweb/lt/wiki/XLIFF_Mapping
> that the ITS term will be encoded in XLIFF using the native markup
> <mrk mtype='term'></mrk>, not its:term
>
> Rgds
> dF
>
> Dr. David Filip
> =======================
> LRC | CNGL | LT-Web | CSIS
> University of Limerick, Ireland
> telephone: +353-6120-2781
> cellphone: +353-86-0222-158
> facsimile: +353-6120-2734
> mailto: david.filip@ul.ie
>
>
> On Tue, Feb 19, 2013 at 12:24 PM, Mārcis Pinnis <marcis.pinnis@tilde.lv> wrote:
>> Hi Yves, all,
>>
>> I had a look at the examples. I believe that either I am missing something (not understanding where the ITS 2.0 data is in the XLIFF documents) or there is some backwards compatibility of content lost when converting from the HTML/XML examples to XLIFF.
>>
>> 1. I had a look at the Terminology part and I could not find ITS 2.0 related terminology annotation in the XLIFF documents. I have attached my findings to this e-mail.
>>
>> 2. With the Locale Filter I see that instead of having ITS 2.0 mark-up, the whole fragment has been removed and replaced with a placeholder (is that because it is not possible to add Locale Filter mark-up in XLIFF at all?). This does not preserve the content, but filters out fragments based on ITS 2.0 consumption/production Use Case scenarios (which is I guess an internal process and not for data exchange purposes). And ... it actually does not show an XLIFF document with the Locale Filter data category metadata in it (that was what we wanted to see, but the examples, I believe do not show that). Is this because XLIFF would not be able to handle ITS 2.0 annotation or because of some other reasons (I am a bit confused here ... so I would like to clarify)?
>>
>> Some other findings (more in the attached file):
>> 3. The Language Information, as I understand it, will be fully passed on to xml:lang (that is clear).
>> 4. The Domain metadata seems to be transformed from ITS into an OKAPI internal structure.
>> 5. The Elements Within Text information as I understand it, is just structural, so no mark-up is necessary (that is clear).
>>
>> Maybe I have just misunderstood what the XLIFF examples would contain? I had the understanding that the transformation to XLIFF would preserve ITS 2.0 metadata. Did I understand it wrong?
>>
>> Then ... I had a look also at the files in the "roundtrip-example" directory. As I understand from Yves' e-mail, these are not valid XLIFF files, right?!
>>
>> I still had a look at the examples that contained terminology annotation. I believe Terminology is used incorrectly:
>> <mrk its:terminology="yes" its:termInfoRef="#ge1">Arizona</mrk>
>> The attribute is its:term="yes" rather than terminology... (or am I 
>> again missing out some information?)
>>
>> The files seemed not to have Domain and LocaleFilter metadata in them - it would be great to see these categories in action as well.
>>
>> Best regards,
>> Mārcis ;o)
>>
>> -----Original Message-----
>> From: Yves Savourel [mailto:ysavourel@enlaso.com]
>> Sent: Monday, February 18, 2013 4:52 PM
>> To: 'Multilingual Web LT Public List Public List'
>> Subject: ACTION-447: Make a batch transformation of the test suite to 
>> xliff
>>
>> Hi all,
>>
>> I've done this action item.
>>
>> A batch file as well as the XLIFF output have been added to GitHub:
>> https://github.com/finnle/ITS-2.0-Testsuite/commit/294018ba576799dcbee7b9566da83837dd69f4ae
>>
>> Notes:
>>
>> -- The XLIFF outputs are often identical because the test files are just different ways to mark up the same content.
>>
>> -- The XLIFF outputs often make little sense because the input exercises only one data category. For example, a storage size limitation set on a span ("inline") element will not show up on an inline element in XLIFF because there is no information in the input file that says the span element is 'within text' (since the test case is about the storage size). IMHO the outputs are rather useless.
>>
>> -- Most data categories have output, but only when the extraction uses them. For example, there is no output for directionality because, while the Okapi ITS engine processes and provides that data category, the filter does nothing with it.
>>
>> Cheers,
>> -yves
>>
>>
>>
Received on Thursday, 21 February 2013 12:50:42 UTC
