Re: targetPointer Requirement update

(Are non-WG members allowed to comment here?  From the group description,
it seems so.  If not, apologies.)

I would be reluctant to implicitly encourage translation tools to treat
XLIFF as a raw XML file for translation, rather than as what it actually is.
However, Yves is right that this use case is also distressingly common in
proprietary/internal XML formats.  A typical workaround for the problem
involves preprocessing the content to copy the source text to the target,
then marking only the target text as translatable.  Adding something like
targetPointer to ITS would be a more elegant solution.
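
As a rough sketch of the preprocessing workaround described above: the
element names (entry, source, target) and the sample document are made up
for illustration and do not come from any real format.  The ITS_RULES
string uses ITS 1.0 global rules to mark only the target as translatable;
the selectors are likewise assumptions tied to this hypothetical format.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical bilingual format: each <entry> holds a <source> and an
# optional <target>. These names are illustrative only.
SAMPLE = """<doc>
  <entry id="1"><source>Hello <b>world</b></source></entry>
  <entry id="2"><source>Bye</source><target>Au revoir</target></entry>
</doc>"""

# ITS 1.0 global rules for the hypothetical format above:
# source is not translatable, target is.
ITS_RULES = """<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
  <its:translateRule selector="//source" translate="no"/>
  <its:translateRule selector="//target" translate="yes"/>
</its:rules>"""


def copy_source_to_target(root):
    """Seed a <target> from <source> wherever the target is missing or empty."""
    # Materialize the list first so the tree can be modified while looping.
    for entry in list(root.iter("entry")):
        source = entry.find("source")
        if source is None:
            continue
        target = entry.find("target")
        has_content = target is not None and (
            (target.text or "").strip() or len(target) > 0
        )
        if has_content:
            continue  # keep any existing translation untouched
        seeded = copy.deepcopy(source)  # copies text and inline markup
        seeded.tag = "target"
        seeded.tail = None
        if target is not None:
            entry.remove(target)  # replace the empty placeholder
        entry.append(seeded)
    return root


if __name__ == "__main__":
    tree_root = copy_source_to_target(ET.fromstring(SAMPLE))
    print(ET.tostring(tree_root, encoding="unicode"))
```

A translation tool that only understands ITS could then be handed the
preprocessed file plus the rules and would touch only the target elements.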

I can't speak to whether that falls within the scope of this group, however.

ct

On Mon, May 7, 2012 at 10:04 AM, David Lewis <dave.lewis@cs.tcd.ie> wrote:

> Hi Yves,
> If this is to deal with XLIFF and TMX files specifically, then I'm afraid I
> still don't understand the use case very well.
>
> Where there is already an element structure in the host document that
> indicates source and target content, what is the use case where the
> implementer wouldn't read the relevant XLIFF or TMX schema document to
> figure out how to parse this themselves?  This seems simpler than defining a
> new standard tag in ITS to essentially explain the schema of XLIFF and TMX.
>
> Is there some class of usage of XLIFF and TMX that makes the
> interpretation of their source-target binding difficult to parse directly
> in practice?
>
> Also, considering non-translation use cases such as semantic tagging or
> parallel text extraction, it doesn't seem likely that you'd do these
> without needing either to write to the file or to understand, say, the
> distinction between a translation and an alt-trans - in which case you'd
> need a working understanding of XLIFF/TMX anyway.
>
> cheers,
> Dave
>
>
>
> On 04/05/2012 16:00, Yves Savourel wrote:
>
>> Hi Dave, all,
>>
>>> However, this explanation does make me think the
>>> use case is out of scope for ITS 2.0, because:
>>> ...
>>> ii) There are already standard file formats, i.e. XLIFF,
>>> TMX, that support this.
>>> ... not using these is not best practice.
>>> So it doesn't seem in scope to develop a standard
>>> solution for something that isn't best practice or
>>> in general is needed solely to support issues in
>>> proprietary formats.
>>>
>> XLIFF and TMX are precisely why something like targetPointer is needed.
>>
>> ITS is not meant just for translation. As applications like Tadej's
>> Enrycher show, one may want to perform other linguistic-related tasks (e.g.
>> semantic tagging, spell-checking, data-mining, alignment, creation of an
>> MT-generated TM, etc).
>>
>> I think we should expect any XML tool that is ITS2-aware to be able to read
>> a document in format XYZ only knowing its ITS properties. It shouldn't need
>> to know about XLIFF or TMX to process XLIFF or TMX documents. Currently,
>> such an application cannot read an XLIFF document properly.
>>
>> Cheers,
>> -yves
>>
>>
>>
>>
>>
>>
>
>

Received on Wednesday, 9 May 2012 21:28:06 UTC