Re: ACTION-54: Try to come up with example of xliff+its test format / output

From: Felix Sasaki <fsasaki@w3.org>
Date: Thu, 6 Nov 2014 22:40:21 +0100
Cc: Yves Savourel <ysavourel@enlaso.com>, public-i18n-its-ig <public-i18n-its-ig@w3.org>
Message-Id: <2FCFD316-06C1-4BEB-8072-FD7E1922C06F@w3.org>
To: "Estreen, Fredrik" <Fredrik.Estreen@lionbridge.com>
Hi Fredrik and Yves, all,

I would calculate the offset based on the element's textual content: zero is the start of the element content, the tags themselves are not counted, and whitespace is always stripped. Since roundtripping is not needed, the whitespace stripping does not hurt.
See the NIF conversion at
http://www.w3.org/TR/its20/#conversion-to-nif
including the note about whitespace stripping.
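
To make the counting rule concrete, here is a minimal Python sketch applied to Yves' example. It is only an illustration, not taken from the ITS 2.0 specification: the function name text_runs_with_offsets is made up, and it assumes that "stripping" means collapsing whitespace inside each text node and dropping it at the node's edges, which is just one possible reading of the NIF note.

```python
import xml.etree.ElementTree as ET


def text_runs_with_offsets(xml_fragment):
    """Return (text, start, end) for each text run, counting text content only."""
    root = ET.fromstring(xml_fragment)
    runs = []
    position = 0  # offset 0 is the start of the element content
    # itertext() yields the text nodes in document order; tags are never counted.
    for chunk in root.itertext():
        # Assumption: collapse whitespace runs and trim each text node.
        text = " ".join(chunk.split())
        if not text:
            continue  # whitespace-only nodes contribute nothing
        runs.append((text, position, position + len(text)))
        position += len(text)
    return runs


source = "<source>Text<sm id='1' translate='no'/>data</source>"
for text, start, end in text_runs_with_offsets(source):
    print(f'"{text}" = {start},{end}')
```

With this reading the inline tag has length zero, so "Text" = 0,4 and "data" = 4,8, and the result does not depend on how the tag happens to be serialized.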

Best,

Felix 

On 06.11.2014 at 22:24, Estreen, Fredrik <Fredrik.Estreen@lionbridge.com> wrote:

> Hi Yves, Felix,
> 
> How would this work in cases where xml:space != "preserve"? A generic XML processor might normalize the space and thus invalidate the offsets if insignificant whitespace is not preserved.
> 
> Regards,
> Fredrik Estreen
> 
>> -----Original Message-----
>> From: Yves Savourel [mailto:ysavourel@enlaso.com]
>> Sent: 6 November 2014 15:34
>> To: 'Felix Sasaki'
>> Cc: 'public-i18n-its-ig'
>> Subject: RE: ACTION-54: Try to come up with example of xliff+its test format / output
>> 
>> Hi Felix,
>> 
>> Can you specify a bit more how the offset would be computed?
>> It seems the zero is the start of the element (e.g. <source>) content.
>> But how would we count the inline element?
>> 
>> <source>Text<sm id='1' translate='no'/>data</source>
>> 
>> "Text" = 0,4
>> "data" = 31,35
>> 
>> The problem is that we don't always know how long the inline tag is in the
>> document (you can have extra spaces between attributes, some attributes
>> with default values may be omitted, etc.)
>> 
>> Or should we count each inline tag as 1 character?
>> 
>> Which would give:
>> 
>> "Text" = 0,4
>> "data" = 5,9
>> 
>> 
>> Thanks,
>> -yves
>> 
> 
Received on Thursday, 6 November 2014 21:40:55 UTC