Re: ITS mapping in XLIFF - annotation in mrk vs original inline codes

Hi all again,

On 14.05.13 22:56, Dave Lewis wrote:
> Hi Yves,
>
> On the face of it this seems sensible, though I'm not clear what we 
> might lose, if anything, by missing the mtype mappings, e.g. for 
> phrase and term.
>
> To clarify, would we still need to use ITS annotation with mrk in 
> cases where:
> a) mtype="seg" since this is inserted by the XLIFF extractor
> b) the annotation was added after the XLIFF file was generated from 
> the source, e.g. by an XLIFF-conformant terminology tool, since <g> 
> and <bpt>, as I understand it, relate specifically to inline 
> annotations in the source file, and thus to a reference to the 
> skeleton?
>
> If (b) is indeed the case, then the mapping gets more complex, since 
> one has to support both cases, g/bpt and mrk, but you would still 
> avoid having them both together as in your example.
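
To make the "support both cases" point concrete, here is a minimal Python sketch of a reader that accepts its:allowedCharacters on either <g> or <mrk>. The function name allowed_characters_spans is hypothetical, and the XLIFF default namespace is left out for brevity, so this is only an illustration under those assumptions, not a proposed implementation.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
ALLOWED = f"{{{ITS_NS}}}allowedCharacters"


def allowed_characters_spans(source_xml):
    """Return (text, pattern) pairs for every <g> or <mrk> that carries
    its:allowedCharacters. Hypothetical helper, for illustration only."""
    root = ET.fromstring(source_xml)
    spans = []
    for elem in root.iter():
        # Accept the annotation on the original inline code or on a mrk.
        if elem.tag in ("g", "mrk") and ALLOWED in elem.attrib:
            spans.append(("".join(elem.itertext()), elem.attrib[ALLOWED]))
    return spans


# Annotation held by a mrk nested in the inline code.
WITH_MRK = (
    f'<source xmlns:its="{ITS_NS}">Example of user name: <g id="1">'
    '<mrk its:allowedCharacters="[a-zA-Z0-9]" mtype="x-its">Aldus123</mrk>'
    '</g></source>'
)
# Annotation placed directly on the inline code.
ON_G = (
    f'<source xmlns:its="{ITS_NS}">Example of user name: '
    '<g id="1" its:allowedCharacters="[a-zA-Z0-9]">Aldus123</g></source>'
)

print(allowed_characters_spans(WITH_MRK))
print(allowed_characters_spans(ON_G))
```

Both inputs yield the same annotated span, so a consumer written this way would not care which of the two representations the extractor chose.
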
>
> Thoughts?
>
> I'll put this on the agenda for Wednesday's MLW-LT call.

Looking at
http://www.w3.org/International/its/wiki/XLIFF_Mapping
it looks as if this is time-critical for the Tilde implementation of, 
e.g., "language information", so I encourage Mārcis to attend the call 
or, if that doesn't work, to state his opinion here.

Best,

Felix

>
> This also brings up the procedural issue of the status of these calls. 
> If best practice is essentially IG business, do these calls become 
> joint IG/MLW-LT WG calls somehow? I guess we could just point to the 
> minutes on the MLW-LT wiki until the spec business is over, then 
> switch to recording the minutes on the IG wiki.
>
> cheers,
> Dave
>
>
>
> On 14/05/2013 17:25, Yves Savourel wrote:
>> Hi Dave, David, all,
>>
>> (Posting on the IG list per 
>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013May/0105.html)
>>
>> The more I test XLIFF with ITS, the more I doubt that always using 
>> mrk to hold the ITS annotations that exist in the original document 
>> is the best solution.
>>
>> For example, for the following HTML code
>>
>> <p>Example of user name: <span 
>> its:allowedCharacters="[a-zA-Z0-9]">Aldus123</span></p>
>>
>> we recommend:
>>
>> <source>Example of user name: <g id="1"><mrk 
>> its:allowedCharacters="[a-zA-Z0-9]" 
>> mtype="x-its">Aldus123</mrk></g></source>
>>
>> and not:
>>
>> <source>Example of user name: <g id="1" 
>> its:allowedCharacters="[a-zA-Z0-9]">Aldus123</g></source>
>>
>>
>> Here are some reasons why the second solution would be better:
>>
>> - It's shorter, simpler.
>>
>> - Having the annotation on the inline code does not add elements to 
>> the content (better for TM matches in many tools).
>>
>> - It's a lot easier for tools to update the inline codes if the ITS 
>> markup is on them instead of on a separate element.
>>
>> - The mrk element in 1.2 has no way to work with overlapping codes, 
>> so if for example you segment in the middle of a mrk span, it's very 
>> difficult to represent the resulting annotation(s).
>>
>> - Using the <g>/<bpt> element directly also avoids the potential 
>> accidental case where some text gets inserted between the <g>/<bpt> 
>> and the <mrk>.
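
Two of the reasons above (extra elements in the content, and text slipping in between the code and the mrk) can be illustrated with a small Python sketch. The helper names inline_tags and mrk_matches_g are hypothetical, and the XLIFF default namespace is again omitted for brevity; this is a rough illustration, not a validation rule from either spec.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"


def inline_tags(source_xml):
    """List the tags of all inline elements found inside <source>."""
    root = ET.fromstring(source_xml)
    return [e.tag for e in root.iter() if e is not root]


def mrk_matches_g(g):
    """True if <g> holds exactly one <mrk> with no text before or after it."""
    children = list(g)
    return (
        len(children) == 1
        and children[0].tag == "mrk"
        and not (g.text or "").strip()
        and not (children[0].tail or "").strip()
    )


MRK = '<mrk its:allowedCharacters="[a-zA-Z0-9]" mtype="x-its">Aldus123</mrk>'
NESTED = (
    f'<source xmlns:its="{ITS_NS}">Example of user name: '
    f'<g id="1">{MRK}</g></source>'
)
# Same as NESTED, but some text was inserted between <g> and <mrk>.
DRIFTED = (
    f'<source xmlns:its="{ITS_NS}">Example of user name: '
    f'<g id="1">user {MRK}</g></source>'
)
ON_G = (
    f'<source xmlns:its="{ITS_NS}">Example of user name: '
    '<g id="1" its:allowedCharacters="[a-zA-Z0-9]">Aldus123</g></source>'
)

print(inline_tags(NESTED), inline_tags(ON_G))
print(mrk_matches_g(ET.fromstring(NESTED).find("g")))
print(mrk_matches_g(ET.fromstring(DRIFTED).find("g")))
```

With the annotation on <g>, the segment carries one inline element instead of two, and there is no separate mrk whose extent can drift away from the code it was meant to cover.
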
>>
>>
>> Sure, there are cases, such as comments and terms, where using mrk 
>> may bring some advantages for compatibility. But they seem minimal, 
>> and I wonder if those cases could be seen as exceptions rather than 
>> trying to apply a more cumbersome rule to all cases.
>>
>> Thoughts?
>> -yves
>>
>>
>>
>
>

Received on Wednesday, 15 May 2013 05:15:02 UTC