Re: ITS mapping in XLIFF - annotation in mrk vs original inline codes

Hi Yves,

On the face of it this seems sensible, though I'm not clear what we 
might lose, if anything, by missing the mtype mappings, e.g. for phrase 
and term.

To clarify, would we still need to use ITS annotation with mrk in cases 
where:
a) mtype="seg", since this is inserted by the XLIFF extractor
b) the annotation was added after the XLIFF file was generated from the 
source, e.g. by an XLIFF-conformant terminology tool, since <g> and 
<bpt>, as I understand it, relate specifically to inline annotations in 
the source file, and thus imply a reference to the skeleton?

If (b) is indeed the case, then the mapping gets more complex since one 
has to support both cases, g/bpt and mrk, but you would still avoid 
having them both together as in your example.
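
To make (b) concrete, here is a rough sketch of how the two cases might 
sit side by side. It reuses the sample from the message quoted below; the 
term annotation on "user name" is a made-up illustration of what a 
terminology tool could add after extraction, and namespace declarations 
are omitted as in the quoted samples.

```xml
<!-- Case 1: the annotation exists in the original document, so it rides
     on the inline code that already represents the <span>. -->
<source>Example of user name: <g id="1" its:allowedCharacters="[a-zA-Z0-9]">Aldus123</g></source>

<!-- Case 2 (hypothetical): a term annotation added after extraction.
     There is no original inline code and no skeleton reference for it,
     so mrk is the only element available to carry it. -->
<source>Example of <mrk mtype="term">user name</mrk>: <g id="1" its:allowedCharacters="[a-zA-Z0-9]">Aldus123</g></source>
```

In this sketch the mrk and the g never wrap the same span, which is the 
combination your proposal would avoid.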

Thoughts?

I'll put this on the agenda for Wednesday's MLW-LT call.

This also brings up the procedural issue of the status of these calls. 
If best practice is essentially IG business, do these calls become joint 
IG/MLW-LT WG calls somehow? I guess we could just point to the minutes 
on the MLW-LT wiki until the spec business is over, then switch to 
recording the minutes on the IG wiki.

cheers,
Dave



On 14/05/2013 17:25, Yves Savourel wrote:
> Hi Dave, David, all,
>
> (Posting on the IG list per http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013May/0105.html)
>
> The more I test XLIFF with ITS, the more I doubt that always using mrk to hold the ITS annotations that exist in the original document is the best solution.
>
> For example, for the following HTML code
>
> <p>Example of user name: <span its:allowedCharacters="[a-zA-Z0-9]">Aldus123</span></p>
>
> we recommend:
>
> <source>Example of user name: <g id="1"><mrk its:allowedCharacters="[a-zA-Z0-9]" mtype="x-its">Aldus123</mrk></g></source>
>
> and not:
>
> <source>Example of user name: <g id="1" its:allowedCharacters="[a-zA-Z0-9]">Aldus123</g></source>
>
>
> Here are some reasons why the second solution would be better:
>
> - It's shorter, simpler.
>
> - Having the annotation on the inline code does not add elements to the content (better for TM matches in many tools)
>
> - It's a lot easier for tools to update the inline codes if the ITS markup is on them instead of on a separate element.
>
> - The mrk element in 1.2 has no way to work with overlapping codes, so if for example you segment in the middle of a mrk span, it's very difficult to represent the resulting annotation(s).
>
> - Using the <g>/<bpt> element directly also avoids the accidental case where some text gets inserted between the <g>/<bpt> and the <mrk>.
>
>
> Sure, there are cases, like comments and terms, where using mrk may bring some advantages for compatibility. But they seem minimal, and I wonder if those cases could be seen as exceptions rather than trying to apply a more cumbersome rule to all cases.
>
> Thoughts?
> -yves
>
>
>

Received on Tuesday, 14 May 2013 20:57:13 UTC