ITS mapping in XLIFF - annotation in mrk vs original inline codes

Hi Dave, David, all,

(Posting on the IG list per http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013May/0105.html)

The more I test XLIFF with ITS, the more I doubt that always using mrk to hold the ITS annotations that exist in the original document is the best solution.

For example, for the following HTML code:

<p>Example of user name: <span its:allowedCharacters="[a-zA-Z0-9]">Aldus123</span></p>

we recommend:

<source>Example of user name: <g id="1"><mrk its:allowedCharacters="[a-zA-Z0-9]" mtype="x-its">Aldus123</mrk></g></source>

and not:

<source>Example of user name: <g id="1" its:allowedCharacters="[a-zA-Z0-9]">Aldus123</g></source>


Here are some reasons why the second solution would be better:

- It's shorter and simpler.

- Putting the annotation on the existing inline code does not add elements to the content (better for TM matches in many tools).

- It's a lot easier for tools to update the inline codes if the ITS markup is on them instead of on a separate element.

- The mrk element in XLIFF 1.2 has no way to work with overlapping codes, so if, for example, you segment in the middle of a mrk span, it's very difficult to represent the resulting annotation(s).

- Using the <g>/<bpt> element directly also avoids the accidental case where some text gets inserted between the <g>/<bpt> and the <mrk>.
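
To make the difference between the two representations concrete, here is a minimal Python sketch that converts the first form into the second. It is only an illustration under stated assumptions: the function name hoist_its_annotations is hypothetical, the fragment has no XLIFF namespace (as in the examples above), the its prefix is bound to the ITS namespace, and only a <g> whose sole content is a single <mrk mtype="x-its"> is rewritten.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
ET.register_namespace("its", ITS_NS)


def hoist_its_annotations(source):
    """Move ITS attributes from <mrk mtype="x-its"> onto the enclosing <g>.

    Hypothetical helper, not part of any XLIFF or ITS library.
    Only handles a <g> whose sole content is one such <mrk>.
    """
    # Snapshot the <g> elements first, since the tree is modified below.
    for g in list(source.iter("g")):
        children = list(g)
        if len(children) != 1:
            continue
        mrk = children[0]
        if mrk.tag != "mrk" or mrk.get("mtype") != "x-its":
            continue
        # Leave the markup alone if any text sits between <g> and <mrk>.
        if g.text or mrk.tail:
            continue
        # Copy only the attributes that are in the ITS namespace.
        for name, value in mrk.attrib.items():
            if name.startswith("{" + ITS_NS + "}"):
                g.set(name, value)
        # Replace the <mrk> by its own content.
        g.text = mrk.text
        g.remove(mrk)
        for child in list(mrk):
            g.append(child)
    return source


sample = (
    '<source xmlns:its="http://www.w3.org/2005/11/its">Example of user name: '
    '<g id="1"><mrk its:allowedCharacters="[a-zA-Z0-9]" mtype="x-its">'
    'Aldus123</mrk></g></source>'
)
source = hoist_its_annotations(ET.fromstring(sample))
g = source.find("g")
print(ET.tostring(source, encoding="unicode"))
```

Note that the check for text between the <g> and the <mrk> is exactly the accidental case mentioned in the last point: the sketch has to give up there, whereas with the annotation directly on the inline code the case cannot occur.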


Sure, there are cases, such as comments and terms, where using mrk may bring some advantages for compatibility. But they seem minimal, and I wonder if those cases could be treated as exceptions rather than trying to apply a more cumbersome rule to all cases.

Thoughts?
-yves

Received on Tuesday, 14 May 2013 16:26:30 UTC