Re: ITS mapping in XLIFF - annotation in mrk vs original inline codes

Yves, I see one big issue here, and it is on the XLIFF side.
There was a big pushback in the TC against making the core inline elements
extensible.
We succeeded in making mrk extensible on the reasoning that it is a generic
annotation vehicle, unlike the other inlines.

Also, long term I would concentrate on the XLIFF 2.x mapping rather than
XLIFF 1.2, and XLIFF 2.0 does have a provision for splitting markers.
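
For illustration, an annotation that crosses a segment boundary could be
written with the 2.0 split markers (sm/em) roughly as below. This is only a
sketch: the ids, the sample text, the type value, and carrying
its:allowedCharacters directly on sm are assumptions, not an agreed mapping.

```xml
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0"
       xmlns:its="http://www.w3.org/2005/11/its"
       version="2.0" srcLang="en">
  <file id="f1">
    <unit id="u1">
      <segment id="s1">
        <!-- sm opens the annotation; the ITS attribute on it is an assumed mapping -->
        <source>Pick a code. <sm id="m1" type="generic"
            its:allowedCharacters="[a-zA-Z0-9]"/>ABC123 is valid.</source>
      </segment>
      <segment id="s2">
        <!-- em closes the annotation opened by sm id="m1" in the previous segment -->
        <source>XYZ789 is valid too.<em startRef="m1"/></source>
      </segment>
    </unit>
  </file>
</xliff>
```

Because sm and em are empty elements, the span can start in one segment and
end in another, which a paired mrk cannot do.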

This should be resolved very soon, so as not to impede Tilde.

Rgds
dF

Dr. David Filip
=======================
LRC | CNGL | LT-Web | CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
mailto: david.filip@ul.ie


On Tue, May 14, 2013 at 5:25 PM, Yves Savourel <ysavourel@enlaso.com> wrote:

> Hi Dave, David, all,
>
> (Posting on the IG list per
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013May/0105.html
> )
>
> The more I test XLIFF with ITS, the more I doubt that always using mrk to
> hold the ITS annotations that exist in the original document is the best
> solution.
>
> For example, for the following HTML code
>
> <p>Example of user name: <span
> its:allowedCharacters="[a-zA-Z0-9]">Aldus123</span></p>
>
> we recommend:
>
> <source>Example of user name: <g id="1"><mrk
> its:allowedCharacters="[a-zA-Z0-9]"
> mtype="x-its">Aldus123</mrk></g></source>
>
> and not:
>
> <source>Example of user name: <g id="1"
> its:allowedCharacters="[a-zA-Z0-9]">Aldus123</g></source>
>
>
> Here are some reasons why the second solution would be better:
>
> - It's shorter, simpler.
>
> - Carrying the annotation on the inline code does not add elements to the
> content (better TM matches for many tools).
>
> - It's a lot easier for tools to update the inline codes if the ITS markup
> is on them rather than on a separate element.
>
> - The mrk element in 1.2 has no way to work with overlapping codes, so if,
> for example, you segment in the middle of a mrk span, it's very difficult
> to represent the resulting annotation(s).
>
> - Using the <g>/<bpt> element directly also avoids the case where some text
> accidentally gets inserted between the <g>/<bpt> and the <mrk>.
>
>
> Sure, there are cases, such as comments and terms, where using mrk may bring
> some advantages for compatibility. But they seem minimal, and I wonder if
> those cases could be treated as exceptions rather than applying a more
> cumbersome rule to all cases.
>
> Thoughts?
> -yves
>
>
>
>

Received on Wednesday, 15 May 2013 09:00:55 UTC