- From: Yves Savourel <ysavourel@enlaso.com>
- Date: Sun, 12 Oct 2014 06:31:46 -0600
- To: "'Estreen, Fredrik'" <Fredrik.Estreen@lionbridge.com>, "'Felix Sasaki'" <felix@sasakiatcf.com>
- CC: "XLIFF Main List" <xliff@lists.oasis-open.org>, "'public-i18n-its-ig'" <public-i18n-its-ig@w3.org>
Hi Fredrik, all,

> This can be solved by lowering the <pc> into an <sc/>,<ec/> pair.

That is a good point for that example, and a solution that should work most of the time. But I believe we will have at least some cases of overlapping annotations.

As an example, below is the result of two text analysis Web services that detected two entities, one "Port Metro Vancouver" and the other "City of Vancouver", based on the content "Port Metro of Vancouver City". So we end up with "Vancouver" being shared by the two (otherwise distinct) annotation spans.

    <sm id="m1" type="dbp:entity" ref="http://www.wikidata.org/wiki/Q1187234"/>Port Metro of <sm id="m2" type="oc:entity/City" value="City of Vancouver" ref="http://en.wikipedia.org/wiki/Vancouver"/>Vancouver<em startRef="m1"/> City<em startRef="m2"/>

One of the annotations could be set to an <mrk>, but that would leave the other as <sm/>/<em/>.

And the point I was trying to make for Felix is that such an annotation, unlike one for the Translate data category for example, cannot be decomposed into several <mrk> because the ITS information (here it would be some Text Analysis data) applies only to the complete span, not to its parts.

In other words we cannot do:

    <mrk id="m1" type="dbp:entity" ref="http://www.wikidata.org/wiki/Q1187234">Port Metro of <mrk id="m2" type="oc:entity/City" value="City of Vancouver" ref="http://en.wikipedia.org/wiki/Vancouver">Vancouver</mrk></mrk><mrk id="m2bis" type="oc:entity/City" value="City of Vancouver" ref="http://en.wikipedia.org/wiki/Vancouver"> City</mrk>

because "City" should not be associated alone with the ITS data.

Sure, a tool could detect that two consecutive <mrk> with the same ITS information should be seen as a single one, but that is not an ITS processing expectation.

I'm not sure what transformation would resolve this problem.

Cheers,
-ys
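
To illustrate the overlap described above, here is a minimal Python sketch that reads the <sm/>/<em/> markers from a string and checks whether two spans cross, meaning they overlap without one containing the other. Only the <sm id="..."/> and <em startRef="..."/> forms come from the message. The function names, the regular expression, and the shortened sample content are assumptions made for illustration, and this is not an XLIFF or ITS processor.

```python
import re

# Assumption: markers are written as <sm ... id="X" .../> and <em startRef="X"/>.
# This regular expression is an illustration, not a real XLIFF parser.
MARKER = re.compile(
    r'<sm\b[^>]*\bid="([^"]+)"[^>]*/>|<em\b[^>]*\bstartRef="([^"]+)"[^>]*/>'
)

def marker_spans(content):
    """Return {id: (start, end)} as character offsets into the marker-free text."""
    spans, starts = {}, {}
    offset = 0  # position in the plain text (markers removed)
    pos = 0     # position in the marked-up string
    for m in MARKER.finditer(content):
        offset += m.start() - pos  # count the text between two markers
        pos = m.end()
        if m.group(1) is not None:   # <sm/> opens a span
            starts[m.group(1)] = offset
        else:                        # <em/> closes the span it refers to
            spans[m.group(2)] = (starts.pop(m.group(2)), offset)
    return spans

def crosses(a, b):
    """True when two spans overlap without one containing the other."""
    (a0, a1), (b0, b1) = sorted([a, b])
    return a0 < b0 < a1 < b1

# Shortened version of the example: attributes other than id are omitted.
content = ('<sm id="m1"/>Port Metro of <sm id="m2"/>Vancouver'
           '<em startRef="m1"/> City<em startRef="m2"/>')
spans = marker_spans(content)
print(spans, crosses(spans["m1"], spans["m2"]))
```

When crosses() is true, the two spans cannot both be written as nested <mrk> elements without splitting one of them, which is the situation the message describes.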
Received on Sunday, 12 October 2014 12:32:15 UTC