Re: [xliff] ITS scope with sm/em

On 12.10.2014 at 19:50, Yves Savourel <ysavourel@enlaso.com> wrote:

>> it looks like even without trying to apply ITS information 
>> the above cannot be transformed to hierarchical markup, 
>> because there is an overlap
> 
> Yes, that's a problem that may occur relatively frequently in XLIFF because of the annotations and segmentation.
> 
> 
>> If the annotation tool creates an overlap like in your example,
>> you won't be able to generate hierarchical markup from this.
>> We pointed that out in the NIF2ITS section here http://www.w3.org/TR/its20/#nif-backconversion
>> see case 3. 
> 
> If it's a known limitation for NIF2ITS, I suppose it can be one for XLF2ITS as well.
> 
> It's interesting to note that the ITS specification has limitations that exist neither in NIF nor in other XML markup where
> overlap issues are solved. It's a weakness of ITS that we may want to address in 3.0 someday.


Not sure … the limitation of not allowing overlap in ITS is shared with XML and HTML in general. The reason is that if you constrain yourself to hierarchical structures, hierarchy-based queries become possible, as in CSS selectors or XPath, and so does simple processing such as styling based on nesting.
NIF allows you to describe all kinds of relations, but you cannot query hierarchies in NIF.
I don't know yet how the overlap issue is solved in XLIFF or other XML markup languages. I know of a typical set of solutions, see
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html
What does XLIFF do about this, and how does XLIFF then deal with the "query overlap + hierarchies at the same time" challenge?
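
To make the hierarchy point concrete, here is a small sketch in Python. It is only an illustration under simplifying assumptions: the XLIFF namespace is left out, the sample strings are made up, and nested_marker_ids is a hypothetical helper name. The same "marker inside a marker" query finds the inner annotation when it is written with <mrk>, but finds nothing when the same spans are written with <sm/>/<em/>, because the milestones are empty siblings and carry no nesting.

```python
import xml.etree.ElementTree as ET

# Same two annotations, once as nested <mrk> elements ...
nested = ET.fromstring(
    '<source>a <mrk id="m1">b <mrk id="m2">c</mrk> d</mrk> e</source>'
)
# ... and once as <sm/>/<em/> milestones (namespace omitted for brevity).
flat = ET.fromstring(
    '<source>a <sm id="m1"/>b <sm id="m2"/>c<em startRef="m2"/>'
    ' d<em startRef="m1"/> e</source>'
)


def nested_marker_ids(root):
    """Return the ids of every <mrk> that is a direct child of another <mrk>."""
    return [m.get("id") for m in root.findall(".//mrk/mrk")]


print(nested_marker_ids(nested))  # the inner marker is found via the hierarchy
print(nested_marker_ids(flat))    # milestones are siblings, so nothing matches
```

With milestones, the containment relation has to be recomputed from document order, which is what makes combining overlap and hierarchy queries hard.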

> 
> 
>> Would it be possible to accommodate this in the global rules file,
>> by having a rule that selects elements based on the same attribute 
>> values? Ideally one would repeat m2 in your example and then 
>> select all "mrk" with the same "id" value. Though you can't 
>> repeat the id value of course.
> 
> I suppose you could match on the type. But another condition is that the nodes must be sequential.
> 
> So the complete transformation would be:
> 
> - change the XLIFF ITS module namespace to either the ITS or the ITSXLF namespace
> - change all <pc>/</pc> to <sc/>/<ec/>
> - change as many <sm/>/<em/> as possible to <mrk>/</mrk>
> - create global rules for the remaining <sm/>/<em/>


That looks like it indeed. The next step would be test input files and test output, I guess?
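
As a possible starting point for such tests, the third step of the list above (turning <sm/>/<em/> pairs into <mrk>) could be sketched roughly as follows. This is an illustration, not an agreed algorithm: the function names are made up, XLIFF namespaces are left out, and a pair is converted only when <sm/> and its <em/> are siblings under the same parent. Any pair that crosses an element boundary is left alone and its id is reported, so that it can be handled by global rules.

```python
import xml.etree.ElementTree as ET


def find_convertible(root):
    """Return (parent, sm, em) for the first sibling <sm/>/<em/> pair, else None."""
    for parent in root.iter():
        kids = list(parent)
        for i, sm in enumerate(kids):
            if sm.tag != "sm":
                continue
            # Only look for the matching <em/> among the later siblings.
            for em in kids[i + 1:]:
                if em.tag == "em" and em.get("startRef") == sm.get("id"):
                    return parent, sm, em
    return None


def wrap_pair(parent, sm, em):
    """Replace sibling <sm/> ... <em/> with one <mrk> wrapping what lies between."""
    kids = list(parent)
    start, end = kids.index(sm), kids.index(em)
    mrk = ET.Element("mrk", dict(sm.attrib))  # carry over id, type, etc.
    mrk.text = sm.tail  # text right after <sm/> becomes the start of <mrk>
    mrk.tail = em.tail  # text right after <em/> now follows </mrk>
    for child in kids[start:end + 1]:
        parent.remove(child)
    mrk.extend(kids[start + 1:end])  # inline elements between the milestones
    parent.insert(start, mrk)


def convert_markers(root):
    """Convert as many pairs as possible; return the ids still left as <sm/>."""
    found = find_convertible(root)
    while found is not None:
        wrap_pair(*found)
        found = find_convertible(root)
    return [sm.get("id") for sm in root.iter("sm")]


# Made-up test input: one pair that nests cleanly, one that crosses a <pc>.
clean = ET.fromstring(
    '<source>a <sm id="m1"/>b <pc id="1">c</pc> d<em startRef="m1"/> e</source>'
)
crossing = ET.fromstring(
    '<source><sm id="m1"/>a <pc id="1">b <em startRef="m1"/>c</pc></source>'
)
print(convert_markers(clean), ET.tostring(clean, encoding="unicode"))
print(convert_markers(crossing))
```

In the second input the <em/> sits inside <pc> while the <sm/> is outside, so the pair stays as milestones. Replacing <pc> with <sc/>/<ec/> first, as in step two of the list, would make <sm/> and <em/> siblings and therefore convertible.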

Cheers,

Felix


> 
> Cheers,
> -yves
> 

Received on Monday, 13 October 2014 07:01:32 UTC