- From: Dave Lewis <dave.lewis@cs.tcd.ie>
- Date: Mon, 03 Sep 2012 13:25:45 +0100
- To: Multilingual Web LT Public List <public-multilingualweb-lt@w3.org>
- Message-ID: <5044A1C9.2060907@cs.tcd.ie>
Hi,

Leroy and I have been discussing some examples for CMS-MT integration scenarios with Declan and Ankit. One issue that has come up is how to deal with quotations in a segment passed to MT.

For example, take this segment (from Wikipedia):

"*To be or not to be*" is the opening phrase of a soliloquy <http://en.wikipedia.org/wiki/Soliloquy> in William Shakespeare <http://en.wikipedia.org/wiki/William_Shakespeare>'s play /Hamlet <http://en.wikipedia.org/wiki/Hamlet>/.

As (simplified) mark-up:

<b>"To be or not to be"</b> is the opening phrase of a soliloquy in William Shakespeare's play <i>Hamlet</i>.

With SMT, to retain the integrity of the quote, it may well be run through the MT engine separately from the rest of the segment (or perhaps even through a different engine trained specifically on Shakespeare bi-text in this example).

I'm not clear in this case how (or even if) 'element within text' would help, since <b>"To be or not to be"</b> is part of the flow, yet the element does affect how the text would be translated (in that it would be subsegmented for SMT-based translation). It seems to call for a nested withinText value, e.g.:

<b its:withinText="nested">"To be or not to be"</b> is the opening phrase of a soliloquy in William Shakespeare's play <i>Hamlet</i>.

But this doesn't match the example given for nested, where the sub-element is a footnote that can be completely removed from the parent element.

Any advice from the ITS 1.0 experts on this?

One other point, about the wording of the definition. It starts by saying:

"The Elements Within Text data category reveals *_if_* and how an element affects the way text content behaves from a linguistic viewpoint."

But if you take the "if" literally as a question, the sense of the value definitions seems inverted to me, i.e. 'yes' means the element _doesn't_ affect the way the text in the element is treated during translation.

Thanks in advance,
Dave
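P.S. To make the sub-segmentation idea concrete, here is a minimal Python sketch, offered only as an illustration under stated assumptions. It treats a local its:withinText="nested" attribute the way the mark-up above proposes, which is not settled ITS behaviour. The function name `subsegment`, the wrapping <p> element, and the numbered placeholder `{1}` are all hypothetical, and no real MT engine is called.

```python
import xml.etree.ElementTree as ET

# ITS namespace; ElementTree exposes namespaced attributes as "{uri}local".
ITS_NS = "http://www.w3.org/2005/11/its"
WITHIN_TEXT = f"{{{ITS_NS}}}withinText"


def subsegment(xml_segment):
    """Split a segment into (main_text, nested_parts).

    Child elements marked its:withinText="nested" are replaced in the main
    text by a numbered placeholder such as {1} and returned separately, so
    that each part could be sent to MT on its own. Other inline elements
    are simply flattened to their text (an assumption of this sketch).
    """
    root = ET.fromstring(xml_segment)
    nested = []
    parts = [root.text or ""]
    for child in root:
        if child.get(WITHIN_TEXT) == "nested":
            nested.append("".join(child.itertext()))
            parts.append(f"{{{len(nested)}}}")  # placeholder, e.g. {1}
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts), nested


# Hypothetical wrapper element <p> carrying the namespace declaration.
segment = (
    '<p xmlns:its="http://www.w3.org/2005/11/its">'
    '<b its:withinText="nested">"To be or not to be"</b> is the opening '
    "phrase of a soliloquy in William Shakespeare's play <i>Hamlet</i>.</p>"
)

main_text, quotes = subsegment(segment)
print(main_text)
print(quotes)
```

The quotation and the surrounding sentence come back as separate strings, so the quote could in principle go to a different engine and be re-inserted at the placeholder afterwards. Whether "nested" is the right value to trigger this is exactly the open question.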
Received on Monday, 3 September 2012 12:23:58 UTC