
Re: [ISSUE-75] - Domain - 2.a. [ACTION-434]

From: Dr. David Filip <David.Filip@ul.ie>
Date: Wed, 27 Feb 2013 14:16:43 +0000
Message-ID: <CANw5LKnGf6OR8fWw6XkN4PNaYVyEmfEN_1gJPuHuxGTnq6vxZg@mail.gmail.com>
To: Yves Savourel <ysavourel@enlaso.com>, public-multilingualweb-lt@w3.org, Arle Lommel <arle.lommel@dfki.de>, Felix Sasaki <fsasaki@w3.org>, Jirka Kosek <jirka@kosek.cz>
Cc: public-multilingualweb-lt-comments@w3.org
Hi co-editors,

the note as formulated below by Christian has been OKed by all
stakeholders, now we are looking for a co-editor volunteer to
implement this into the spec in order to be able to close the issue.
I will create the editorial action for you to keep track if you volunteer :-)


Dr. David Filip
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
mailto: david.filip@ul.ie

On Wed, Feb 27, 2013 at 11:30 AM, Yves Savourel <ysavourel@enlaso.com> wrote:
> Hi David,
> The text looks fine to me.
> -yves
> -----Original Message-----
> From: Jörg Schütz [mailto:joerg@bioloom.de]
> Sent: Wednesday, February 27, 2013 4:18 AM
> To: public-multilingualweb-lt-comments@w3.org
> Subject: Re: [ISSUE-75] - Domain - 2.a. [ACTION-434]
> Hi David,
> I already gave my OK but here it is again.
> Cheers -- Jörg
> On Feb 27, 2013 at 12:10 (UTC+1), Dr. David Filip wrote:
>> Hi Christian, all,
>> we heard from Jan and Pablo that the text proposed by Christian to
>> resolve the Issue-75 works for them.
>> @Yves, @Jörg, I guess we need mainly the two of you to OK this to be
>> able to close this one.
>> Rgds
>> dF
>> Dr. David Filip
>> =======================
>> LRC | CNGL | LT-Web | CSIS
>> University of Limerick, Ireland
>> telephone: +353-6120-2781
>> cellphone: +353-86-0222-158
>> facsimile: +353-6120-2734
>> mailto: david.filip@ul.ie
>> On Tue, Feb 5, 2013 at 1:35 PM, Lieske, Christian
>> <christian.lieske@sap.com> wrote:
>>> Hi,
>>> I had an action item to re-write the note related to "domainMapping" in "multi-engine" scenarios. Here it comes ...
>>> Cheers,
>>> Christian
>>> ==
>>> Although the focus of ITS 2.0, and of some of the usage scenarios addressed in the ITS 2.0 showcases (see http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary#ITS_2.0_Metadata:_Work-In-Context_Showcase), is on “single engine” environments, ITS 2.0 can accommodate "workflow/multi engine" scenarios, for example in the context of the "domain" data category.
>>> Example:
>>> - A scenario involves Machine Translation (MT) engines A and B. The domain labels used by engine A follow the naming scheme A_123; those used by engine B follow the naming scheme B_456.
>>> - A "domainMapping" like the following is in place: domainMapping="'sports law' Legal, 'property law' Legal"
>>> - Engine A maps 'Legal' to A_4711, Engine B maps 'Legal' to B_42.
>>> Thus, ITS does not encode a process or workflow (like "Use MT engine A with domain A_4711, and use MT engine B with domain B_42"). Rather, it encodes information that can be used in workflows.
>>> -----Original Message-----
>>> From: Jörg Schütz [mailto:joerg@bioloom.de]
>>> Sent: Wednesday, 30 January 2013 09:37
>>> To: public-multilingualweb-lt-comments@w3.org
>>> Subject: Re: [ISSUE-75] - Domain - 2.a. incl. 2.b. and 1.
>>> Hi Felix and all,
>>> Here is my suggestion for a note (native speakers please correct):
>>> Bear in mind that ITS is first and foremost a powerful markup
>>> technology to add metadata to (Web) content. In this sense, it is not
>>> a (direct) means to support, or even drive process or workflow
>>> engines, although some of the data categories like provenance,
>>> domain, domain mapping, etc. may induce such a view. Since this ITS
>>> metadata enhances the content in a structured way and in multiple
>>> forms, ITS consuming agents can employ that data to effectively
>>> implement their usage or deployment scenarios within single engine or
>>> single process environments as well as within multi-engine
>>> environments such as "try MT engine A, then MT engine B, ..." (see
>>> also ITS 2.0 showcases at http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary#ITS_2.0_Metadata:_Work-In-Context_Showcase).
>>> It is, however, not possible to assign, say, a specific domain
>>> mapping incarnation to a certain (process or workflow) instance
>>> because such an assignment concerns the process side, and this is
>>> beyond the current ITS metadata scope.
>>> With this, we now have apparently reached consensus on 2.a., 2.b.
>>> (already reviewed by Christian), and 1. (shepherd's view...)
>>> [@Yves: 1. is independent of the domain mapping specs.]
>>> Cheers -- Jörg
>>> On Jan 29, 2013, at 18:15 (CET), Felix Sasaki wrote:
>>>> Hi Jan, all,
>>>> thanks a lot for the initial note, Christian, and for comments in
>>>> this thread. It seems that Yves made clear that
>>>> “try MT engine A, then MT engine B”
>>>> may indeed work with the ITS domain mechanism - but there is a lot
>>>> of white spaces including
>>>> “try MT engine A with domain ‘financials’, then try MT engine B with
>>>> domain ‘healthcare’”
>>>> and layering of many other processing types. So maybe a final note
>>>> could concentrate on these white spaces? Anybody volunteering to
>>>> re-write the note?
>>>> Best,
>>>> Felix
>>>> On 29.01.13 17:15, Jan Nelson wrote:
>>>>> I find it a reasonable practice to define what is not in scope as a
>>>>> part of any specification, though agree that clear statements of in
>>>>> scope features are crucial.
>>>>> I am curious about how a multi-engine selection/validation process
>>>>> works.  Christian, you mentioned both TM services as well as MT
>>>>> engines.  I can see value to be able to call from a set of services
>>>>> depending on domain with fallback based on result quality scores.
>>>>> And you state that ITS 2.0 might be a single service scoped spec.
>>>>> Yves, you believe that there is support for more than one MT engine
>>>>> as currently spec'd.  My interest in the white spaces between the
>>>>> two comments is in layering n-services of differing processing
>>>>> types, e.g., fuzzy matching TM services versus statistical MT
>>>>> engine results and how that plays out.  It seems very ambitious to
>>>>> me to provide scope for this, and yet having a system that is
>>>>> capable of providing the kinds of metadata needed to enable it
>>>>> would be pretty powerful in terms of the potential to provide hi-fi results.
>>>>> Maybe my comments are far out of scope, but the thread here caught
>>>>> my attention.  If this is the case, I am happy to discuss it more
>>>>> offline, perhaps in Rome over a coffee.
>>>>> Jan
>>>>> ________________________________________
>>>>> From: Yves Savourel [ysavourel@enlaso.com]
>>>>> Sent: Tuesday, January 29, 2013 7:55 AM
>>>>> To: public-multilingualweb-lt-comments@w3.org
>>>>> Subject: RE: [ISSUE-75] - Domain - 2.a.
>>>>> Hi Christian, all,
>>>>> I’m always a bit uncomfortable with stating what a mechanism is NOT
>>>>> doing in a specification. It seems we should be able to define what
>>>>> it does do and that should be sufficient.
>>>>> I would also argue that the scenario “try MT engine A, then MT
>>>>> engine B” can work perfectly well with what we have today. The
>>>>> specification provides domainMapping for some basic mappings that
>>>>> allow for example to point multiple keywords to a more common unique 'domain' label.
>>>>> For example you have a mapping as this: domainMapping="'sports law'
>>>>> Legal, 'property law' Legal"
>>>>> and two MT engines: they each have a user-defined table that
>>>>> provides additional re-direction (these are possibly even pair
>>>>> specific): one maps 'Legal' to 'LEGAL_EN_PT' and the other maps
>>>>> 'Legal' to '5242e0762354527_legal'.
>>>>> Using domainMapping for more than simple grouping is bound to have
>>>>> quick limitations:
>>>>> a) what if you add a third MT engine? You have to edit every single
>>>>> rules document to add the new mapping?
>>>>> b) how do you map to engines that are defined per pair?
>>>>> IMO the mapping to the values used to select the MT engine belongs
>>>>> to the process side, not the input.
>>>>> cheers,
>>>>> -yves
>>>>> From: Lieske, Christian [mailto:christian.lieske@sap.com]
>>>>> Sent: Tuesday, January 29, 2013 8:11 AM
>>>>> To: public-multilingualweb-lt-comments@w3.org
>>>>> Subject: [ISSUE-75] - Domain - 2.a.
>>>>> Hi,
>>>>> One of my comments related to “domain” (see
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt-comments/2013Jan/0022.html)
>>>>> was the following:
>>>>> 2.a. Domain "systems" may not be harmonized across a processing chain.
>>>>> A Translation Memory component may for example work with different
>>>>> domains than a Machine Translation system that is part of the same
>>>>> processing chain. Since ITS 2.0 "domain" currently does not allow
>>>>> to capture the information "This is for component X" these
>>>>> scenarios cannot be addressed.
>>>>> During the face-to-face in Prague, we achieved the following status
>>>>> (see http://www.w3.org/2013/01/23-mlw-lt-minutes.html#item09): a
>>>>> note should explain that “domain” (and possibly other data
>>>>> categories) do not accommodate what could be called a multi-engine scenario.
>>>>> Here is my suggestion for a note …
>>>>> The focus of ITS 2.0, and some of the usage scenarios addressed in
>>>>> ITS 2.0 showcases (see
>>>>> http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary#ITS_2.0_Metadata:_Work-In-Context_Showcase)
>>>>> is on “single engine” environments. Example: the Machine Translation
>>>>> (MT) usage scenarios do not work along the lines of process chains
>>>>> such as “try MT engine A, then MT engine B”. Accordingly, ITS 2.0
>>>>> has few provisions to support “multi-engine” environments of this
>>>>> kind, which for example require domain-related information
>>>>> such as “try MT engine A with domain ‘financials’, then try MT
>>>>> engine B with domain ‘healthcare’”.
>>>>> Cheers,
>>>>> Christian
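
The domainMapping example discussed in this thread can be illustrated with a small Python sketch. It is only a hedged illustration and not taken from the ITS 2.0 specification: the function names, the naive comma splitting (which assumes labels contain no commas), and the two per-engine tables are assumptions. The idea shown is the one argued for above: the document carries only the shared label ('Legal'), and each engine resolves that label on the process side.

```python
import shlex


def parse_domain_mapping(value: str) -> dict[str, str]:
    """Parse a domainMapping string into {content label: shared label}.

    Assumption: pairs are separated by commas and no label contains a comma.
    """
    mapping = {}
    for pair in value.split(","):
        # shlex keeps a quoted value such as 'sports law' as a single token.
        tokens = shlex.split(pair)
        if len(tokens) != 2:
            raise ValueError(f"expected two values per mapping, got: {pair!r}")
        content_label, shared_label = tokens
        mapping[content_label] = shared_label
    return mapping


def engine_domain(content_label, mapping, engine_table):
    """Resolve a content label to an engine-specific domain identifier."""
    # Step 1: document-side mapping (what ITS encodes).
    shared_label = mapping.get(content_label, content_label)
    # Step 2: engine-side lookup (process side, outside ITS).
    return engine_table.get(shared_label)


# Hypothetical per-engine tables, using the identifiers from the note.
ENGINE_A = {"Legal": "A_4711"}
ENGINE_B = {"Legal": "B_42"}

mapping = parse_domain_mapping("'sports law' Legal, 'property law' Legal")
print(engine_domain("sports law", mapping, ENGINE_A))    # A_4711
print(engine_domain("property law", mapping, ENGINE_B))  # B_42
```

Adding a third engine in this sketch means adding one more engine-side table; the domainMapping value in the rules document stays unchanged.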
Received on Wednesday, 27 February 2013 14:17:55 UTC
