
Re: [ISSUE-75] - Domain - 2.a. [ACTION-434]

From: Dr. David Filip <David.Filip@ul.ie>
Date: Wed, 27 Feb 2013 22:48:23 +0000
Message-ID: <CANw5LK=3G1qBBXZgDRkGX2d8DOtkYyHQFD4U8uRR=TYP1UntqA@mail.gmail.com>
To: Felix Sasaki <fsasaki@w3.org>, "Lieske, Christian" <christian.lieske@sap.com>
Cc: Yves Savourel <ysavourel@enlaso.com>, public-multilingualweb-lt@w3.org, Arle Lommel <arle.lommel@dfki.de>, Jirka Kosek <jirka@kosek.cz>, public-multilingualweb-lt-comments@w3.org

Thanks a lot, Christian. It is very good to have more victims for editorial
actions; there is a host of them in the queue now!
Cheers
dF

Dr. David Filip
=======================
LRC | CNGL | LT-Web | CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
mailto: david.filip@ul.ie


On Wed, Feb 27, 2013 at 2:30 PM, Felix Sasaki <fsasaki@w3.org> wrote:
> Hi David,
>
> just FYI, Christian (see CC) also volunteered to be a co-editor. So he would
> take over this editing item.
>
> Best,
>
> Felix
>
>
> On 27.02.13 15:16, Dr. David Filip wrote:
>
>> Hi co-editors,
>>
>> the note as formulated below by Christian has been OKed by all
>> stakeholders. We are now looking for a co-editor volunteer to
>> implement it in the spec so that we can close the issue.
>> I will create the editorial action for you to keep track if you volunteer
>> :-)
>>
>> Thanks
>> dF
>>
>> Dr. David Filip
>> =======================
>> LRC | CNGL | LT-Web | CSIS
>> University of Limerick, Ireland
>> telephone: +353-6120-2781
>> cellphone: +353-86-0222-158
>> facsimile: +353-6120-2734
>> mailto: david.filip@ul.ie
>>
>>
>> On Wed, Feb 27, 2013 at 11:30 AM, Yves Savourel <ysavourel@enlaso.com>
>> wrote:
>>>
>>> Hi David,
>>>
>>> The text looks fine to me.
>>>
>>> -yves
>>>
>>> -----Original Message-----
>>> From: Jörg Schütz [mailto:joerg@bioloom.de]
>>> Sent: Wednesday, February 27, 2013 4:18 AM
>>> To: public-multilingualweb-lt-comments@w3.org
>>> Subject: Re: [ISSUE-75] - Domain - 2.a. [ACTION-434]
>>>
>>> Hi David,
>>>
>>> I already gave my OK but here it is again.
>>>
>>> Cheers -- Jörg
>>>
>>> On Feb 27, 2013 at 12:10 (UTC+1), Dr. David Filip wrote:
>>>>
>>>> Hi Christian, all,
>>>>
>>>> we heard from Jan and Pablo that the text proposed by Christian to
>>>> resolve Issue-75 works for them.
>>>> @Yves, @Jörg, I guess we mainly need the two of you to OK this to be
>>>> able to close this one.
>>>>
>>>> Rgds
>>>> dF
>>>>
>>>> Dr. David Filip
>>>> =======================
>>>> LRC | CNGL | LT-Web | CSIS
>>>> University of Limerick, Ireland
>>>> telephone: +353-6120-2781
>>>> cellphone: +353-86-0222-158
>>>> facsimile: +353-6120-2734
>>>> mailto: david.filip@ul.ie
>>>>
>>>>
>>>> On Tue, Feb 5, 2013 at 1:35 PM, Lieske, Christian
>>>> <christian.lieske@sap.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I had an action item to re-write the note related to "domainMapping" in
>>>>> "multi-engine" scenarios. Here it comes ...
>>>>>
>>>>> Cheers,
>>>>> Christian
>>>>> ==
>>>>> Although the focus of ITS 2.0, and of some of the usage scenarios
>>>>> addressed in ITS 2.0 showcases (see
>>>>> http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary#ITS_2.0_Metadata:_Work-In-Context_Showcase),
>>>>> is on “single engine” environments, ITS 2.0 - for example in the context of
>>>>> the "domain" data category - can accommodate "workflow/multi engine"
>>>>> scenarios.
>>>>>
>>>>> Example:
>>>>>
>>>>> - A scenario involves Machine Translation (MT) engines A and B. The
>>>>> domain labels used by engine A follow the naming scheme A_123, those used
>>>>> by engine B follow the naming scheme B_456.
>>>>> - A "domainMapping" like the following is in place:
>>>>> domainMapping="'sports law' Legal, 'property law' Legal"
>>>>> - Engine A maps 'Legal' to A_4711, Engine B maps 'Legal' to B_42.
>>>>>
>>>>> Thus, ITS does not encode a process or workflow (like "Use MT engine A
>>>>> with domain A_4711, and use MT engine B with domain B_42"). Rather, it
>>>>> encodes information that can be used in workflows.
>>>>> -----Original Message-----
>>>>> From: Jörg Schütz [mailto:joerg@bioloom.de]
>>>>> Sent: Wednesday, January 30, 2013 09:37
>>>>> To: public-multilingualweb-lt-comments@w3.org
>>>>> Subject: Re: [ISSUE-75] - Domain - 2.a. incl. 2.b. and 1.
>>>>>
>>>>> Hi Felix and all,
>>>>>
>>>>> Here is my suggestion for a note (native speakers please correct):
>>>>>
>>>>> Bear in mind that ITS is first and foremost a powerful markup
>>>>> technology to add metadata to (Web) content. In this sense, it is not
>>>>> a (direct) means to support, or even drive process or workflow
>>>>> engines, although some of the data categories like provenance,
>>>>> domain, domain mapping, etc. may induce such a view. Since this ITS
>>>>> metadata enhances the content in a structured way and in multiple
>>>>> forms, ITS consuming agents can employ that data to effectively
>>>>> implement their usage or deployment scenarios within single engine or
>>>>> single process environments as well as within multi-engine
>>>>> environments such as "try MT engine A, then MT engine B, ..." (see
>>>>> also ITS 2.0 showcases at
>>>>> http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary#ITS_2.0_Metadata:_Work-In-Context_Showcase).
>>>>> It is, however, not possible to assign, say, a specific domain
>>>>> mapping incarnation to a certain (process or workflow) instance
>>>>> because such an assignment concerns the process side, and this is
>>>>> beyond the current ITS metadata scope.
>>>>>
>>>>> With this, we now have apparently reached consensus on 2.a., 2.b.
>>>>> (already reviewed by Christian), and 1. (shepherd's view...)
>>>>>
>>>>> [@Yves: 1. is independent of the domain mapping specs.]
>>>>>
>>>>> Cheers -- Jörg
>>>>>
>>>>> On Jan 29, 2013, at 18:15 (CET), Felix Sasaki wrote:
>>>>>>
>>>>>> Hi Jan, all,
>>>>>>
>>>>>> thanks a lot for the initial note, Christian, and for comments in
>>>>>> this thread. It seems that Yves made clear that
>>>>>>
>>>>>> “try MT engine A, then MT engine B”
>>>>>>
>>>>>> may indeed work with the ITS domain mechanism - but there are a lot
>>>>>> of white spaces, including
>>>>>>
>>>>>> “try MT engine A with domain ‘financials’, then try MT engine B with
>>>>>> domain ‘healthcare’”
>>>>>> and layering of many other processing types. So maybe a final note
>>>>>> could concentrate on these white spaces? Anybody volunteering to
>>>>>> re-write the note?
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Felix
>>>>>>
>>>>>> On 29.01.13 17:15, Jan Nelson wrote:
>>>>>>>
>>>>>>> I find it a reasonable practice to define what is not in scope as a
>>>>>>> part of any specification, though I agree that clear statements of
>>>>>>> in-scope features are crucial.
>>>>>>>
>>>>>>> I am curious about how a multi-engine selection/validation process
>>>>>>> works.  Christian, you mentioned both TM services as well as MT
>>>>>>> engines.  I can see value to be able to call from a set of services
>>>>>>> depending on domain with fallback based on result quality scores.
>>>>>>> And you state that ITS 2.0 might be a single service scoped spec.
>>>>>>>
>>>>>>> Yves, you believe that there is support for more than one MT engine
>>>>>>> as currently spec'd.  My interest in the white spaces between the
>>>>>>> two comments is in layering n-services of differing processing
>>>>>>> types, e.g., fuzzy matching TM services versus statistical MT
>>>>>>> engine results, and how that plays out.  It seems very ambitious to
>>>>>>> me to provide scope for this, and yet having a system that is
>>>>>>> capable of providing the kinds of metadata needed to enable it
>>>>>>> would be pretty powerful in terms of the potential to provide hi-fi
>>>>>>> results.
>>>>>>>
>>>>>>> Maybe my comments are far out of scope, but the thread here caught
>>>>>>> my attention.  If this is the case, I am happy to discuss it more
>>>>>>> offline, perhaps in Rome over a coffee.
>>>>>>>
>>>>>>> Jan
>>>>>>>
>>>>>>> ________________________________________
>>>>>>> From: Yves Savourel [ysavourel@enlaso.com]
>>>>>>> Sent: Tuesday, January 29, 2013 7:55 AM
>>>>>>> To: public-multilingualweb-lt-comments@w3.org
>>>>>>> Subject: RE: [ISSUE-75] - Domain - 2.a.
>>>>>>>
>>>>>>> Hi Christian, all,
>>>>>>>
>>>>>>> I’m always a bit uncomfortable with stating what a mechanism is NOT
>>>>>>> doing in a specification. It seems we should be able to define what
>>>>>>> it does do and that should be sufficient.
>>>>>>>
>>>>>>> I would also argue that the scenario “try MT engine A, then MT
>>>>>>> engine B” can work perfectly well with what we have today. The
>>>>>>> specification provides domainMapping for some basic mappings that
>>>>>>> make it possible, for example, to point multiple keywords to a
>>>>>>> single common 'domain' label.
>>>>>>>
>>>>>>> For example, you have a mapping like this:
>>>>>>> domainMapping="'sports law' Legal, 'property law' Legal"
>>>>>>> and two MT engines: they each have a user-defined table that
>>>>>>> provides additional re-direction (they are possibly even pair
>>>>>>> specific): one maps 'Legal' to 'LEGAL_EN_PT' and the other maps
>>>>>>> 'Legal' to '5242e0762354527_legal'.
>>>>>>>
>>>>>>> Using domainMapping for more than simple grouping is bound to run
>>>>>>> into limitations quickly:
>>>>>>>
>>>>>>> a) what if you add a third MT engine? You have to edit every single
>>>>>>> rules document to add the new mapping?
>>>>>>>
>>>>>>> b) how do you map to engines that are defined per pair?
>>>>>>>
>>>>>>> IMO the mapping to the values used to select the MT engine belongs
>>>>>>> to the process side, not the input.
>>>>>>>
>>>>>>> cheers,
>>>>>>> -yves
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From: Lieske, Christian [mailto:christian.lieske@sap.com]
>>>>>>> Sent: Tuesday, January 29, 2013 8:11 AM
>>>>>>> To: public-multilingualweb-lt-comments@w3.org
>>>>>>> Subject: [ISSUE-75] - Domain - 2.a.
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> One of my comments related to “domain” (see
>>>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt-comments/2013Jan/0022.html)
>>>>>>> was the following:
>>>>>>>
>>>>>>> 2.a. Domain "systems" may not be harmonized across a processing
>>>>>>> chain.
>>>>>>> A Translation Memory component may for example work with different
>>>>>>> domains than a Machine Translation system that is part of the same
>>>>>>> processing chain. Since ITS 2.0 "domain" currently does not allow
>>>>>>> to capture the information "This is for component X" these
>>>>>>> scenarios cannot be addressed.
>>>>>>>
>>>>>>> During the face-to-face in Prague, we achieved the following status
>>>>>>> (see http://www.w3.org/2013/01/23-mlw-lt-minutes.html#item09): a
>>>>>>> note should explain that “domain” (and possibly other data
>>>>>>> categories) does not accommodate what could be called multi-engine
>>>>>>> scenarios.
>>>>>>>
>>>>>>> Here is my suggestion for a note …
>>>>>>>
>>>>>>> The focus of ITS 2.0, and of some of the usage scenarios addressed in
>>>>>>> ITS 2.0 showcases (see
>>>>>>> http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary#ITS_2.0_Metadata:_Work-In-Context_Showcase),
>>>>>>> is on “single engine” environments. Example: the Machine Translation
>>>>>>> (MT) usage scenarios do not work along the lines of process chains
>>>>>>> such as “try MT engine A, then MT engine B”. Accordingly, ITS 2.0
>>>>>>> has few provisions to support this kind of “multi-engine”
>>>>>>> environment, which for example requires domain-related information
>>>>>>> such as “try MT engine A with domain ‘financials’, then try MT
>>>>>>> engine B with domain ‘healthcare’”.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Christian
>>>
>>>
>>>
>
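
As a rough illustration of the domainMapping example quoted in this thread, the two-step lookup (content keyword to generic label, then generic label to engine-specific label) could be sketched in Python as follows. This is only a sketch under assumptions: the function names, the ENGINE_TABLES structure and the simple comma-splitting parser are hypothetical and are not taken from the ITS 2.0 specification. Only the mapping string and the labels A_4711 and B_42 come from the thread.

```python
import shlex

def parse_domain_mapping(value):
    """Parse a domainMapping string into a {content keyword: generic label} dict.

    Simplification (assumption): pairs are split on commas, so quoted
    values that themselves contain a comma are not supported here.
    """
    mapping = {}
    for pair in value.split(","):
        parts = shlex.split(pair)  # honours the single-quoted multi-word values
        if len(parts) != 2:
            raise ValueError("expected two values per mapping, got: %r" % pair)
        source, target = parts
        mapping[source] = target
    return mapping

# Hypothetical per-engine tables. They live on the process side, not in
# the ITS markup; the labels are the ones used in the note's example.
ENGINE_TABLES = {
    "A": {"Legal": "A_4711"},
    "B": {"Legal": "B_42"},
}

def engine_domain(content_domain, mapping, engine):
    """Resolve a content keyword to the label a given engine understands."""
    # Step 1: ITS metadata maps the keyword to a generic label (or keeps it).
    generic = mapping.get(content_domain, content_domain)
    # Step 2: the engine's own table maps the generic label further.
    return ENGINE_TABLES[engine].get(generic)

mapping = parse_domain_mapping("'sports law' Legal, 'property law' Legal")
print(engine_domain("sports law", mapping, "A"))    # A_4711
print(engine_domain("property law", mapping, "B"))  # B_42
```

In this sketch the content carries only the generic label 'Legal'. Adding a third engine would mean adding one entry to ENGINE_TABLES, without touching any rules document.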
Received on Wednesday, 27 February 2013 22:49:30 GMT
