Re: [ISSUE-75] - Domain - 2.a. incl. 2.b. and 1. from Jörg Schütz on 2013-01-30 (public-multilingualweb-lt-comments@w3.org from January 2013)

From: Jörg Schütz <joerg@bioloom.de>
Date: Wed, 30 Jan 2013 09:37:23 +0100
To: public-multilingualweb-lt-comments@w3.org
Message-ID: <5108DBC3.7020802@bioloom.de>

Hi Felix and all,

Here is my suggestion for a note (native speakers please correct):

Bear in mind that ITS is first and foremost a powerful markup technology
for adding metadata to (Web) content. In this sense, it is not a (direct)
means to support, or even drive, process or workflow engines, although
some of the data categories, such as provenance, domain, and domain
mapping, may suggest such a view. Since this ITS metadata enhances the
content in a structured way and in multiple forms, ITS-consuming agents
can employ that data to implement their usage or deployment scenarios
within single-engine or single-process environments as well as within
multi-engine environments such as "try MT engine A, then MT engine B,
..." (see also the ITS 2.0 showcases at
http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary#ITS_2.0_Metadata:_Work-In-Context_Showcase).
It is, however, not possible to assign, say, a specific domain mapping
incarnation to a certain (process or workflow) instance, because such an
assignment concerns the process side, and this is beyond the current ITS
metadata scope.

With this, we have now apparently reached consensus on 2.a., 2.b.
(already reviewed by Christian), and 1. (shepherd's view...)

[@Yves: 1. is independent of the domain mapping specs.]

Cheers -- Jörg

On Jan 29, 2013, at 18:15 (CET), Felix Sasaki wrote:
> Hi Jan, all,
>
> thanks a lot for the initial note, Christian, and for comments in this
> thread. It seems that Yves made clear that
>
> “try MT engine A, then MT engine B”
>
> may indeed work with the ITS domain mechanism - but there are a lot of
> white spaces, including
>
> “try MT engine A with domain ‘financials’, then try MT engine B with
> domain ‘healthcare’”
> and layering of many other processing types. So maybe a final note could
> concentrate on these white spaces? Anybody volunteering to re-write the
> note?
>
> Best,
>
> Felix
>
> Am 29.01.13 17:15, schrieb Jan Nelson:
>> I find it a reasonable practice to define what is not in scope as a
>> part of any specification, though I agree that clear statements of
>> in-scope features are crucial.
>>
>> I am curious about how a multi-engine selection/validation process
>> works.  Christian, you mentioned both TM services as well as MT
>> engines.  I can see value to be able to call from a set of services
>> depending on domain with fallback based on result quality scores.  And
>> you state that ITS 2.0 might be a single service scoped spec.
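
The fallback idea raised above (call one service, check a quality score, move on to the next service if the score is too low) can be sketched as follows. This is only a minimal illustration under stated assumptions: the `Service` signature, the `translate_with_fallback` function, the score range of 0 to 1, and the 0.7 threshold are all hypothetical and are not defined by ITS 2.0. As the thread concludes, this kind of selection logic lives on the process side, not in the ITS metadata.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical service interface (an assumption, not part of ITS 2.0):
# a service takes (text, domain) and returns (translation, quality score in 0..1).
Service = Callable[[str, str], Tuple[str, float]]


def translate_with_fallback(
    text: str,
    domain: str,
    services: Sequence[Service],
    threshold: float = 0.7,  # assumed cut-off for an acceptable result
) -> Optional[str]:
    """Try each service in order; return the first result at or above the threshold.

    If no service reaches the threshold, return the best-scoring result seen,
    or None when no services were given.
    """
    best: Optional[Tuple[str, float]] = None
    for service in services:
        translation, score = service(text, domain)
        if score >= threshold:
            return translation
        if best is None or score > best[1]:
            best = (translation, score)
    return best[0] if best is not None else None
```

A TM lookup and an MT engine could both be wrapped as `Service` callables here; how their scores (fuzzy-match percentage versus MT confidence) would be made comparable is left open, which is exactly the white space discussed in this thread.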
>>
>> Yves, you believe that there is support for more than one MT engine as
>> currently spec'd.  My interest in the white spaces between the two
>> comments is in layering n services of differing processing types,
>> e.g., fuzzy matching TM services versus statistical MT engine results
>> and how that plays out.  It seems very ambitious to me to provide
>> scope for this, and yet having a system that is capable of providing
>> the kinds of metadata needed to enable it would be pretty powerful
>> in terms of the potential to provide hi-fi results.
>>
>> Maybe my comments are far out of scope, but the thread here caught my
>> attention.  If this is the case, I am happy to discuss it more offline,
>> perhaps in Rome over a coffee.
>>
>> Jan
>>
>> ________________________________________
>> From: Yves Savourel [ysavourel@enlaso.com]
>> Sent: Tuesday, January 29, 2013 7:55 AM
>> To: public-multilingualweb-lt-comments@w3.org
>> Subject: RE: [ISSUE-75] - Domain - 2.a.
>>
>> Hi Christian, all,
>>
>> I’m always a bit uncomfortable with stating what a mechanism is NOT
>> doing in a specification. It seems we should be able to define what it
>> does do and that should be sufficient.
>>
>> I would also argue that the scenario “try MT engine A, then MT engine
>> B” can work perfectly well with what we have today. The specification
>> provides domainMapping for some basic mappings that allow for example
>> to point multiple keywords to a more common unique 'domain' label.
>>
>> For example, you have a mapping like this: domainMapping="'sports law'
>> Legal, 'property law' Legal"
>> and two MT engines: they each have a user-defined table that provides
>> additional redirection (the tables are possibly even pair-specific):
>> one maps 'Legal' to 'LEGAL_EN_PT' and the other maps 'Legal' to
>> '5242e0762354527_legal'.
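
The two-step lookup described above (domainMapping groups content keywords under a common label, then each engine resolves that label through its own table) can be sketched as follows. This is a minimal illustration, not the normative ITS 2.0 processing: `parse_domain_mapping`, `engine_domain`, `ENGINE_TABLES`, and the engine names `engine_a` and `engine_b` are hypothetical, and lowercasing the left-hand keywords is an assumption of this sketch. Only the mapping string and the two engine-side identifiers come from the example in this message.

```python
import re

# One token: a single-quoted value, a double-quoted value, a bare value, or a comma.
_TOKEN = re.compile(r"""'([^']*)'|"([^"]*)"|([^\s,]+)|(,)""")


def parse_domain_mapping(mapping):
    """Parse a domainMapping string into {content keyword: common label}.

    Left-hand keywords are lowercased (an assumption of this sketch).
    """
    pairs = {}
    current = []
    # A trailing comma is appended so the last pair is flushed like the others.
    for match in _TOKEN.finditer(mapping + ","):
        if match.lastindex == 4:  # comma: end of one mapping pair
            if current:
                if len(current) != 2:
                    raise ValueError(f"expected a pair of values, got {current!r}")
                left, right = current
                pairs[left.lower()] = right
                current = []
        else:
            current.append(match.group(match.lastindex))
    return pairs


# Hypothetical process-side tables, one per MT engine. The target identifiers
# are the ones quoted in the message; the engine names are made up.
ENGINE_TABLES = {
    "engine_a": {"Legal": "LEGAL_EN_PT"},
    "engine_b": {"Legal": "5242e0762354527_legal"},
}


def engine_domain(engine, content_domain, mapping):
    """Resolve a content domain to an engine-specific identifier, or None."""
    common = parse_domain_mapping(mapping).get(content_domain.lower(), content_domain)
    return ENGINE_TABLES[engine].get(common)
```

With the mapping string from the message, `engine_domain("engine_a", "sports law", mapping)` resolves through 'Legal' to 'LEGAL_EN_PT'. Adding a third engine only means adding one entry to the process-side tables; no rules document has to change.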
>>
>> Using domainMapping for more than simple grouping is bound to run
>> into limitations quickly:
>>
>> a) What if you add a third MT engine? Do you have to edit every single
>> rules document to add the new mapping?
>>
>> b) How do you map to engines that are defined per language pair?
>>
>> IMO the mapping to the values used to select the MT engine belongs to
>> the process side, not the input.
>>
>> cheers,
>> -yves
>>
>>
>>
>> From: Lieske, Christian [mailto:christian.lieske@sap.com]
>> Sent: Tuesday, January 29, 2013 8:11 AM
>> To: public-multilingualweb-lt-comments@w3.org
>> Subject: [ISSUE-75] - Domain - 2.a.
>>
>> Hi,
>>
>> One of my comments related to “domain” (see
>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt-comments/2013Jan/0022.html)
>> was the following:
>>
>> 2.a. Domain "systems" may not be harmonized across a processing chain.
>> A Translation Memory component may for example work with different
>> domains than a Machine Translation system that is part of the same
>> processing chain. Since ITS 2.0 "domain" currently does not allow
>> capturing the information "This is for component X", these scenarios
>> cannot be addressed.
>>
>> During the face-to-face in Prague, we achieved the following status
>> (see http://www.w3.org/2013/01/23-mlw-lt-minutes.html#item09): a note
>> should explain that “domain” (and possibly other data categories) does
>> not accommodate what could be called a multi-engine scenario.
>>
>> Here is my suggestion for a note …
>>
>> The focus of ITS 2.0, and some of the usage scenarios addressed in ITS
>> 2.0 showcases (see
>> http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary#ITS_2.0_Metadata:_Work-In-Context_Showcase)
>> is on “single engine” environments. Example: the Machine Translation
>> (MT) usage scenarios do not work along the lines of process chains
>> such as “try MT engine A, then MT engine B”. Accordingly, ITS 2.0 has
>> few provisions to support this kind of “multi-engine” environment,
>> which for example requires domain-related information such as “try MT
>> engine A with domain ‘financials’, then try MT engine B with domain
>> ‘healthcare’”.
>>
>> Cheers,
>> Christian
Received on Wednesday, 30 January 2013 08:37:34 UTC