RE: [Agenda] ITS IG call 16 July noon utc from Serge Gladkoff on 2014-07-16 (public-i18n-its-ig@w3.org from July 2014)

From: Serge Gladkoff <serge.gladkoff@gmail.com>
Date: Wed, 16 Jul 2014 13:10:35 +0200
To: "'Felix Sasaki'" <felix.sasaki@dfki.de>, "Dave Lewis" <dave.lewis@cs.tcd.ie>
Cc: <public-i18n-its-ig@w3.org>
Message-ID: <05b101cfa0e6$923765a0$b6a630e0$@gmail.com>
Hello Felix, Dave and others,

 

I have made several proposed additions to the document, specifically:

 

A. Industry sector engagement issues.

 

1.       This document focuses solely on the vision of the "data annotation"
community, which is very narrow from the point of view of other future
stakeholders, such as the language technology and language service
communities, not to mention the public, which is the intended audience of
Public Automated Translation Services, by the way.

I would therefore widen the "Interoperability Goals" with: "To ensure
presence of interoperability data and service features enabling full and
meaningful engagement of language service and language technology
communities, as well as necessary prerequisites for the professional and
public feedback and participation."



2.       I added the keyword "European Language Cloud" to the
Interoperability goals. I think it is quite important to mention the European
Language Cloud here, especially as many proprietary clouds are now going to
emerge; we need to make sure they also use these requirements for
interoperability purposes.




B. Quality Management issues.

MT is completely meaningless without a reliable measure of whether it is
good or bad for the intended purpose and whether the output is actually
usable. But the aspect of LQA is underplayed in this document; I would seek
to improve this somewhat.

 

1.       I improved the definition of "standards" in this document :). I
think we should mention ITS2.0 as the standard in this document :), as well
as linked data standards developed by W3C. :)



2.       The Reference Model graphic refers to "Trans QA" - there is no
such term in the industry; the industry uses the term LQA ("Linguistic
Quality Assurance") as a "standard process box", so to speak, to verify that
the language quality is up to the requirements. LQA is used as an internal
process step to build various translation processes. I therefore added an
LQA definition to the Terminology section and am changing the name on the
graphic accordingly.



3.       Since quality assessment is so important for production processes,
I would insist on changing data management requirement 5 to M
(Mandatory): "It should be possible for third parties to submit error, QA or
corrective annotations to published data, provided it is presented in a
common format." Unless this requirement is mandatory, a Pandora's box will
be opened for producing unusable content without any correction or feedback
mechanisms.



4.       Also, it is important that any error feedback conforms to
common practices, so I would propose to amend the above requirement 5 as
follows: "It should be possible for third parties (general public,
individual experts and language service providers alike, as well as
automated language services) to submit error, QA or corrective annotations
to published data, provided it is presented in a common format, with
metadata conformant to one of the commonly accepted and documented universal
error typologies, and/or appropriate quality metrics." Otherwise we will
get all kinds of error annotations that are incompatible and therefore
meaningless. We also absolutely need to give all industry stakeholders a
channel to the feedback and annotation mechanisms. This is required to
plug in various essential services and tools.



5.       Currently there are no public language quality assessment
methodologies; however, there is a work item currently in development at ASTM
that is targeted specifically at "Development of a complete methodology,
including a simplified quality metric, for crowd-sourced expert language
quality assessment targeted at nonprofit web sites and other documents of
public interest". I suggest at least mentioning it here in the definition
section. I also invite the group to participate in the development discussion
of this public standard, which is intended precisely for public content
quality assessment.



C. Lack of stakeholders for this document to be published 

 

I think that we need to seek additional feedback and support from
communities such as LT-Innovate and perhaps GALA. The reason is that any
proposed framework for machine translation must be supported by wider
audiences, who should see how they could participate. I would propose to
launch an outreach effort seeking further input, so that we get more
qualified participation from the industry sectors. Such outreach would in
itself be a method of influence through which this group can put these
ideas forward.

 

I am looking forward to our conference call.

 

Regards,

Serge Gladkoff

President, Logrus International

GALA CRISP Lead

 

From: Felix Sasaki [mailto:felix.sasaki@dfki.de] 
Sent: Wednesday, July 16, 2014 11:25 AM
To: Dave Lewis
Cc: public-i18n-its-ig@w3.org
Subject: Re: [Agenda] ITS IG call 16 July noon utc

 

Hi Dave,

 

On 16.07.2014 at 11:02, Dave Lewis <dave.lewis@cs.tcd.ie> wrote:





Hi Felix,
I probably won't be able to make this today.

Two things to note however:
i) David has negotiated having a one day, single track FEISGILTT with
LocWorld on 29th October 

 

 

It is a bit of a pity that this overlaps with the TPAC technical plenary
day, so I probably won't be able to join you. I'll try to prepare input on
the ITS topics.





in Vancouver, so we are now preparing a Call for Papers. Topics we were
planning to cover are:

*	TBX-RDF models/migration and integration with open lexical-conceptual resources;
*	new XLIFF2.0 modules, in particular for ITS;
*	object model/APIs for XLIFF/ITS;
*	MQM-ITS-RDF integration;
*	publishing bitext.

I'll draft some CFP text for comment later today, but if people on the call
or the list have any further topics they would like to see addressed, please
let me know. I guess it's also a good opportunity to have some wider input
from the Microsoft guys as it will be an easy trip for them.

ii) Sadly, Leroy will be moving on from TCD on the 4th August to join IBM in
Dublin. I hope you will join me in saying a big thank you to him for all the
work he did at TCD in helping to develop ITS2.0 and in particular the
implementation of the Test Suite. 

 

 

Indeed! This sounds like a great opportunity for Leroy, and congrats on that,
and indeed a big thank you for the work on the test suite. Without you we
would not have managed to move this forward the way it was needed - thanks a
lot! And as usual: let's stay in touch, and keep us posted on what you do -
maybe we can squeeze in some ITS, linked data - or both :)

 

Best,

 

Felix





Regards,
Dave
 



On 15/07/2014 19:07, Felix Sasaki wrote:

Your time 
http://www.timeanddate.com/worldclock/fixedtime.html?iso=20140716T12
 
Dial-in info
 
https://www.w3.org/International/its/wiki/Dial_in_info_for_regular_call
 
Please join IRC via your client or via
http://irc.w3.org/
Channel: #i18nits
 
 
Topics: the same as two weeks ago. We will go through this quickly and
then see how to continue calls after the summer break.
 
0) action items
 
http://www.w3.org/International/its/track/actions/open
 
 
1)  Open Data Management position statement - see latest state at
 
https://www.w3.org/International/its/wiki/Open_Data_Management_for_Public_Automated_Translation_Services
 
 
2) ITS and XLIFF (placeholder)
 
 
3) MQM and ITS (placeholder) 
 
 
4) Wiki clean up and reasonable planning of topics for the next months
 
https://www.w3.org/International/its/wiki/
 
 
 
Anything else?
 
Best,
 
Felix
Attachments

application/vnd.openxmlformats-officedocument.wordprocessingml.document attachment: Open_Data_Management_for_Public_Automated_Translation_Services-revSerge.docx
Received on Wednesday, 16 July 2014 11:11:22 UTC