- From: Felix Sasaki <fsasaki@w3.org>
- Date: Thu, 17 Jan 2013 10:03:37 +0100
- To: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>
- Message-ID: <50F7BE69.4060209@w3.org>
Hi all,
minutes are at
http://www.w3.org/2013/01/16-mlw-lt-minutes.html
and below as text. I hope that I got the attendance right, please check.
At Christian: for the "disambiguation vs. term" discussion, see
http://www.w3.org/2013/01/16-mlw-lt-minutes.html#item06
For all people attending Prague, see
http://www.w3.org/2013/01/16-mlw-lt-minutes.html#item10
and esp. the objectives
http://www.w3.org/International/multilingualweb/lt/wiki/PragueJan2013f2f#Objectives
which require some preparations from you.
Best,
Felix
[1]W3C
[1] http://www.w3.org/
- DRAFT -
MLW-LT WG
16 Jan 2013
[2]Agenda
[2] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0090.html
See also: [3]IRC log
[3] http://www.w3.org/2013/01/16-mlw-lt-irc
Attendees
Present
felix, karl, Marcis, philr, leroy, Ankit, shaunm, joerg,
Clemens, Jirka, dave, Des, mdelolmo, renatb, Yves,
guiseppe, milan, tadej, pablo, dF, Naoto, olaf
Regrets
dom, christian
Chair
felix
Scribe
daveL, fsasaki
Contents
* [4]Topics
1. [5]roll call
2. [6]Meeting time
3. [7]state of XLIFF mapping
4. [8]New value for localization quality type
"conformance"
5. [9]Regular expression change
6. [10]Disambiguation and term
7. [11]annotatorsRef
8. [12]provenance record ordering
9. [13]Test suite
10. [14]prague f2f
11. [15]xliff mapping implementation update (with David on
the call)
12. [16]metadata harvesting
* [17]Summary of Action Items
__________________________________________________________
roll call
<fsasaki> checking attendance
<fsasaki> scribe: daveL
<fsasaki> [18]http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0090.html
[18] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0090.html
Meeting time
<fsasaki> [19]http://www.doodle.com/pn6xa86rfbypmd2k
[19] http://www.doodle.com/pn6xa86rfbypmd2k
felix: there is no apparent slot that works. felix will
distribute a weekly alternating proposal
state of XLIFF mapping
<fsasaki> scribe: fsasaki
dave: haven't updated the mapping page a lot
... there is more work to be done to formalize the mapping
... and come up with examples
... I think we want to focus on XLIFF 1.2 mapping first
... we were hoping that XLIFF 2 would be stable, but there is a
delay
... focus on XLIFF 1.2 also helps with putting a demonstrator
together
yves: dave summarized everything right
... in okapi we implemented ITS mapping on what we have
... it is partially implemented, ongoing
dave: we will come back shortly on that
... wrt interop between solas and CMS lion, also using okapi
... with the preparation for Rome
phil: it is now on our critical path for our implementation
... david said he would have a prototype a few weeks ago
... even if there is nothing final
... even if we would have a rough direction
... e.g. yves said that with xliff 1.2, he would use mrk markup
... even if we had directions what is easily acceptable
... otherwise it could hold up my implementation
yves: the xliff 1.2 mapping is what we used for implementations
... most of the time it made sense
... we have tackled some of the standoff stuff
... it is also in the git repository (for okapi, scribe
assumes)?
<Yves_> yes
phil: provenance, loc quality issue and loc quality rating are
relevant for us here
<Yves_> Location:
[20]http://code.google.com/p/okapi/source/list?name=html5
[20] http://code.google.com/p/okapi/source/list?name=html5
phil: Yves' page for 1.2. we can certainly use that as our
direction
dave: will talk to david tomorrow about that
phil: tx
New value for localization quality type "conformance"
<daveL> scribe: daveL
felix: asks if anyone has further thoughts on, or support for,
this new type
Regular expression change
felix: no responses yet
shaun: no update on this
<fsasaki> ACTION: shaun to work on regex for validating regex
subset proposal [recorded in
[21]http://www.w3.org/2013/01/16-mlw-lt-minutes.html#action02]
<trackbot> Created ACTION-385 - Work on regex for validating
regex subset proposal [on Shaun McCance - due 2013-01-23].
Disambiguation and term
felix: has been discussed in response to christian's comment
... any further comments?
marcis: what is the goal?
felix: christian suggested merging term and disambig data
categories
... but response was that both had distinct use cases, that
they could be merged but are valid individually
marcis: would not want to drop data category, term is easier to
implement and purpose is clear
... not so clear on disambiguation category, in terms of what
is possible to do with this
... for example there may be other types that might be useful
in the disambiguation use case
... and doing term management with disambig would make it very
heavy
... so there might need to be more attributes specifically for
named entity
... referencing input from W3C India received today
tadej: motivation for separate data category was because it
covered some use cases that fell out of the scope of
terminology
... by providing some additional context
... but do see that there is some commonality
... Also term must remain to keep compatibility with
terminology in ITS1
jörg: still in favour of having the two data categories
... since disambiguation can cover many other tasks in
content or NLP processing
... whereas term is more specific
pedro: the sort of text we mark up is different in both cases
so it makes sense to keep the distinction
tadej: agree granularities are quite limiting, or should we
have more identifiers to support this?
... but this might be more complicated
jörg: yes this would be more complicated, clearer as it is
<fsasaki> [22]http://tinyurl.com/its20-testsuite-dashboard
[22] http://tinyurl.com/its20-testsuite-dashboard
felix: christian will dial in to f2f to discuss this and
resolve the topic next week
... we also need to consider number of implementations, which
are not so many, when considering any possible merger
Des: agree with jörg, keep them separate as they are distinct
use cases
jörg: clarified, attributes as defined currently are clearer
than making them more fine grained
felix: reminds that the W3C process requires responding to
comments, which involves some work
<Yves_> could we talk about annotatorsRef
[23]https://www.w3.org/International/multilingualweb/lt/track/issues/71
a bit during this call?
[23] https://www.w3.org/International/multilingualweb/lt/track/issues/71
felix: replying to a question from Dave: the current number of
comments received is good
annotatorsRef
yves: two data categories, prov and locqualiss, can have
information from multiple annotators, but we have no way of
expressing this with annotatorsRef
... for current implementation, we assume the most recent
annotator is the correct one, but this is not ideal
... provenance especially has multiple items and requires
annotatorsRef
<fsasaki> daveL: will look into this thread
<scribe> scribe: daveL
provenance record ordering
phil: let's talk about the ordering of provenance
<Yves_> provenance data category
[24]https://www.w3.org/International/multilingualweb/lt/track/issues/72
[24] https://www.w3.org/International/multilingualweb/lt/track/issues/72
<fsasaki> [25]http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0090.html
[25] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0090.html
<Arle_> I am back on the call.
<fsasaki> [26]http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0061.html
[26] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0061.html
<fsasaki> [27]http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0066.html
[27] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0066.html
felix: this was a discussion of whether there was any
implication between ordering and time of record
<fsasaki> [28]http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0055.html
[28] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0055.html
<fsasaki> (mails related to the discussion)
phil: asks about the lack of a date stamp
<fsasaki> daveL: a date stamp was discussed
<fsasaki> .. there are two aspects:
<fsasaki> .. a lot of original requirements didn't have a
strong need for a time stamp
<fsasaki> .. the original requirement was about identification
rich enough that we can differentiate
<fsasaki> .. see e.g. "agent provenance" that used to include
that
<fsasaki> .. the 2nd aspect:
<fsasaki> .. we discussed whether the order in which the
provenance records are added is significant
<fsasaki> .. but from an implementation point of view it is
again complicated
<fsasaki> .. and there hadn't been much of a call for this during
requirements gathering
<fsasaki> .. "time" also has various aspects: start of a
translation, finish, duration, ...
<fsasaki> .. it is also a point that the provenance wg in w3c
had addressed
<fsasaki> .. so we just provide identifiers of who made the
translation and revision
<fsasaki> .. for knowing more there is the provenance model
<fsasaki> .. more = more about time
<fsasaki> .. so in summary, there was no big requirement to
have a time stamp
<fsasaki> .. and *if* you want to do that, you can use the w3c
prov model
<fsasaki> .. I'll reply to that mail thread
<fsasaki> pablo: I think provenance can stay as is
<fsasaki> .. adding a time stamp can be useful and interesting -
if every implementer is fine with that i'm fine too
<scribe> scribe: daveL
felix: adding a timestamp is a substantive change and would
require another call, plus tests etc
Test suite
<fsasaki> [29]http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0090.html
[29] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0090.html
felix: from this week on, people should stop using the google
docs and update the test suite master themselves
<fsasaki> [30]http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Dec/0087.html
[30] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Dec/0087.html
felix: we still need some input on tests related to
assertions (MUSTs), which need suggestions for tests
prague f2f
<fsasaki> [31]http://www.w3.org/International/multilingualweb/lt/wiki/PragueJan2013f2f
[31] http://www.w3.org/International/multilingualweb/lt/wiki/PragueJan2013f2f
<fsasaki> [32]http://www.w3.org/International/multilingualweb/lt/wiki/PragueJan2013f2f#Objectives
[32] http://www.w3.org/International/multilingualweb/lt/wiki/PragueJan2013f2f#Objectives
felix: thanks to jirka for organising this
<fsasaki> [33]http://www.w3.org/International/multilingualweb/lt/wiki/PragueJan2013f2f#Participants
[33] http://www.w3.org/International/multilingualweb/lt/wiki/PragueJan2013f2f#Participants
jirka: if you are not yet registered, please do so asap. Numbers
of people need to be known for wifi etc.
felix: also need to know in advance when people want to dial in
for organising the agenda
<fsasaki> [34]http://www.w3.org/International/multilingualweb/lt/wiki/PragueJan2013f2f#Objectives
[34] http://www.w3.org/International/multilingualweb/lt/wiki/PragueJan2013f2f#Objectives
felix: going through objectives
<fsasaki> [35]http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary
[35] http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary
felix: in particular the relationship between the different
posters and links to where people can access them and update
high level summary, adding any new use cases
<fsasaki> daveL: some time to discuss preparing EU project
review?
felix: also brainstorm on activities for rest of year and new
projects and synergy between them
... the Rome preparation should cover that.
<fsasaki> scribe: fsasaki
<omstefanov> I will not be able to take part in the f2f in
Prague, but definitely intend to come to Rome, so please make
sure preps for Rome are recorded in writing
xliff mapping implementation update (with David on the call)
david: phil asked on that, we got good comments from xyz
... status of xliff mapping - only written piece is xliff
mapping wiki
<dF> [36]http://www.w3.org/International/multilingualweb/lt/wiki/XLIFF_Mapping
[36] http://www.w3.org/International/multilingualweb/lt/wiki/XLIFF_Mapping
david: will work on this today, yesterday / today was EC
deadline
... we should publish this as a note / PC
... what is the editorial setup for such a note?
... we will need an additional namespace itsx
felix: update on implementation prototype?
david: solas is consuming ITS2 categories
... like OKAPI does
... that is being tested as part of the test suite
... that is consumed by various components of solas
architecture
... one is an MT broker
... works with different MT systems
... depends on the MT systems whether they can deal with ITS
metadata
... moravia is contributing to that
... m4loc can be used as middleware
... in our current prototype the mt service exposes the m4loc
service
... from the deliverable - open source xliff roundtrip
... the okapi filter interprets the ITS decoration
... then the mapping in the wiki is used
... it is consumed by middleware open source component
felix: would be good to see a demo
david: will do, in prague and in rome
metadata harvesting
ankit: we are waiting for some sort of data from cocomore
felix: what data?
ankit: we said that cocomore would provide us with annotated
data
ankit will provide module by prague f2f
pedro: will have annotated data from spanish client
... client is the spanish gov tax office
... they will annotate with ITS metadata for this show case
... spanish content in HTML5
... we will generate english content
... and annotate it in the output of the real time system
felix: so ankit could later use the data to test the module?
ankit: training data is as much as you can get
pedro: annotated data from cocomore is html content
... we will generate content in chinese and french
... so ankit can take chinese, french, german into account
in his system
... and spanish
... this will be german to english, german to french, german to
chinese, german to spanish
<Pedro> Showcase WP3 (Cocomore-Linguaserve) is German to
Chinese and German to French
<Clemens> right!
<Pedro> Showcase WP4 (Linguaserve-Lucy-DCU) is the full demo
Spanish to English, and partial demo Spanish to French and
Spanish to German
thanks to everybody for staying longer, meeting adjourned
Summary of Action Items
[NEW] ACTION: shaun to work on regex for validating regex
subset proposal [recorded in
[37]http://www.w3.org/2013/01/16-mlw-lt-minutes.html#action02]
[End of minutes]
__________________________________________________________
Minutes formatted by David Booth's [38]scribe.perl version
1.137 ([39]CVS log)
$Date: 2013-01-17 09:00:44 $
[38] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
[39] http://dev.w3.org/cvsweb/2002/scribe/
Received on Thursday, 17 January 2013 09:04:03 UTC