[Minutes] MLW-LT call 2012-11-05 ...

are at http://www.w3.org/2012/11/05-mlw-lt-minutes.html and below as text.
Apologies for the delay.

Felix

   [1]W3C

      [1] http://www.w3.org/

                               - DRAFT -

            MultilingualWeb-LT Working Group Teleconference

05 Nov 2012

   See also: [2]IRC log

      [2] http://www.w3.org/2012/11/05-mlw-lt-irc

Attendees

   Present
          Pedro, Marcis, daveL, Jirka, DomJones, leroy, Yves_,
          Ankit, tadej, omstefanov, kfritsche, Arle, fantasai,
          shaunm, naoto, milan

   Regrets
          davidF, Felix

   Chair
          dave

   Scribe
          daveL

Contents

     * [3]Topics
         1. [4]agenda
         2. [5]Standoff markup
         3. [6]its-tools
     * [7]Summary of Action Items
     __________________________________________________________

agenda

   [8]http://lists.w3.org/Archives/Public/public-multilingualweb-l
   t/2012Nov/0024.html

      [8] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Nov/0024.html

   [9]http://lists.w3.org/Archives/Public/public-multilingualweb-l
   t/2012Nov/0026.html

      [9] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Nov/0026.html

   topic; Doodle poll about virtual f2f

   <tadej> [10]http://doodle.com/heh7k59h7vkvnv88#table

     [10] http://doodle.com/heh7k59h7vkvnv88#table

   <tadej> daveL: poll shows 27th and 28th to be both good
   candidates

   <tadej> ... I would suggest taking the 27th and 28th, with a
   call of around 3 hours on each afternoon

   <tadej> ... however, we should deal with more specific issues
   beforehand

   <tadej> daveL: Tuesday, Nov 20th is also a good candidate

   <tadej> ACTION: daveL to confirm November 20, 27 and 28 as
   virtual session dates [recorded in
   [11]http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action01]

   <trackbot> Created ACTION-278 - Confirm November 20, 27 and 28
   as virtual session dates [on David Lewis - due 2012-11-12].

   topic; upcoming meetings

   [12]http://www.w3.org/International/multilingualweb/lt/wiki/Mai
   n_Page#Upcoming

     [12] http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page#Upcoming

   <tadej> daveL: checking if the schedule makes sense - so far
   Prague 23-24 Jan, Rome 12-13 March, Bled 7-8 May, and Madrid
   still unspecified

   <tadej> daveL: as for events, there's a GALA event, LocWorld,
   the WWW conference in Rio, and the LRC conference in Limerick

   <tadej> Yves_: the only thing we need to fix is the dates for
   the Madrid meeting, since July is a holiday month

   <Arle> We may be able to get on the GALA program. I will know
   more soon.

   <tadej> Pedro: For July, the sooner the better, ideally first
   week

   <tadej> ... or even last week of June

   <tadej> ACTION: daveL to open doodle poll for Madrid dates (end
   June - beginning July) [recorded in
   [13]http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action02]

   <trackbot> Created ACTION-279 - Open doodle poll for Madrid
   dates (end June - beginning July) [on David Lewis - due
   2012-11-12].

   <Arle> (Separate from what Pedro has already submitted, which
   is a great start.)

Standoff markup

   topic; standoff markup

   [14]http://lists.w3.org/Archives/Public/public-multilingualweb-
   lt/2012Nov/0019.html

     [14] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Nov/0019.html

   <tadej> Yves_: we should use a single root element, like
   its:standOffList (or similarly named). the inclusion mechanism
   would be via the script element, either inline or in a separate
   file

   <tadej> ...given the example, it would be better to split the
   standoff into two separate <script>-s, and have the script
   element id match the standoff list ids.

   <tadej> Pedro: the external files can be problematic in cases
   with real-time translation

   <tadej> daveL: do you think the its:rules elements could be the
   enclosing element?

   <tadej> Yves_: since we need to point to multiple
   its:standofflists, they can't be the root element, since they
   could exist in the same file; its:rules could be a root.

   <tadej> daveL: could you correct the schema so it takes this
   into account?

   <tadej> Yves_: mixing rules and standoff can get messy

   <tadej> daveL: its:rules is easy from the conformance point of
   view, easier to explain, although there may be confusion

   <tadej> Jirka: there's conceptual overload with this - we'd be
   declaring its:rules, and it wouldn't contain actual rules, but
   standoff info

   <tadej> daveL: let's summarize having a single element
   its:standoffList having an id attribute which matches the
   script element's id.

   <tadej> ... in external files, we could have multiple standoff
   lists

   <tadej> ACTION: Yves_ to edit the spec to unique standoff
   markup [recorded in
   [15]http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action03]

   <trackbot> Sorry, couldn't find Yves_. You can review and
   register nicknames at
   <[16]http://www.w3.org/International/multilingualweb/lt/track/u
   sers>.

     [16] http://www.w3.org/International/multilingualweb/lt/track/users

   <tadej> ACTION: Yves to edit the spec to unique standoff markup
   [recorded in
   [17]http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action04]

   <trackbot> Created ACTION-280 - Edit the spec to unique
   standoff markup [on Yves Savourel - due 2012-11-12].
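
   The summary above (one standoff list per script element, with
   matching ids) can be illustrated with a minimal sketch. It is
   not taken from the spec: the element name its:standoffList and
   its plain id attribute follow the wording used on the call, and
   the script type "application/its+xml" and the helper names
   StandoffScripts and mismatched_ids are assumptions made for
   illustration only.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # ITS namespace URI
STANDOFF_TYPE = "application/its+xml"     # assumed script type, not from the minutes


class StandoffScripts(HTMLParser):
    """Collect (script id, script text) pairs for standoff <script> elements."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._in_standoff = False
        self._current_id = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "script" and a.get("type") == STANDOFF_TYPE:
            self._in_standoff = True
            self._current_id = a.get("id")
            self._buf = []

    def handle_data(self, data):
        # Script content is delivered as raw text, possibly in several chunks.
        if self._in_standoff:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_standoff:
            self.scripts.append((self._current_id, "".join(self._buf)))
            self._in_standoff = False


def mismatched_ids(html_text):
    """Return (script id, list id) pairs where the two ids do not match.

    The list id is None when the script's root element is not the
    hypothetical its:standoffList element.
    """
    parser = StandoffScripts()
    parser.feed(html_text)
    parser.close()
    problems = []
    for script_id, xml_text in parser.scripts:
        root = ET.fromstring(xml_text.strip())
        if root.tag != f"{{{ITS_NS}}}standoffList":
            problems.append((script_id, None))
        elif root.get("id") != script_id:
            problems.append((script_id, root.get("id")))
    return problems


SAMPLE = """<html><head>
<script type="application/its+xml" id="so1">
<its:standoffList xmlns:its="http://www.w3.org/2005/11/its" id="so1"/>
</script>
<script type="application/its+xml" id="so2">
<its:standoffList xmlns:its="http://www.w3.org/2005/11/its" id="other"/>
</script>
</head><body><p>text</p></body></html>"""

print(mismatched_ids(SAMPLE))
```

   In this sample the first script passes the check and the second
   is reported, because its list id differs from the script id.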

its-tools

   [18]http://lists.w3.org/Archives/Public/public-multilingualweb-
   lt/2012Nov/0004.html

     [18] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Nov/0004.html

   <tadej> daveL: Marcis sent an update consolidating MT
   confidence and TA Annotation into simpler definitions

   <tadej> ... there's still an open issue on whether defining
   its:tools should be compulsory for these two data categories.
   any opinions?

   <tadej> Yves_: sounds reasonable

   <tadej> daveL: I'll modify the text and make it compulsory.

   <tadej> daveL: Marcis also pointed out that several tools could
   process a fragment of text, which makes things confusing. it's
   different than MT, since you're annotating an annotation.

   <tadej> ... should we then just apply the its:tool to those
   data categories rather than have it as a separate data category?

   <tadej> tadej: disambiguation could survive that, it's
   equivalent

   [19]http://lists.w3.org/Archives/Public/public-multilingualweb-
   lt/2012Nov/0006.html

     [19] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Nov/0006.html

   <scribe> scribe: daveL

   tadej: is currently updating its-tools, looking at use of
   non-its annotations

   <tadej> daveL: right now we have a mechanism to identify which
   data category it applies to, allowing for user-defined names

   <tadej> daveL: ... since you're borrowing the mechanism anyway,
   you're out of conformance anyway

   <tadej> daveL: we could remove it, since we don't have a formal
   extension mechanism

   <Marcis> I hear you, I just cannot say anything

   <tadej> tadej: if we define a per-datacategory confidence
   attribute, how to express multi-valued attributes?

   <Marcis> I mean, if the domains are automatically identified,
   then you will have a confidence (if the systems will return
   probabilistic results)

   <Marcis> As tadej said - the weighted mechanism says that there
   is a confidence

   <tadej> tadej: It boils down to whether that number is useful
   for the consumer

   <Marcis> The categories (not in exact names...) that I see
   requiring the confidence are: MT, Terminology, Domain
   segmentation tools (are there any currently used by the MT use
   cases?), Named Entity Recognition (currently in Disambiguation,
   right?), others (?)

   <tadej> ACTION: daveL to ask for use cases of data
   category-specific confidence scores [recorded in
   [20]http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action05]

   <trackbot> Created ACTION-281 - Ask for use cases of data
   category-specific confidence scores [on David Lewis - due
   2012-11-12].

   <Ankit> w.r.t. confidence scores in MT, they are mainly used
   in a post-editing environment, i.e. when a human translator
   uses these scores to determine which outputs of an MT system
   they want to correct.
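
   The post-editing use described by Ankit can be sketched roughly
   as follows. This is an illustration only: the Segment type, the
   0.0 to 1.0 score range, the threshold of 0.7 and the function
   name needs_post_editing are all assumptions, not something
   agreed on the call.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    source: str
    target: str
    mt_confidence: float  # assumed to lie between 0.0 and 1.0


def needs_post_editing(segments, threshold=0.7):
    """Return segments scoring below the threshold, least confident first."""
    low = [s for s in segments if s.mt_confidence < threshold]
    return sorted(low, key=lambda s: s.mt_confidence)


segments = [
    Segment("Hello", "Bonjour", 0.95),
    Segment("Press the button", "Appuyez le bouton", 0.55),
    Segment("See page 3", "Voir page 3", 0.30),
]

for seg in needs_post_editing(segments):
    print(f"{seg.mt_confidence:.2f}  {seg.source} -> {seg.target}")
```

   A translator's tool could present the returned segments first,
   leaving high-confidence output for a lighter review.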

   <tadej> tadej: disambiguation can produce scores, but they are
   not commonly used

   <tadej> daveL: its:tools has its own element, the
   its:standOffList - we should describe how it works within a
   script element, so it's as similar as possible to the XML
   markup.

Summary of Action Items

   [NEW] ACTION: daveL to ask for use cases of data
   category-specific confidence scores [recorded in
   [21]http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action05]
   [NEW] ACTION: daveL to confirm November 20, 27 and 28 as
   virtual session dates [recorded in
   [22]http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action01]
   [NEW] ACTION: daveL to open doodle poll for Madrid dates (end
   June - beginning July) [recorded in
   [23]http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action02]
   [NEW] ACTION: Yves to edit the spec to unique standoff markup
   [recorded in
   [24]http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action04]
   [NEW] ACTION: Yves_ to edit the spec to unique standoff markup
   [recorded in
   [25]http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action03]

   [End of minutes]
     __________________________________________________________


    Minutes formatted by David Booth's [26]scribe.perl version
    1.137 ([27]CVS log)
    $Date: 2012/11/10 05:36:17 $

     [26] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
     [27] http://dev.w3.org/cvsweb/2002/scribe/

Received on Saturday, 10 November 2012 05:37:56 UTC