RE: MINUTES MLW-LT call 2012-07-26 from Pedro L. Díez Orzas on 2012-07-27 (public-multilingualweb-lt@w3.org from July 2012)

From: Pedro L. Díez Orzas <pedro.diez@linguaserve.com>
Date: Fri, 27 Jul 2012 13:45:18 +0200
To: "'Felix Sasaki'" <fsasaki@w3.org>, <public-multilingualweb-lt@w3.org>
Message-ID: <02f901cd6bed$4b974b40$e2c5e1c0$@linguaserve.com>
Thank you very much, Felix.

 

Sorry, but yesterday I had a last-minute meeting with a client and could not attend the telecon.

 

I read the minutes quickly and can only contribute the following:

 

In language technology, ontologies, knowledge (conceptual) bases, and semantic networks are completely different (but related) resources. Since the linguistic denotation of meanings is needed for disambiguation, semantic network resources (WordNet/EuroWordNet type) are usually preferred in language technology for lexical disambiguation, while ontologies (except those of linguistic semantic primitives) and knowledge bases are used rather for conceptual categorization, individuation, identification and data linking. Of course, the last two can also be used for disambiguation. Thus, having the three different types of information accessible for disambiguation tasks, via URIs to stable and appropriate online resources, can be very useful.

 

On another note, Felix, let me know if I can help you with the "Commentaries on ..." or "Best Practices".

 

Cheers,

Pedro

 

From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Thursday, 26 July 2012 18:46
To: public-multilingualweb-lt@w3.org
Subject: MINUTES MLW-LT call 2012-07-26

 

... are at http://www.w3.org/2012/07/26-mlw-lt-minutes.html and below as text.

 

Felix

 

   [1]W3C
 
      [1] http://www.w3.org/
 
                               - DRAFT -
 
                               MLW-LT WG
 
26 Jul 2012
 
   [2]Agenda
 
      [2] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0254.html
 
   See also: [3]IRC log
 
      [3] http://www.w3.org/2012/07/26-mlw-lt-irc
 
Attendees
 
   Present
          dave, felix, des, dom, olaf, sebastian, yves, raphael,
          giuseppe, leroy, pablo, jirka
 
   Regrets
   Chair
          felix
 
   Scribe
          daveL, fsasaki
 
Contents
 
     * [4]Topics
         1. [5]named entity syntax discussion
         2. [6]ITS 2.0 draft publication
         3. [7]implementation commitments
         4. [8]call for consensus
         5. [9]aob
     * [10]Summary of Action Items
     __________________________________________________________
 
named entity syntax discussion
 
   <fsasaki>
   [11]http://lists.w3.org/Archives/Public/public-multilingualweb-
   lt/2012Jul/0280.html
 
     [11] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0280.html
 
   <fsasaki> raphael and giuseppe introducing themselves and NERD
 
   raphael: describes the NERD platform developed with Giuseppe Rizzo
 
   <raphael> NERD: nerd.eurecom.fr
 
   <raphael> s/nerd.eurecom.fr/[12]http://nerd.eurecom.fr/
 
     [12] http://nerd.eurecom.fr/
 
   sebastian: introduces himself as a member of the LOD2 project and
   developer of NIF, striving to make this compatible with
   ITS2.0
 
   <fsasaki>
   [13]http://lists.w3.org/Archives/Public/public-multilingualweb-
   lt/2012Jul/0280.html
 
     [13] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0280.html
 
   tadej: named entity data category call for consensus distributed
 
   <raphael> NERD: a broker over numerous web APIs that perform
   Named Entity extraction, offers an ontology, an API and a Web
   UI for performing experiments
 
   tadej: related to terminology but not an extension due to
   its1.0 backward compatibility option
   ... two use cases: type of named entity and which named entity
   being mentioned
   ... disambiguation uses similar pattern but is a separate use
   case, pointing to a specific meaning in a semantic network
 
   <raphael> I would rather say that the disambiguation comes from
   a semantic network, or a knowledge base or a dataset (e.g.
   dbpedia) ... not an ontology since we are talking about
   instances
 
   tadej: examples included for XML, HTML, the latter with RDFa
   lite.
   ... microdata would be very similar
 
   pablomendes: asks if the term 'resource' is confusing because of
   different usages in the language resource and web resource
   communities
 
   <pablomendes> namespace or just "source"
 
   tadej: could use 'named graph', but perhaps a bit obscure.
   'name space' better but conflicts with xml namespace
   ... suggestion from floor 'source ref' may be better
 
   <Zakim> raphael, you wanted to ask entityTypeResourceRef should
   be a URI ... not a string, right?
 
   raphael: for disambiguation, use the term 'knowledge base' rather
   than 'ontology'
   ... resource ref is mistakenly a string rather than a URI
 
   tadej: yes, it's an error
 
   <pablomendes> global = stand off?
 
   tadej: also explains ITS pattern of local and global tag
   methods
 
   <pablomendes> local = inline?
 
   fsasaki: comparable to CSS and has equivalent of cascading rules
 
   <fsasaki> some background here about "global" and "local"
   [14]http://www.w3.org/TR/2012/WD-its20-20120626/#basic-concepts
   -selection
 
     [14] http://www.w3.org/TR/2012/WD-its20-20120626/#basic-concepts-selection
 
   tadej: also there is inheritance, e.g. to specify dbpedia as
   source for all references
 
   <pablomendes> wondering if "knowledge base", "thesaurus",
   "ontology", "semantic network" couldn't all be subsumed by
   "vocabulary"
 
   <pablomendes> since the type of knowledge representation is not
   important here. All those are essentially providers of URIs
   (vocabularies of globally unique identifers)
 
   tadej: also 3rd example mentions usage of rdfa lite, to be
   consistent with simple usage and standoff annotation
 
   <raphael> yes pablo, but for types disambiguation, we should
   talk about vocabulary (ontology, thesaurus)
 
   tadej: providing mapping between simple rdfa markup and ITS
   markup
 
   <raphael> ... while for entities disambiguation, we should talk
   about datasets
 
   <raphael> e.g. dbpedia has 2 parts, the OWL ontology
   (dbpedia-owl) and the dataset part (much larger)
 
   <pablomendes> another name clash, I guess. the Linked Data
   community already took "vocabulary" as meaning schema
 
   <raphael> pablo, we are talking about the same thing ... I use
   vocabulary as in the Linked Data community
 
   tadej: answers pablo's question: 'knowledge base' is preferable
   to 'vocabulary'
 
   pablomendes: 'knowledge base' is probably fine for this user
   community, or perhaps 'entity vocabulary'
 
   <Sebastian> identifiers ?
 
   <raphael> identifiers farm :-)
 
   tadej: identifiers could work, with example of instance,
   'ontologies' etc
 
   <fsasaki> link to terminology data category:
   [15]http://www.w3.org/TR/2012/WD-its20-20120626/#terminology
 
     [15] http://www.w3.org/TR/2012/WD-its20-20120626/#terminology
 
   tadej: responds in the affirmative to sebastian's question whether
   named entity and disambiguation are separate from terminology
 
   <Sebastian> [16]http://wiktionary.dbpedia.org/resource/dog
 
     [16] http://wiktionary.dbpedia.org/resource/dog
 
   <Sebastian>
   [17]http://wiktionary.dbpedia.org/page/dog-English-Verb-2fr
 
     [17] http://wiktionary.dbpedia.org/page/dog-English-Verb-2fr
 
   Sebastian: this is an issue since repositories such as wiktionary
   are like knowledge bases
 
   <raphael> Sebastian, there is a relationship between
   [18]http://dbpedia.org/resource/Dog and
   [19]http://wiktionary.dbpedia.org/resource/dog ?
 
     [18] http://dbpedia.org/resource/Dog
     [19] http://wiktionary.dbpedia.org/resource/dog
 
   tadej: disambiguation lets you specify the type - entity or
   word
   ... as there are differences between inserting a terminology link
   and an entity link
 
   <Zakim> raphael, you wanted to ask what is the added value of
   using the nerd type as value of the typeof attribute (in RDFa)
   over the native type provided by an extractor
 
   raphael: asks, regarding the rdfa example, which vocabularies
   are being used
   ... and which tool generated it
 
   tadej: handled by separate data category
 
   <fsasaki> see that data category, textanalysis annotation,
   listed here
   [20]http://www.w3.org/International/multilingualweb/lt/wiki/Imp
   lementation_Commitments#Data_categories_2
 
     [20] http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments#Data_categories_2
 
   <pablomendes> rephrase attempt: what are the relationships
   between LexicalEntry instances from Wiktionary and entities in
   DBpedia?
 
   <pablomendes> perhaps lexvo:lexicalization?
 
   <pablomendes>
   [21]http://www.lexvo.org/page/term/eng/lexicalization
 
     [21] http://www.lexvo.org/page/term/eng/lexicalization
 
   <Sebastian>
   [22]http://wiktionary.dbpedia.org/page/dog-English-Noun-1en
 
     [22] http://wiktionary.dbpedia.org/page/dog-English-Noun-1en
 
   Sebastian: responding to raphael, that in wiktionary the
   separation between lexical entry and concept is not always
   clearly defined
   ... might be a tool artefact that causes confusion
 
   <raphael> [23]http://dbpedia.org/page/dog does not resolve BUT
   [24]http://dbpedia.org/page/Dog does !
 
     [23] http://dbpedia.org/page/dog
     [24] http://dbpedia.org/page/Dog
 
   Yves: title is not consistent - entity or named entity
 
   tadej: an error, will fix this
 
   Yves: example should be an entity pointer
 
   <fsasaki> FYI, "pointer" attribute means: pointing to existing
   information in the document
 
   tadej: will fix this
 
   raphael: what is relation between its draft and NIF
 
   tadej: there is some overlap and will be addressed in future,
   perhaps as a separate part or document
 
   <fsasaki> [25]http://www.w3.org/TR/2012/WD-its20-20120626/
 
     [25] http://www.w3.org/TR/2012/WD-its20-20120626/
 
   Sebastian: we are considering documenting some roundtrip
   scenarios between ITS and NIF
 
   fsasaki: some initial work undertaken
 
   <pablomendes> wrt. its:disambigType = (word | entity) can't the
   distinction between word and entity be inferred from
   entityTypeRef? e.g. wiktionary:dog is a word, dbpedia:Dog is an
   entity
 
   pablo: is disambig type redundant with entity type ref?
 
   <Sebastian> no, it's meta-meta
 
   tadej: this is possible, but unclear how to maintain this
   mapping and how users can infer this
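 
   A minimal sketch of the inference pablomendes describes, assuming a
   hand-maintained table from URI host to disambiguation type. The
   table contents and the names HOST_TO_DISAMBIG_TYPE and
   infer_disambig_type are hypothetical and not part of any ITS 2.0
   draft. It also illustrates the maintenance concern tadej raises:
   any host missing from the table yields no answer.

```python
from urllib.parse import urlparse

# Hypothetical mapping, for illustration only: which hosts publish
# lexical entries ("word") and which publish instances ("entity").
HOST_TO_DISAMBIG_TYPE = {
    "wiktionary.dbpedia.org": "word",
    "dbpedia.org": "entity",
}


def infer_disambig_type(uri):
    """Return 'word' or 'entity' when the URI host is known, else None."""
    host = urlparse(uri).netloc.lower()
    return HOST_TO_DISAMBIG_TYPE.get(host)


if __name__ == "__main__":
    for uri in (
        "http://wiktionary.dbpedia.org/resource/dog",
        "http://dbpedia.org/resource/Dog",
        "http://example.org/unknown",
    ):
        print(uri, infer_disambig_type(uri))
```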
 
   <Sebastian> I really have to learn how this speaker queue thing
   works, where can I RTFM ?
 
   tadej: disambiguation is often needed in cases where
   text is short and lacks context
   ... and the computational linguistics community draws a clear
   distinction between lexical and conceptual meaning
 
   <Zakim> raphael, you wanted to ask what are the implementations
   of "Internationalization Tag Set (ITS) Version 1.0" and the
   main diff between 1.0 and 2.0 ?
 
   <fsasaki>
   [26]http://www.w3.org/International/multilingualweb/lt/drafts/i
   ts20/its20.html#relation-to-its10
 
     [26] http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#relation-to-its10
 
   fsasaki: there is a section describing the differences, mainly
   html5 coverage and new data categories
   ... its1.0 focussed more on classic i18n and l10n, its2.0 brings
   in more language technology integration
 
   raphael: which tools implement its1.0
 
   <fsasaki> okapi [27]http://okapi.sourceforge.net/
 
     [27] http://okapi.sourceforge.net/
 
   yves: there is rainbow in okapi framework as well as
   translation tools such as trados
 
   <fsasaki> trados translation tool
 
   <fsasaki> [28]http://okapi.opentag.com/
 
     [28] http://okapi.opentag.com/
 
   <Yves_>
   [29]http://www.opentag.com/okapi/wiki/index.php?title=ITS
 
     [29] http://www.opentag.com/okapi/wiki/index.php?title=ITS
 
   fsasaki: thanks everyone
 
   <pablomendes> thank you
 
   <scribe> ACTION: tadej to integrate this feedback [recorded in
   [30]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action01]
 
   <trackbot> Created ACTION-181 - Integrate this feedback [on
   Tadej Štajner - due 2012-08-02].
 
   <raphael>
   [31]http://www.w3.org/TR/2012/WD-its2req-20120524/#Automatic_en
   richment_of_the_source_content_with_named_entity_annotations
 
     [31] http://www.w3.org/TR/2012/WD-its2req-20120524/#Automatic_enrichment_of_the_source_content_with_named_entity_annotations
 
   fsasaki: aim to finalise entity related meta-data.
 
   <Yves_> another real-life implementation:
   [32]http://itstool.org/
 
     [32] http://itstool.org/
 
   fsasaki: this link has lots of other requirements, but we aim
   to keep things as simple as possible to hit the w3c timescale,
   including the november feature freeze
 
   <raphael> thanks all
 
ITS 2.0 draft publication
 
   <fsasaki>
   [33]http://www.w3.org/International/multilingualweb/lt/drafts/i
   ts20/its20.html
 
     [33] http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html
 
   fsasaki: any objections to publication - there are none
 
   <scribe> ACTION: fsasaki to publish update to working draft
   next week [recorded in
   [34]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action02]
 
   <trackbot> Created ACTION-182 - Publish update to working draft
   next week [on Felix Sasaki - due 2012-08-02].
 
   fsasaki: will plan another draft in latter half of august
 
implementation commitments
 
   <fsasaki>
   [35]http://www.w3.org/International/multilingualweb/lt/wiki/Imp
   lementation_Commitments
 
     [35] http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments
 
   fsasaki: please keep the implementation commitments table up to date
 
   yves: will try very hard to implement disambiguation and named
   entity data category
 
   <fsasaki>
   [36]http://www.w3.org/International/multilingualweb/lt/drafts/i
   ts20/its20.odd
 
     [36] http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.odd
 
   fsasaki: tells tadej to continue working on word version and
   then when finished editors will integrate into ITS draft doc
 
call for consensus
 
   fsasaki: any comments on parameters for rules and target pointer?
   Are there further comments?
   ... no disagreement
 
   <fsasaki> scribe: fsasaki
 
   yves: domain, inheritance discussion
   ... the discussion with declan
   ... what is the outcome?
 
   dave: it was about the fact that in practice with statistical
   MT
   ... some domains will be more important than others
   ... I was saying that the rules precedence is different than
   the domain precedence declan was talking about
   ... in statistical MT, you have sometimes domain precedence
 
   yves: my question was about the domain precedence attribute
 
   felix: yves is asking about the implication for implementing
   domain
 
   dave: do we want to put this as a new optional attribute, e.g.
   these are the ones that represent the primary domain
   ... it is another optional attribute, need to get declan's
   feedback what's best
   ... in practice it will not be a definite instruction, not on
   the side of MT providers
   ... it is a hint, not a mechanical choice
 
   yves: looking at the example of domain rules
   ... usage a) and b)
   ... you have a domain precedence attribute and criminal law and
   medical
   ... you have a domain pointer that says where to get the
   information
   ... the precedence is in the rule
   ... but how do you know which value to use
   ... it is not listed in the domain
   ... so what is the relationship
   ... do we need a domain precedence pointer
 
   dave: not sure we really need it
   ... need feedback from declan
   ... a separate MT provider may make other decisions
   ... a company like adobe might use it
   ... section doing the content, versus the one for MT training
   ... looked whether it is actually needed - it's a borderline
   use case
 
   yves: seems to be a border case
 
   dave: I agree
 
   <scribe> ACTION: dave6 to contact declan and thomas about
   domain new attribute proposal [recorded in
   [37]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action03]
 
   <trackbot> Sorry, couldn't find user - dave6
 
   <scribe> ACTION: felix to integrate parameters for rules and
   target pointer into the spec [recorded in
   [38]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action04]
 
   <trackbot> Created ACTION-183 - Integrate parameters for rules
   and target pointer into the spec [on Felix Sasaki - due
   2012-08-02].
 
aob
 
   dave: we are adding more content explaining things
   ... we end up putting in descriptions of standoff markup that
   we are pointing to
   ... are we happy to put that in the document?
 
   <daveL> scribe: daveL
 
   fsasaki: responding to query on non-normative standoff markup
   examples - we should definitely collect this and then decide
   how best to present this
 
   <fsasaki> [39]http://www.w3.org/TR/xml-i18n-bp/
 
     [39] http://www.w3.org/TR/xml-i18n-bp/
 
   fsasaki: next year we have more opportunity for a separate
   best practices document, as in ITS1.0
   ... and can take other things into account, such as use of meta
   data in more complex workflow scenarios
 
   <Zakim> omstefanov, you wanted to discuss materials above and
   in addition to ITS 2.0
 
   omstefanov: many other fields include commentaries on normative
   or legal content, including examples and implementation
   questions etc
 
   <fsasaki> ACTION: felix to prepare a place for BP material
   [recorded in
   [40]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action05]
 
   <trackbot> Created ACTION-184 - Prepare a place for BP material
   [on Felix Sasaki - due 2012-08-02].
 
   <Zakim> Des, you wanted to discuss agenda for prague f2f
 
   <Jirka> Logistics page is at
   [41]http://www.w3.org/International/multilingualweb/lt/wiki/Pra
   gueSep2012
 
     [41] http://www.w3.org/International/multilingualweb/lt/wiki/PragueSep2012
 
   <fsasaki> ACTION: felix to prepare agenda draft for prague
   [recorded in
   [42]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action06]
 
   <trackbot> Created ACTION-185 - Prepare agenda draft for prague
   [on Felix Sasaki - due 2012-08-02].
 
   <omstefanov> Nevertheless, I want to make one more pitch to use
   the more globally understood term, "Commentaries on ..." rather
   than "Best Practices" which, even if what you say applies,
   Felix, usually is understood in a more restricted sense.
 
Summary of Action Items
 
   [NEW] ACTION: dave6 to contact declan and thomas about domain
   new attribute proposal [recorded in
   [43]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action03]
   [NEW] ACTION: felix to integrate parameters for rules and
   target pointer into the spec [recorded in
   [44]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action04]
   [NEW] ACTION: felix to prepare a place for BP material
   [recorded in
   [45]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action05]
   [NEW] ACTION: felix to prepare agenda draft for prague
   [recorded in
   [46]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action06]
   [NEW] ACTION: fsasaki to publish update to working draft next
   week [recorded in
   [47]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action02]
   [NEW] ACTION: tadej to integrate this feedback [recorded in
   [48]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action01]
 
   [End of minutes]
     __________________________________________________________
 
 
    Minutes formatted by David Booth's [49]scribe.perl version
    1.136 ([50]CVS log)
    $Date: 2012/07/26 16:44:03 $
 
     [49] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
     [50] http://dev.w3.org/cvsweb/2002/scribe/
Received on Friday, 27 July 2012 11:46:48 UTC