
Re: MINUTES MLW-LT call 2012-07-26

From: Tadej Štajner <tadej.stajner@ijs.si>
Date: Fri, 27 Jul 2012 14:33:06 +0200
Message-ID: <50128A82.2010807@ijs.si>
To: public-multilingualweb-lt@w3.org
Hi, Pedro,

we also talked about this distinction of lexical meaning vs. conceptual 
meaning. In order to minimize the number of distinct attributes we'd 
introduce, both are expressed with disambigIdentRef, and the distinction 
is made with a disambigType attribute with two possible values:

its:disambigType="word" means word sense disambiguation, and 
its:disambigType="entity" means concept disambiguation. Ideally, 
introducing this shouldn't be necessary, since it could be derived from 
the KB/ontology/semantic network directly, but we couldn't come up with a 
good mechanism to implement that. Ideas on this part are welcome :)
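
As a rough illustration only (a sketch, not part of the draft): assuming 
the two attributes keep the names above and live in the ITS namespace, 
and using hypothetical `text` and `span` wrapper elements, a consumer 
could tell the two kinds of annotation apart like this. The two URIs are 
the "dog" examples raised on the call.

```python
import xml.etree.ElementTree as ET

# ITS namespace URI; the attribute names below follow the proposal in this
# thread and may still change in the draft.
ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical sample document: <text> and <span> are illustrative only.
SAMPLE = """<text xmlns:its="http://www.w3.org/2005/11/its">
  <span its:disambigType="entity"
        its:disambigIdentRef="http://dbpedia.org/resource/Dog">Dog</span>
  <span its:disambigType="word"
        its:disambigIdentRef="http://wiktionary.dbpedia.org/resource/dog">dog</span>
</text>"""


def read_disambiguation(xml_text):
    """Return (text, disambigType, disambigIdentRef) for each annotated element."""
    root = ET.fromstring(xml_text)
    results = []
    for elem in root.iter():
        ref = elem.get(f"{{{ITS_NS}}}disambigIdentRef")
        if ref is None:
            continue  # element carries no disambiguation annotation
        kind = elem.get(f"{{{ITS_NS}}}disambigType")
        if kind not in ("word", "entity"):
            raise ValueError(f"unexpected disambigType: {kind!r}")
        results.append((elem.text, kind, ref))
    return results


if __name__ == "__main__":
    for text, kind, ref in read_disambiguation(SAMPLE):
        print(f"{text!r}: {kind} -> {ref}")
```

The sketch rejects any value other than "word" or "entity"; whether a 
missing disambigType should instead fall back to a default is left open 
here.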

-- Tadej


On 27. 07. 2012 13:45, Pedro L. Díez Orzas wrote:
>
> Thank you very much, Felix.
>
> Sorry, but yesterday I had a last minute meeting with a client and 
> could not attend the telco.
>
> I saw the minutes quickly and I only can contribute with this:
>
> Ontologies, knowledge (conceptual) bases, and semantic networks are, in 
> Ling. Tech., completely different (but related) resources. Since 
> linguistic denotation of meanings is needed for disambiguation, 
> semantic network resources (Wordnet/EuroWordnet type) are usually 
> preferred in Ling. Tech. for lexical disambiguation, while 
> ontologies (except those of linguistic semantic primitives) and 
> knowledge bases are rather for conceptual categorization, 
> individuation, identification and data linking. Of course, the last 
> two can also be used, differently, for disambiguation. Thus, 
> having the three different types of information accessible for 
> disambiguation tasks can be very useful, via URIs to stable and 
> appropriate online resources.
>
> On the other hand, Felix, tell me if I can help you with the 
> "Commentaries on ..." or "Best Practices".
>
> Cheers,
>
> Pedro
>
> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
> *Sent:* Thursday, 26 July 2012 18:46
> *To:* public-multilingualweb-lt@w3.org
> *Subject:* MINUTES MLW-LT call 2012-07-26
>
> ... are at http://www.w3.org/2012/07/26-mlw-lt-minutes.html and below 
> as text.
>
> Felix
>
>     [1]W3C
>   
>        [1]http://www.w3.org/
>   
>                                 - DRAFT -
>   
>                                 MLW-LT WG
>   
> 26 Jul 2012
>   
>     [2]Agenda
>   
>        [2]http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0254.html
>   
>     See also: [3]IRC log
>   
>        [3]http://www.w3.org/2012/07/26-mlw-lt-irc
>   
> Attendees
>   
>     Present
>            dave, felix, des, dom, olaf, sebastian, yves, raphael,
>            guiseppe, leroy, pablo, jirka
>   
>     Regrets
>     Chair
>            felix
>   
>     Scribe
>            daveL, fsasaki
>   
> Contents
>   
>       * [4]Topics
>           1. [5]named entity syntax discussion
>           2. [6]its 20 draft publication
>           3. [7]implementation commitments
>           4. [8]call for consensus
>           5. [9]aob
>       * [10]Summary of Action Items
>       __________________________________________________________
>   
> named entity syntax discussion
>   
>     <fsasaki>
>     [11]http://lists.w3.org/Archives/Public/public-multilingualweb-
>     lt/2012Jul/0280.html
>   
>       [11]http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0280.html
>   
>     <fsasaki> raphael and guiseppe introducing themselves and NERD
>   
>     raphael: describes NERD platform developed with giuseppeerizzo
>   
>     <raphael> NERD: nerd.eurecom.fr
>   
>     <raphael> s/nerd.eurecom.fr/[12]http://nerd.eurecom.fr/
>   
>       [12]http://nerd.eurecom.fr/
>   
>     Sebastian: introduces himself as member of LOD2 project and
>     developer of NIF and striving to make this compatible with
>     ITS2.0
>   
>     <fsasaki>
>     [13]http://lists.w3.org/Archives/Public/public-multilingualweb-
>     lt/2012Jul/0280.html
>   
>       [13]http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0280.html
>   
>     tadej: named entity dc call for consensus distributed
>   
>     <raphael> NERD: a broker over numerous web APIs that perform
>     Named Entity extraction, offers an ontology, an API and a Web
>     UI for performing experiments
>   
>     tadej: related to terminology but not an extension due to
>     its1.0 backward compatibility option
>     ... two use cases: type of named entity and which named entity
>     being mentioned
>     ... disambiguation uses similar pattern but is a separate use
>     case, pointing to a specific meaning in a semantic network
>   
>     <raphael> I would rather say that the disambiguation comes from
>     a semantic network, or a knowledge base or a dataset (e.g.
>     dbpedia) ... not an ontology since we are talking about
>     instances
>   
>     tadej: examples included for XML, HTML, the latter with RDFa
>     lite.
>     ... microdata would be very similar
>   
>     pablomendes: asks if term resource is confusing because of
>     different usages in language resource and web resources
>     community
>   
>     <pablomendes> namespace or just "source"
>   
>     tadej: could use 'named graph', but perhaps a bit obscure.
>     'name space' better but conflicts with xml namespace
>     ... suggestion from floor 'source ref' may be better
>   
>     <Zakim> raphael, you wanted to ask entityTypeResourceRef should
>     be a URI ... not a string, right?
>   
>     raphael: for disambiguate use term 'knowledge base' rather than
>     'ontology'
>     ... resource ref is mistakenly a string rather than URI
>   
>     tadej: yes it's an error
>   
>     <pablomendes> global = stand off?
>   
>     tadej: also explains ITS pattern of local and global tag
>     methods
>   
>     <pablomendes> local = inline?
>   
>     fsasaki: comparable to CSS and has equivalent of cascading rules
>   
>     <fsasaki> some background here about "global" and "local"
>     [14]http://www.w3.org/TR/2012/WD-its20-20120626/#basic-concepts
>     -selection
>   
>       [14]http://www.w3.org/TR/2012/WD-its20-20120626/#basic-concepts-selection
>   
>     tadej: also there is inheritance, e.g. to specify dbpedia as
>     source for all references
>   
>     <pablomendes> wondering if "knowledge base", "thesaurus",
>     "ontology", "semantic network" couldn't all be subsumed by
>     "vocabulary"
>   
>     <pablomendes> since the type of knowledge representation is not
>     important here. All those are essentially providers of URIs
>     (vocabularies of globally unique identifiers)
>   
>     tadej: also 3rd example mentions usage of rdfa lite to be
>     consistent with simple usage and standoff annotation
>   
>     <raphael> yes pablo, but for types disambiguation, we should
>     talk about vocabulary (ontology, thesaurus)
>   
>     tadej: providing mapping between simple rdfa markup and ITS
>     markup
>   
>     <raphael> ... while for entities disambiguation, we should talk
>     about datasets
>   
>     <raphael> e.g. dbpedia has 2 parts, the OWL ontology
>     (dbpedia-owl) and the dataset part (much larger)
>   
>     <pablomendes> another name clash, I guess. the Linked Data
>     community already took "vocabulary" as meaning schema
>   
>     <raphael> pablo, we are talking about the same thing ... I use
>     vocabulary as in the Linked Data community
>   
>     tadej: answering pablo's question, knowledge base is preferable to
>     vocabulary
>   
>     pablomendes: knowledge is probably fine for this user
>     community, or perhaps entity vocabulary
>   
>     <Sebastian> identifiers ?
>   
>     <raphael> identifiers farm :-)
>   
>     tadej: identifiers could work, with example of instance,
>     'ontologies' etc
>   
>     <fsasaki> link to terminology data category:
>     [15]http://www.w3.org/TR/2012/WD-its20-20120626/#terminology
>   
>       [15]http://www.w3.org/TR/2012/WD-its20-20120626/#terminology
>   
>     tadej: responds to sebastian's question that named entity and
>     disambiguation are separate from terminology in the affirmative
>   
>     <Sebastian> [16]http://wiktionary.dbpedia.org/resource/dog
>   
>       [16]http://wiktionary.dbpedia.org/resource/dog
>   
>     <Sebastian>
>     [17]http://wiktionary.dbpedia.org/page/dog-English-Verb-2fr
>   
>       [17]http://wiktionary.dbpedia.org/page/dog-English-Verb-2fr
>   
>     Sebastian: is an issue since repositories such as wiktionary are
>     like knowledge bases
>   
>     <raphael> Sebastian, there is a relationship between
>     [18]http://dbpedia.org/resource/Dog  and
>     [19]http://wiktionary.dbpedia.org/resource/dog  ?
>   
>       [18]http://dbpedia.org/resource/Dog
>       [19]http://wiktionary.dbpedia.org/resource/dog
>   
>     tadej: disambiguation lets you specify that type - entity or
>     word
>     ... as there is a difference between inserting terminology link
>     and entity link
>   
>     <Zakim> raphael, you wanted to ask what is the added value of
>     using the nerd type as value of the typeof attribute (in RDFa)
>     over the native type provided by an extractor
>   
>     raphael: rdfa example query about different vocab are being
>     used
>     ... and which tool generated it
>   
>     tadej: handled by separate data category
>   
>     <fsasaki> see that data category, textanalysis annotation,
>     listed here
>     [20]http://www.w3.org/International/multilingualweb/lt/wiki/Imp
>     lementation_Commitments#Data_categories_2
>   
>       [20]http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments#Data_categories_2
>   
>     <pablomendes> rephrase attempt: what are the relationships
>     between LexicalEntry instances from Wiktionary and entities in
>     DBpedia?
>   
>     <pablomendes> perhaps lexvo:lexicalization?
>   
>     <pablomendes>
>     [21]http://www.lexvo.org/page/term/eng/lexicalization
>   
>       [21]http://www.lexvo.org/page/term/eng/lexicalization
>   
>     <Sebastian>
>     [22]http://wiktionary.dbpedia.org/page/dog-English-Noun-1en
>   
>       [22]http://wiktionary.dbpedia.org/page/dog-English-Noun-1en
>   
>     Sebastian: responding to raphael, that in wiktionary the
>     separation between lexical entry and concept is not always
>     clearly defined
>     ... might be a tool artefact that causes confusion
>   
>     <raphael> [23]http://dbpedia.org/page/dog  does not resolve BUT
>     [24]http://dbpedia.org/page/Dog  does !
>   
>       [23]http://dbpedia.org/page/dog
>       [24]http://dbpedia.org/page/Dog
>   
>     Yves: title is not consistent - entity or named entity
>   
>     tadej: an error, will fix this
>   
>     Yves: example should be an entity pointer
>   
>     <fsasaki> FYI, "pointer" attribute means: pointing to existing
>     information in the document
>   
>     tadej: will fix this
>   
>     raphael: what is relation between its draft and NIF
>   
>     tadej: there is some overlap and will be addressed in future,
>     perhaps as a separate part or document
>   
>     <fsasaki> [25]http://www.w3.org/TR/2012/WD-its20-20120626/
>   
>       [25]http://www.w3.org/TR/2012/WD-its20-20120626/
>   
>     Sebastian: we are considering documenting some roundtrip scenario
>     between ITS and NIF
>   
>     fsasaki: some initial work undertaken
>   
>     <pablomendes> wrt. its:disambigType = (word | entity) can't the
>     distinction between word and entity be inferred from
>     entityTypeRef? e.g. wiktionary:dog is a word, dbpedia:Dog is an
>     entity
>   
>     pablo: is disambig type redundant with entity type ref?
>   
>     <Sebastian> no, it's meta-meta
>   
>     tadej: this is possible, but unclear how to maintain this
>     mapping and how users can infer this
>   
>     <Sebastian> I really have to learn how this speaker queue thing
>     works, where can I RTFM ?
>   
>     tadej: disambiguation use cases are often used in cases where
>     text is short and lacks context
>     ... and computational linguistic community draws a clear
>     distinction between lexical and conceptual meaning
>   
>     <Zakim> raphael, you wanted to ask what are the implementations
>     of "Internationalization Tag Set (ITS) Version 1.0" and the
>     main diff between 1.0 and 2.0 ?
>   
>     <fsasaki>
>     [26]http://www.w3.org/International/multilingualweb/lt/drafts/i
>     ts20/its20.html#relation-to-its10
>   
>       [26]http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#relation-to-its10
>   
>     fsasaki: there is a section describing the differences, mainly html5
>     coverage and new data categories
>     ... its1.0 focussed more on classic i18n and l10n, its2.0 brings
>     in more language technology integration
>   
>     raphael: which tools implement its1.0
>   
>     <fsasaki> okapi [27]http://okapi.sourceforge.net/
>   
>       [27]http://okapi.sourceforge.net/
>   
>     yves: there is rainbow in okapi framework as well as
>     translation tools such as trados
>   
>     <fsasaki> trados translation tool
>   
>     <fsasaki> [28]http://okapi.opentag.com/
>   
>       [28]http://okapi.opentag.com/
>   
>     <Yves_>
>     [29]http://www.opentag.com/okapi/wiki/index.php?title=ITS
>   
>       [29]http://www.opentag.com/okapi/wiki/index.php?title=ITS
>   
>     fsasaki: thanks everyone
>   
>     <pablomendes> thank you
>   
>     <scribe> ACTION: tadej to integrate this feedback [recorded in
>     [30]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action01]
>   
>     <trackbot> Created ACTION-181 - Integrate this feedback [on
>     Tadej Štajner - due 2012-08-02].
>   
>     <raphael>
>     [31]http://www.w3.org/TR/2012/WD-its2req-20120524/#Automatic_en
>     richment_of_the_source_content_with_named_entity_annotations
>   
>       [31]http://www.w3.org/TR/2012/WD-its2req-20120524/#Automatic_enrichment_of_the_source_content_with_named_entity_annotations
>   
>     fsasaki: aim to finalise entity related meta-data.
>   
>     <Yves_> another real-life implementation:
>     [32]http://itstool.org/
>   
>       [32]http://itstool.org/
>   
>     fsasaki: this link has lots of other requirements, but we aim
>     to keep things simple as possible to hit w3c timescale
>     including november feature freeze
>   
>     <raphael> thanks all
>   
> its 20 draft publication
>   
>     <fsasaki>
>     [33]http://www.w3.org/International/multilingualweb/lt/drafts/i
>     ts20/its20.html
>   
>       [33]http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html
>   
>     fsasaki: any objections to publication - there are none
>   
>     <scribe> ACTION: fsasaki to publish update to working draft
>     next week [recorded in
>     [34]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action02]
>   
>     <trackbot> Created ACTION-182 - Publish update to working draft
>     next week [on Felix Sasaki - due 2012-08-02].
>   
>     fsasaki: will plan another draft in latter half of august
>   
> implementation commitments
>   
>     <fsasaki>
>     [35]http://www.w3.org/International/multilingualweb/lt/wiki/Imp
>     lementation_Commitments
>   
>       [35]http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments
>   
>     fsasaki: please keep implementation commitments table up to date
>   
>     yves: will try very hard to implement disambiguation and named
>     entity data category
>   
>     <fsasaki>
>     [36]http://www.w3.org/International/multilingualweb/lt/drafts/i
>     ts20/its20.odd
>   
>       [36]http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.odd
>   
>     fsasaki: tells tadej to continue working on word version and
>     then when finished editors will integrate into ITS draft doc
>   
> call for consensus
>   
>     fsasaki: any comment on parameter for rule and target points,
>     are there further comments
>     ... no disagreement
>   
>     <fsasaki> scribe: fsasaki
>   
>     yves: domain, inheritance discussion
>     ... the discussion with declan
>     ... what is the outcome?
>   
>     dave: it was about the fact that in practice with statistical
>     MT
>     ... some domains will be more important than others
>     ... I was saying that the rules precedence is different than
>     the domain precedence declan was talking about
>     ... in statistical MT, you have sometimes domain precedence
>   
>     yves: my question was about the domain precedence attribute
>   
>     felix: yves is asking about the implication for implementing
>     domain
>   
>     dave: do we want to put this as a new optional attribute, e.g.
>     these are the ones that represent the primary domain
>     ... it is another optional attribute, need to get declan's
>     feedback what's best
>     ... in practice it will not be a definite instruction, not on
>     the side of MT providers
>     ... it is a hint, not a mechanical choice
>   
>     yves: looking at the example of domain rules
>     ... usage a) and b)
>     ... you have a domain precedence attribute and criminal law and
>     medical
>     ... you have a domain pointer that says where to get the
>     information
>     ... the precedence is in the rule
>     ... but how do you know which value to use
>     ... it is not listed in the domain
>     ... so what is the relationship
>     ... do we need a domain precedence pointer
>   
>     dave: not sure we really need it
>     ... need feedback from declan
>     ... a separate MT provider may make other decisions
>     ... a company like adobe might use it
>     ... section doing the content, versus the one for MT training
>     ... looked whether it is actually needed - it's a borderline
>     use case
>   
>     yves: seems to be a border case
>   
>     dave: I agree
>   
>     <scribe> ACTION: dave6 to contact declan and thomas about
>     domain new attribute proposal [recorded in
>     [37]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action03]
>   
>     <trackbot> Sorry, couldn't find user - dave6
>   
>     <scribe> ACTION: felix to integrate parameters for rules and
>     target pointer into the spec [recorded in
>     [38]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action04]
>   
>     <trackbot> Created ACTION-183 - Integrate parameters for rules
>     and target pointer into the spec [on Felix Sasaki - due
>     2012-08-02].
>   
> aob
>   
>     dave: we are adding more content explaining things
>     ... we end up putting in descriptions of standoff markup that
>     we are pointing to
>     ... are we happy to put that in the document?
>   
>     <daveL> scribe: daveL
>   
>     fsasaki: responding to query on non normative standoff markup
>     examples - we should definitely collect this and then decide
>     how best to present this
>   
>     <fsasaki> [39]http://www.w3.org/TR/xml-i18n-bp/
>   
>       [39]http://www.w3.org/TR/xml-i18n-bp/
>   
>     fsasaki: next year we have more opportunity for separate best
>     practices as in ITS1.0
>     ... and can take other things into account, such as use of meta
>     data in more complex workflow scenarios
>   
>     <Zakim> omstefanov, you wanted to discuss materials above and
>     in addition to ITS 2.0
>   
>     omstefanov: many other fields include commentaries on normative
>     or legal content, including examples and implementation
>     questions etc
>   
>     <fsasaki> ACTION: felix to prepare a place for BP material
>     [recorded in
>     [40]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action05]
>   
>     <trackbot> Created ACTION-184 - Prepare a place for BP material
>     [on Felix Sasaki - due 2012-08-02].
>   
>     <Zakim> Des, you wanted to discuss agenda for prague f2f
>   
>     <Jirka> Logistics page is at
>     [41]http://www.w3.org/International/multilingualweb/lt/wiki/Pra
>     gueSep2012
>   
>       [41]http://www.w3.org/International/multilingualweb/lt/wiki/PragueSep2012
>   
>     <fsasaki> ACTION: felix to prepare agenda draft for prague
>     [recorded in
>     [42]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action06]
>   
>     <trackbot> Created ACTION-185 - Prepare agenda draft for prague
>     [on Felix Sasaki - due 2012-08-02].
>   
>     <omstefanov> Nevertheless, I want to make one more pitch to use
>     the more globally understood term, "Commentaries on ..." rather
>     than "Best Practices" which, even if what you say applies,
>     Felix, usually is understood in a more restricted sense.
>   
> Summary of Action Items
>   
>     [NEW] ACTION: dave6 to contact declan and thomas about domain
>     new attribute proposal [recorded in
>     [43]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action03]
>     [NEW] ACTION: felix to integrate parameters for rules and
>     target pointer into the spec [recorded in
>     [44]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action04]
>     [NEW] ACTION: felix to prepare a place for BP material
>     [recorded in
>     [45]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action05]
>     [NEW] ACTION: felix to prepare agenda draft for prague
>     [recorded in
>     [46]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action06]
>     [NEW] ACTION: fsasaki to publish update to working draft next
>     week [recorded in
>     [47]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action02]
>     [NEW] ACTION: tadej to integrate this feedback [recorded in
>     [48]http://www.w3.org/2012/07/26-mlw-lt-minutes.html#action01]
>   
>     [End of minutes]
>       __________________________________________________________
>   
>   
>      Minutes formatted by David Booth's [49]scribe.perl version
>      1.136 ([50]CVS log)
>      $Date: 2012/07/26 16:44:03 $
>   
>       [49]http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
>       [50]http://dev.w3.org/cvsweb/2002/scribe/
>
Received on Friday, 27 July 2012 12:33:55 UTC
