Re: [all] Call for consensus on disambiguation - feedback integrated [ACTION-181]

From: Tadej Štajner <tadej.stajner@ijs.si>
Date: Mon, 20 Aug 2012 12:11:38 +0200
Message-ID: <50320D5A.3080700@ijs.si>
To: "Pablo N. Mendes" <pablomendes@gmail.com>
CC: Felix Sasaki <fsasaki@w3.org>, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>, "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, "raphael.troncy@eurecom.fr" <raphael.troncy@eurecom.fr>, "Giuseppe.Rizzo@eurecom.fr" <Giuseppe.Rizzo@eurecom.fr>
Hi, Pablo,
Correct. The feedback I got was that this distinction is very important, 
but I can't think of an example with the scenario you mention. Perhaps 
for spans where one is contained within the other, such as assigning a 
lexical meaning to a word, while the whole phrase is an entity, for 
example 'agriculture' in 'Ministry of agriculture'.
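
To make the nested case concrete, here is a minimal sketch that reads both annotations from one phrase. It assumes the draft attribute name its-disambig-ident-ref used in this thread, markup with matched open and close tags, and two placeholder example.org URIs that are invented for illustration and do not point to real resources.

```python
from html.parser import HTMLParser

ATTR = "its-disambig-ident-ref"  # draft attribute name taken from this thread


class DisambigSpans(HTMLParser):
    """Collect (text, target URI) for every element carrying ATTR, nested or not."""

    def __init__(self):
        super().__init__()
        self._open = []   # one [uri_or_None, text_parts] entry per open element
        self.found = []   # completed (text, uri) pairs, innermost element first

    def handle_starttag(self, tag, attrs):
        self._open.append([dict(attrs).get(ATTR), []])

    def handle_data(self, data):
        # Text counts towards every element that is currently open.
        for entry in self._open:
            entry[1].append(data)

    def handle_endtag(self, tag):
        uri, parts = self._open.pop()
        if uri is not None:
            self.found.append(("".join(parts), uri))


def annotated_spans(html):
    parser = DisambigSpans()
    parser.feed(html)
    parser.close()
    return parser.found


# Placeholder URIs (hypothetical): an entity for the phrase, a word sense for the inner word.
SAMPLE = (
    '<p><span its-disambig-ident-ref="http://example.org/entity/MinistryOfAgriculture">'
    'Ministry of <span its-disambig-ident-ref="http://example.org/sense/agriculture-noun-1">'
    'agriculture</span></span></p>'
)

for text, uri in annotated_spans(SAMPLE):
    print(text, "->", uri)
```

The inner span carries the lexical meaning and the outer span the entity, so each span still has exactly one target; nothing here needs both kinds of meaning on the same span.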

I think it boils down to this: could this property be reliably inferred 
from the target itself? For instance, if someone points to 
http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3 - 
can we expect that is definitely a case of lexical disambiguation?
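
As a rough illustration of what inferring the type from the target could look like, here is a sketch that guesses the type from the URI prefix. The prefix table is my own assumption, not part of any proposal; the three type names are the ones from the its:disambigType values quoted further down in this thread.

```python
from typing import Optional

# Hypothetical prefix table: each entry maps a URI prefix to one of the three
# type names discussed in this thread. A real consumer would need its own list.
PREFIX_TO_TYPE = [
    ("http://www.w3.org/2006/03/wn/wn20/instances/", "lexicalConcept"),
    ("http://dbpedia.org/ontology/", "ontologyConcept"),
    ("http://dbpedia.org/resource/", "entity"),
]


def guess_disambig_type(uri: str) -> Optional[str]:
    """Return a guessed type for a target URI, or None if the prefix is unknown."""
    for prefix, kind in PREFIX_TO_TYPE:
        if uri.startswith(prefix):
            return kind
    return None


print(guess_disambig_type("http://dbpedia.org/resource/Dublin"))
print(guess_disambig_type("http://example.org/unknown/thing"))
```

The None case is the point of the question: for any source not in the table, the consumer cannot tell which kind of disambiguation was meant without an explicit attribute.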

-- Tadej


On 20. 08. 2012 11:42, Pablo N. Mendes wrote:
> Hi all,
>
>     I would suggest to merge "its-entity-type-ident-ref" into
>     "its-disambig-type-ref".
>
>
> If I understand correctly this is the same proposal I made at the call?
>
> "<pablomendes> wrt. its:disambigType = (word | entity) can't the 
> distinction between word and entity be inferred from entityTypeRef? 
> e.g. wiktionary:doc is a word, dbpedia:Dog is an entity" [1]
>
> If so, this is the answer that Tadej gave:
>
> "tadej: disambiguation use cases are often used in cases where text is 
> short and lacks context
> ... and computational linguistic community draw a clear distinction 
> between lexical and conceptual meaning" [1]
>
> Perhaps one way to test how strong this requirement is would be to 
> think of use cases where one could assign both lexical and conceptual 
> meaning to the same span.
>
> Cheers,
> Pablo
>
> [1] http://www.w3.org/2012/07/26-mlw-lt-minutes.html
>
>
> On Mon, Aug 20, 2012 at 11:13 AM, Felix Sasaki <fsasaki@w3.org> wrote:
>
>     Hi Sebastian,
>
>     2012/8/20 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
>
>         Hi Felix,
>         your proposal is based on the assumption that more data is
>         available at these three URLs:
>
>         http://nerd.eurecom.fr/ontology#Place
>         http://dbpedia.org/resource/Dublin
>         http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3
>
>         While this assumption is ok for the Semantic Web, I am not
>         sure about the ITS world.
>
>
>
>     You are right that in the "ITS world" one cannot be sure that more
>     data is available. But I would argue that implementors who process
>     links in the ITS world, too, very likely need to know (not
>     automatically, but as a prerequisite for implementation) what the
>     URL is about. So I'd rather encourage implementors towards that
>     "Semantic Web like" approach than define so many attributes.
>
>     Feedback from the people who want to process "disambiguation"
>     without Semantic Web processing is of course very important here.
>
>
>         Furthermore, if you are attempting to minimize it, I would
>         suggest to merge
>         "its-entity-type-ident-ref" into "its-disambig-type-ref". You
>         wouldn't be limited to entity types and could use any of:
>
>
>
>     Makes sense to me, thanks for the proposal - let's see what Tadej
>     and others say.
>
>     Best,
>
>     Felix
>
>
>         - http://nerd.eurecom.fr/ontology#Place
>         - http://dbpedia.org/ontology/Place
>         - http://www.monnet-project.eu/lemon#LexicalSense
>         - http://www.monnet-project.eu/lemon#LexicalEntry
>         - http://wordnet.princeton.edu/wndatamodel#NounWordSense
>         - http://wordnet.princeton.edu/wndatamodel#Synset
>
>         All the best,
>         Sebastian
>
>         On 20.08.2012 09:44, Felix Sasaki wrote:
>
>             Hi Sebastian, all,
>
>             thanks, Sebastian. From what you say in the wiki and in
>             the previous mail,
>             I think one could simplify things a lot.
>
>             The HTML example from Tadej *could* look like this:
>
>             <html lang="en">
>                 <head>
>                    <meta charset="utf-8" />
>                    <title>Entity: Local Test</title>
>                 </head>
>                 <body>
>                     <p><span
>                       its-entity-type-ident-ref="http://nerd.eurecom.fr/ontology#Place"
>                       its-disambig-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span>
>                     is the <span
>                       its-disambig-ident-ref="http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3">capital</span>
>                     of Ireland.</p>
>                 </body>
>             </html>
>
>             That is, no explicit "resource" references for entity type and
>             disambiguation source, and no disambig-type.
>
>             Also, I think one could get rid of adding this kind of
>             information via
>             global rules - I really don't see a use case for that.
>
>             Tadej, others, thoughts? Maybe Yves, as one of the
>             implementors processing the output, and others have
>             some thoughts too?
>
>             Best,
>
>             Felix
>
>             2012/8/17 Sebastian Hellmann
>             <hellmann@informatik.uni-leipzig.de>
>
>                 Dear Felix,
>                 to solve this issue I prepared a page:
>                 http://wiki.nlp2rdf.org/wiki/DBpedia_Spotlight
>
>
>                 It is a rough draft, so there are still many mistakes.
>                 Once it is mature, I will send it to the DBpedia
>                 Spotlight and Apache Stanbol lists to get their feedback.
>                 Note that I don't have a problem with these properties
>                 as XML attributes, where they can naturally occur only
>                 once and encoding an implicit dependency (an attribute
>                 referring to another attribute) is unproblematic. They
>                 are, however, difficult to handle in RDF, even when
>                 declaring them functional.
>                 I will report back if there is any news.
>
>                 All the best,
>                 Sebastian
>
>
>
>
>                 On 14.08.2012 21:34, Felix Sasaki wrote:
>
>                     Hi Sebastian, all,
>
>                     August is taking its toll ... I am wondering if
>                     there are any thoughts on Sebastian's mail below.
>                     It seems that some of the proposed ITS attributes
>                     are not needed, but I don't have the competence to
>                     evaluate this. Thoughts from others? Sebastian,
>                     could you confirm that the output mentioned in
>                     this other thread
>
>                     http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0168.html
>
>
>
>                     is correct for NIF? I then would create a test
>                     case for our test suite,
>                     see
>
>                     http://lists.w3.org/Archives/Public/public-multilingualweb-lt-tests/2012Aug/0003.html
>
>
>
>                     Thanks,
>
>                     Felix
>
>                     On Thursday, 9 August 2012, Sebastian Hellmann wrote:
>
>                       Hi Felix,
>
>                         below is mostly my opinion on this. There is
>                         nothing wrong with including these properties,
>                         but they might not make sense in RDF. If you
>                         think that there are people who would really use
>                         these properties in RDF, then go ahead and
>                         include them. Personally, *I* wouldn't know
>                         what *I* could use them for.
>                         More comments inline.
>
>                         On 09.08.2012 15:20, Felix Sasaki wrote:
>
>                           its:entityTypeSourceRef
>
>                               I really do not find this property helpful.
>
>                         Do you see any sense in saying that
>                         http://dbpedia.org/resource/Dublin is from
>                         http://dbpedia.org ? In the linked data world
>                         http://dbpedia.org/resource/Dublin comes from
>                         http://dbpedia.org/resource/Dublin .
>                         So you might specify a way to convert that to
>                         ITS, but we might not need
>                         an RDF property for this.
>
>                            its:disambigType
>
>                             "(http://www.w3.org/2005/11/its/lexicalConcept|
>                             http://www.w3.org/2005/11/its/ontologyConcept|
>                             http://www.w3.org/2005/11/its/entity)"
>
>                               I am unsure about this one.
>
>                            its:entityTypeRef
>                         is already rdf:type, so it would be a
>                         duplicate to have its:entityTypeRef
>                         in RDF. For
>                         http://dbpedia.org/resource/Dublin
>                         its:entityTypeRef would be one of:
>
>                         http://dbpedia.org/ontology/PopulatedPlace
>                         http://dbpedia.org/ontology/Settlement
>                         http://umbel.org/umbel/rc/PopulatedPlace
>                         http://dbpedia.org/ontology/Place
>                         http://umbel.org/umbel/rc/Village
>                         http://umbel.org/umbel/rc/Location_Underspecified
>                         http://schema.org/Place
>                         http://www.w3.org/2002/07/owl#Thing
>                         http://www.opengis.net/gml/_Feature
>                         +
>                         http://nerd.eurecom.fr/ontology#Place
>
>
>
>                         If you have a problem with this plurality,
>                         then it might be good to
>                         include an annotation property
>                          its:preferedEntityTypeRef.
>                         So the data is already there in RDF; the
>                         problem is rather to find a way
>                         to convert it back to ITS.
>
>                         All the best,
>                         Sebastian
>
>
>
>                         Thanks,
>
>
>                         Felix
>
>                         2012/8/9 Felix Sasaki <fsasaki@w3.org>
>
>                            Thanks for this, Tadej, looks good. There
>                         is just one comment I don't see reflected:
>
>                         7) A question on the data category in general
>                         and the "rules" element:
>                         does it make sense to make some attributes
>                         mandatory? Currently, this would be valid:
>                         <its:disambiguation
>                         selector="/text/body/p[@id='dublin']"/>
>
>
>
>
>                         It seems that all metadata items /
>                         attributes are still optional. Is there
>                         a way to be more specific about what must or
>                         must not appear together, what is optional, etc.?
>
>                         Best,
>
>                         Felix
>
>                         2012/8/9 Tadej Stajner <tadej.stajner@ijs.si>
>
>                              Hi,
>                             thanks for the tips. I covered them, and I
>                         agree with removing the
>                         local XPath, since it has very limited use.
>                         Here is another version incorporating
>                         all these comments.
>                         -- Tadej
>
>                         On 8/3/2012 1:07 PM, Felix Sasaki wrote:
>
>                         Hi Tadej, all,
>
>                             thanks a lot for this. Just a few comments
>                         / questions:
>
>                             1) About "The information applies to the
>                         textual content of the
>                         element, including child elements and
>                         attributes.": wouldn't it make more
>                         sense to say that this applies to only the
>                         content of the element? E.g. if
>                         you annotate the "span" element in
>
>                             <p>I have seen <span id="timbl"><span
>                         class="firstname">Tim</span>
>                         <span
>                         class="lastname">Berners-Lee</span></span>
>                         in the Olympics opening ceremony</p>
>
>                             You want to express disambiguation
>                         information about the "span" element
>                         with the "id" attribute, but not about the
>                         "id" attribute or the nested
>                         span elements. So inheritance probably should
>                         be: "There is no
>                         inheritance". What do you think?
>
>
>                             2) About "The Entity data category can be
>                         expressed with global rules,
>                         or locally on an individual element.": This
>                         should probably be "The
>                         Disambiguation data category can be expressed
>                         with global rules, or
>                         locally
>                         on an individual element."
>
>                             3) About local markup: for other data
>                         categories, we don't have the
>                         "pointer" attributes as local markup, since
>                         processing of XPath in local
>                         markup can be very expensive. So I would
>                         propose to drop the local
>                         pointer
>                         attributes here too.
>
>                             4) In the table at the end, "Global
>                         pointing to existing information"
>                         should be "yes" I think.
>
>                             5) This selector
>                         <its:disambiguation
>                         selector="/text/body/p/#dublin" ...
>                         in XPath should be
>                         <its:disambiguation
>                         selector="/text/body/p[@id='dublin']" ...
>
>
>                             6) To follow the conventions from other
>                         data categories, the
>                         "its:disambiguation" element should probably
>                         be called
>                         "its:disambiguationRule".
>
>                             7) A question on the data category in
>                         general and the "rules" element:
>                         does it make sense to make some attributes
>                         mandatory? Currently, this would be valid:
>                         <its:disambiguation
>                         selector="/text/body/p[@id='dublin']"/>
>
>
>
>                             8) A question to the others in this thread
>                         (Giuseppe, Pablo, Raphael,
>                         Sebastian): is this a representation that
>                         makes sense to you and that your
>                         tools could produce?
>
>                             9) A question to the MT guys: is the way
>                         "entity and disambiguation"
>                         information is represented here useful for you?
>
>                             Best,
>
>                             Felix
>
>                         2012/8/3 Tadej Štajner <tadej.stajner@ijs.si
>                         <mailto:tadej.stajner@ijs.si>>
>
>                            Hi,
>                         I incorporated some comments that 'entity' was
>                         still conflated from
>                         several distinct things in the data category
>                         proposal. Now, we
>                         distinguish
>                         between disambiguation of word sense, ontology
>                         concept and entity
>                         instance.
>                         Following that, it seems that 'Disambiguation'
>                         was the better name for
>                         the
>                         data category.
>
>                         Thanks for everyone's input!
>
>                         -- Tadej
>
>                         On 02. 08. 2012 17:26, Tadej Štajner wrote:
>
>                            Apologies -- wrong link in the previous
>                         mail. This is the relevant one:
>                         http://www.w3.org/International/multilingualweb/lt/track/actions/181
>                         -- Tadej
>
>                         On 02. 08. 2012 17:22, Tadej Štajner wrote:
>
>                         Dipl. Inf. Sebastian Hellmann
>                         Department of Computer Science, University of
>                         Leipzig
>                         Events:
>                             * http://sabre2012.infai.org/mlode (Leipzig,
>                         Sept. 23-24-25, 2012)
>                             * http://wole2012.eurecom.fr (*Deadline:
>                         July 31st 2012*)
>                         Projects: http://nlp2rdf.org , http://dbpedia.org
>                         Homepage:
>                         http://bis.informatik.uni-leipzig.de/SebastianHellmann
>                         Research Group: http://aksw.org
>
>
>
>
>
>
>
>
>
>
>
>
>
>     -- 
>     Felix Sasaki
>     DFKI / W3C Fellow
>
>
>
>
> -- 
> ---
> Pablo N. Mendes
> http://pablomendes.com
> Events: http://wole2012.eurecom.fr
>
Received on Monday, 20 August 2012 10:12:30 UTC
