- From: Tadej Štajner <tadej.stajner@ijs.si>
- Date: Mon, 20 Aug 2012 12:11:38 +0200
- To: "Pablo N. Mendes" <pablomendes@gmail.com>
- CC: Felix Sasaki <fsasaki@w3.org>, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>, "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, "raphael.troncy@eurecom.fr" <raphael.troncy@eurecom.fr>, "Giuseppe.Rizzo@eurecom.fr" <Giuseppe.Rizzo@eurecom.fr>
- Message-ID: <50320D5A.3080700@ijs.si>
Hi Pablo,

correct. The feedback I got was that this distinction is very important, but I can't think of an example for the scenario you mention. Perhaps nested spans, where one is contained within the other: a word is assigned a lexical meaning while the whole phrase is an entity, for example 'agriculture' in 'Ministry of agriculture'.

I think it boils down to this: could this property be reliably inferred from the target itself? For instance, if someone points to http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3, can we expect that it is definitely a case of lexical disambiguation?

-- Tadej

On 20. 08. 2012 11:42, Pablo N. Mendes wrote:
> Hi all,
>
> I would suggest to merge "its-entity-type-ident-ref" into
> "its-disambig-type-ref".
>
> If I understand correctly this is the same proposal I made at the call?
>
> "<pablomendes> wrt. its:disambigType = (word | entity) can't the
> distinction between word and entity be inferred from entityTypeRef?
> e.g. wiktionary:doc is a word, dbpedia:Dog is an entity" [1]
>
> If so, this is the answer that Tadej gave:
>
> "tadej: disambiguation use cases are often used in cases where text is
> short and lacks context
> ... and computational linguistic community draw a clear distinction
> between lexical and conceptual meaning" [1]
>
> Perhaps one way to test how strong this requirement is would be to
> think of use cases where one could assign both lexical and conceptual
> meaning to the same span.
>
> Cheers,
> Pablo
>
> [1] http://www.w3.org/2012/07/26-mlw-lt-minutes.html
>
> On Mon, Aug 20, 2012 at 11:13 AM, Felix Sasaki <fsasaki@w3.org> wrote:
>
> Hi Sebastian,
>
> 2012/8/20 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
>
> Hi Felix,
> your proposal is based on the assumption that more data is
> available at these three URLs:
>
> http://nerd.eurecom.fr/ontology#Place
> http://dbpedia.org/resource/Dublin
> http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3
>
> While this assumption is ok for the Semantic Web, I am not
> sure about the ITS world.
>
> You are right that in the "ITS world" one cannot be sure that more
> data is available. But I would argue that implementors who process
> links also in the ITS world very likely need to know (not
> automatically, but as a prerequisite for implementation) what the
> URL is about. So I'd rather encourage implementors towards that
> "Semantic Web like" approach than defining so many attributes.
>
> Feedback from the people who want to process "disambiguation"
> without Semantic Web processing is of course very important here.
>
> Furthermore, if you are attempting to minimize it, I would
> suggest to merge
> "its-entity-type-ident-ref" into "its-disambig-type-ref". You
> wouldn't be limited to entity types and could use any of:
>
> Makes sense to me, thanks for the proposal - let's see what Tadej
> and others say.
>
> Best,
>
> Felix
>
> - http://nerd.eurecom.fr/ontology#Place
> - http://dbpedia.org/ontology/Place
> - http://www.monnet-project.eu/lemon#LexicalSense
> - http://www.monnet-project.eu/lemon#LexicalEntry
> - http://wordnet.princeton.edu/wndatamodel#NounWordSense
> - http://wordnet.princeton.edu/wndatamodel#Synset
>
> All the best,
> Sebastian
>
> On 20.08.2012 09:44, Felix Sasaki wrote:
>
> Hi Sebastian, all,
>
> thanks, Sebastian. From what you say in the wiki and in the previous mail,
> I think one could simplify things a lot.
>
> The HTML example from Tadej *could* look like this:
>
> <html lang="en">
>   <head>
>     <meta charset="utf-8" />
>     <title>Entity: Local Test</title>
>   </head>
>   <body>
>     <p><span
>       its-entity-type-ident-ref="http://nerd.eurecom.fr/ontology#Place"
>       its-disambig-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span>
>     is the <span
>       its-disambig-ident-ref="http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3">capital</span>
>     of Ireland.</p>
>   </body>
> </html>
>
> That is, no explicit "resource" references for entity type and
> disambiguation source, and no disambig-type.
>
> Also, I think one could get rid of adding this kind of information via
> global rules - I really don't see a use case for that.
>
> Tadej, others, thoughts? Maybe Yves as one of the implementors processing
> the output and others have some thoughts too?
>
> Best,
>
> Felix
>
> 2012/8/17 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
>
> Dear Felix,
> to solve this issue I prepared a page:
> http://wiki.nlp2rdf.org/wiki/DBpedia_Spotlight
>
> It is a rough draft, so there are still many mistakes. Once it is mature,
> I will send it to the DBpedia Spotlight and Apache Stanbol lists to get
> their feedback.
> Note that I don't have a problem with these properties as XML attributes,
> where they can naturally occur only once and encoding an implicit
> dependency (attribute referring to another attribute) is unproblematic. They
> are, however, difficult to handle in RDF, even when declaring them
> functional.
> I will report back if there is any news.
>
> All the best,
> Sebastian
>
> On 14.08.2012 21:34, Felix Sasaki wrote:
>
> Hi Sebastian, all,
>
> August is taking its toll ... I am wondering if there are any thoughts on
> Sebastian's mail below. It seems that some of the proposed ITS attributes
> are not needed, but I don't have the competence to evaluate this. Thoughts
> from others? Sebastian, could you confirm that the output mentioned in
> this other thread
>
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0168.html
>
> is correct for NIF? I then would create a test case for our test suite,
> see
>
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt-tests/2012Aug/0003.html
>
> Thanks,
>
> Felix
>
> On Thursday, 9 August 2012, Sebastian Hellmann wrote:
>
> Hi Felix,
>
> below mostly my opinion on this. Nothing wrong with including these
> properties, but they might not make sense in RDF. If you think that there
> are people who would really use these properties in RDF, then go ahead and
> include them. Personally, *I* wouldn't know for what *I* could use them.
> More comments inline.
>
> On 09.08.2012 15:20, Felix Sasaki wrote:
>
> its:entityTypeSourceRef
>
> I really do not find this property helpful.
>
> Do you see any sense in saying that http://dbpedia.org/resource/Dublin
> is from http://dbpedia.org ? In the linked data world
> http://dbpedia.org/resource/Dublin comes from
> http://dbpedia.org/resource/Dublin.
> So you might specify a way to convert that to ITS, but we might not need
> an RDF property for this.
>
> its:disambigType
>
> "(http://www.w3.org/2005/11/its/lexicalConcept|
> http://www.w3.org/2005/11/its/ontologyConcept|
> http://www.w3.org/2005/11/its/entity)"
>
> I am unsure about this one.
>
> its:entityTypeRef
> is already rdf:type, so it would be a duplicate to have its:entityTypeRef
> in RDF. For http://dbpedia.org/resource/Dublin
> its:entityTypeRef would be one of:
>
> http://dbpedia.org/ontology/PopulatedPlace
> http://dbpedia.org/ontology/Settlement
> http://umbel.org/umbel/rc/PopulatedPlace
> http://dbpedia.org/ontology/Place
> http://umbel.org/umbel/rc/Village
> http://umbel.org/umbel/rc/Location_Underspecified
> http://schema.org/Place
> http://www.w3.org/2002/07/owl#Thing
> http://www.opengis.net/gml/_Feature
> +
> http://nerd.eurecom.fr/ontology#Place
>
> If you have a problem with this plurality, then it might be good to
> include an annotation property its:preferedEntityTypeRef.
> So the data is there already in RDF, the problem is rather to find a way
> to convert it back to ITS.
>
> All the best,
> Sebastian
>
> Thanks,
>
> Felix
>
> 2012/8/9 Felix Sasaki <fsasaki@w3.org>
>
> Thanks for this, Tadej, looks good. There is just one comment I don't see
> reflected:
>
> 7) A question on the data category in general and the "rules" element:
> does it make sense to make some attributes mandatory? Currently, this would
> be valid:
> <its:disambiguation selector="/text/body/p[@id='dublin']"/>
>
> It seems that still all metadata items / attributes are optional. Is there
> a way to be more specific about what must or must not appear together, what
> is optional etc?
>
> Best,
>
> Felix
>
> 2012/8/9 Tadej Stajner <tadej.stajner@ijs.si>
>
> Hi,
> thanks for the tips. I covered them, and I agree with removing the
> local XPath, since it has very limited use. Here is another version
> incorporating all these comments.
> -- Tadej
>
> On 8/3/2012 1:07 PM, Felix Sasaki wrote:
>
> Hi Tadej, all,
>
> thanks a lot for this. Just a few comments / questions:
>
> 1) About "The information applies to the textual content of the
> element, including child elements and attributes.": wouldn't it make more
> sense to say that this applies to only the content of the element? E.g. if
> you annotate the "span" element in
>
> <p>I have seen <span id="timbl"><span class="firstame">Tim</span>
> <span class="lastname">Berners-Lee</span></span> in the olympics opening
> ceremony</p>
>
> You want to express disambiguation information about the "span" element
> with the "id" attribute, but not about the "id" attribute or the nested
> span elements. So inheritance probably should be: "There is no
> inheritance". What do you think?
>
> 2) About "The Entity data category can be expressed with global rules,
> or locally on an individual element.": This should probably be "The
> Disambiguation data category can be expressed with global rules, or locally
> on an individual element."
>
> 3) About local markup: for other data categories, we don't have the
> "pointer" attributes as local markup, since processing of XPath in local
> markup can be very expensive. So I would propose to drop the local pointer
> attributes here too.
>
> 4) In the table at the end, "Global pointing to existing information"
> should be "yes" I think.
>
> 5) This selector
> <its:disambiguation selector="/text/body/p/#dublin" ...
> In XPath should be
> <its:disambiguation selector="/text/body/p[@id='dublin']
>
> 6) To follow the conventions from other data categories, the
> "its:disambiguation" element should probably be called
> "its:disambiguationRule".
>
> 7) A question on the data category in general and the "rules" element:
> does it make sense to make some attributes mandatory? Currently, this would
> be valid:
> <its:disambiguation selector="/text/body/p[@id='dublin']"/>
>
> 8) A question to the others in this thread (Giuseppe, Pablo, Raphael,
> Sebastian): is this a representation that makes sense to you and that your
> tools could produce?
>
> 9) A question to the MT guys: is the way "entity and disambiguation"
> information is represented here useful for you?
>
> Best,
>
> Felix
>
> 2012/8/3 Tadej Štajner <tadej.stajner@ijs.si>
>
> Hi,
> I incorporated some comments that 'entity' was still conflated from
> several distinct things in the data category proposal. Now, we distinguish
> between disambiguation of word sense, ontology concept and entity instance.
> Following that, it seems that 'Disambiguation' was the better name for the
> data category.
>
> Thanks for everyone's input!
>
> -- Tadej
>
> On 02. 08. 2012 17:26, Tadej Štajner wrote:
>
> Apologies -- wrong link on the previous mail. This is the relevant one:
> http://www.w3.org/International/multilingualweb/lt/track/actions/181
> -- Tadej
>
> On 02. 08. 2012 17:22, Tadej Štajner wrote:
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Events:
> * http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
> * http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
> Projects: http://nlp2rdf.org , http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
>
> --
> ---
> Pablo N. Mendes
> http://pablomendes.com
> Events: http://wole2012.eurecom.fr
Received on Monday, 20 August 2012 10:12:30 UTC
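
The question raised in the reply above, whether the disambiguation type could be inferred from the target URI alone, can be illustrated with a small sketch. This is only an illustration under stated assumptions: the prefix table and the function name `infer_disambig_kind` are hypothetical, and the mapping of each namespace to one of the three its:disambigType values quoted in the thread (lexicalConcept, ontologyConcept, entity) is a guess, not something the ITS drafts define.

```python
from typing import Optional

# Hypothetical prefix table. The namespaces appear in the thread, but the
# kind assigned to each one is an assumption made for illustration only.
KIND_BY_PREFIX = {
    "http://www.w3.org/2006/03/wn/wn20/instances/": "lexicalConcept",
    "http://dbpedia.org/resource/": "entity",
    "http://dbpedia.org/ontology/": "ontologyConcept",
    "http://nerd.eurecom.fr/ontology#": "ontologyConcept",
}


def infer_disambig_kind(target: str) -> Optional[str]:
    """Guess the disambiguation kind from a target URI.

    Returns None when the URI matches no known prefix, i.e. when the
    kind cannot be inferred from the target itself.
    """
    target = target.strip()
    for prefix, kind in KIND_BY_PREFIX.items():
        if target.startswith(prefix):
            return kind
    return None
```

A consumer that only knows a fixed set of namespaces gets `None` for every other target, which is one way to see why an explicit attribute might still be wanted by implementors who do not dereference the URI.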