- From: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- Date: Mon, 20 Aug 2012 10:37:32 +0200
- To: Felix Sasaki <fsasaki@w3.org>
- CC: Tadej Stajner <tadej.stajner@ijs.si>, "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, "raphael.troncy@eurecom.fr" <raphael.troncy@eurecom.fr>, "pablomendes@gmail.com" <pablomendes@gmail.com>, "Giuseppe.Rizzo@eurecom.fr" <Giuseppe.Rizzo@eurecom.fr>
Hi Felix,

your proposal assumes that more data is available at these three URLs:

http://nerd.eurecom.fr/ontology#Place
http://dbpedia.org/resource/Dublin
http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3

While this assumption is fine for the Semantic Web, I am not sure it holds in the ITS world.

Furthermore, if you are trying to keep the markup minimal, I would suggest merging "its-entity-type-ident-ref" into "its-disambig-type-ref". You would then not be limited to entity types and could use any of:

- http://nerd.eurecom.fr/ontology#Place
- http://dbpedia.org/ontology/Place
- http://www.monnet-project.eu/lemon#LexicalSense
- http://www.monnet-project.eu/lemon#LexicalEntry
- http://wordnet.princeton.edu/wndatamodel#NounWordSense
- http://wordnet.princeton.edu/wndatamodel#Synset

All the best,
Sebastian

On 20.08.2012 09:44, Felix Sasaki wrote:
> Hi Sebastian, all,
>
> thanks, Sebastian. From what you say in the wiki and in the previous mail,
> I think one could simplify things a lot.
>
> The HTML example from Tadej *could* look like this:
>
> <html lang="en">
> <head>
> <meta charset="utf-8" />
> <title>Entity: Local Test</title>
> </head>
> <body>
> <p><span
> its-entity-type-ident-ref="http://nerd.eurecom.fr/ontology#Place"
> its-disambig-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span>
> is the <span
> its-disambig-ident-ref="http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3">capital</span>
> of Ireland.</p>
> </body>
> </html>
>
> That is, no explicit "resource" references for entity type and
> disambiguation source, and no disambig-type.
>
> Also, I think one could get rid of adding this kind of information via
> global rules - I really don't see a use case for that.
>
> Tadej, others, thoughts? Maybe Yves, as one of the implementors processing
> the output, and others have some thoughts too?
>
> Best,
>
> Felix
>
> 2012/8/17 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
>
>> Dear Felix,
>> to solve this issue I prepared a page:
>> http://wiki.nlp2rdf.org/wiki/DBpedia_Spotlight
>> It is a rough draft, so it still contains many mistakes. Once it is mature,
>> I will send it to the DBpedia Spotlight and Apache Stanbol lists to get
>> their feedback.
>> Note that I don't have a problem with these properties as XML attributes,
>> where they can naturally occur only once and encoding an implicit
>> dependency (an attribute referring to another attribute) is unproblematic.
>> They are, however, difficult to handle in RDF, even when declared
>> functional.
>> I will report back if there is any news.
>>
>> All the best,
>> Sebastian
>>
>> On 14.08.2012 21:34, Felix Sasaki wrote:
>>
>>> Hi Sebastian, all,
>>>
>>> August is taking its toll ... I am wondering if there are any thoughts on
>>> Sebastian's mail below. It seems that some of the proposed ITS attributes
>>> are not needed, but I don't have the competence to evaluate this. Thoughts
>>> from others? Sebastian, could you confirm that the output mentioned in
>>> this other thread
>>>
>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0168.html
>>>
>>> is correct for NIF? I would then create a test case for our test suite,
>>> see
>>>
>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt-tests/2012Aug/0003.html
>>>
>>> Thanks,
>>>
>>> Felix
>>>
>>> On Thursday, 9 August 2012, Sebastian Hellmann wrote:
>>>
>>>> Hi Felix,
>>>> below is mostly my opinion on this. There is nothing wrong with including
>>>> these properties, but they might not make sense in RDF.
>>>> If you think that there
>>>> are people who would really use these properties in RDF, then go ahead and
>>>> include them. Personally, *I* wouldn't know what *I* could use them for.
>>>> More comments inline.
>>>>
>>>> On 09.08.2012 15:20, Felix Sasaki wrote:
>>>>
>>>> its:entityTypeSourceRef
>>>>> I really do not find this property helpful.
>>>> Do you see any sense in saying that http://dbpedia.org/resource/Dublin
>>>> is from http://dbpedia.org ? In the linked data world,
>>>> http://dbpedia.org/resource/Dublin comes from
>>>> http://dbpedia.org/resource/Dublin itself.
>>>> So you might specify a way to convert that to ITS, but we might not need
>>>> an RDF property for this.
>>>>
>>>> its:disambigType
>>>>
>>>>> "(http://www.w3.org/2005/11/its/lexicalConcept|http://www.w3.org/2005/11/its/ontologyConcept|http://www.w3.org/2005/11/its/entity)"
>>>>>
>>>>> I am unsure about this one.
>>>> its:entityTypeRef
>>>> is already rdf:type, so it would be a duplicate to have its:entityTypeRef
>>>> in RDF.
>>>> For http://dbpedia.org/resource/Dublin
>>>> its:entityTypeRef would be one of:
>>>> http://dbpedia.org/ontology/PopulatedPlace
>>>> http://dbpedia.org/ontology/Settlement
>>>> http://umbel.org/umbel/rc/PopulatedPlace
>>>> http://dbpedia.org/ontology/Place
>>>> http://umbel.org/umbel/rc/Village
>>>> http://umbel.org/umbel/rc/Location_Underspecified
>>>> http://schema.org/Place
>>>> http://www.w3.org/2002/07/owl#Thing
>>>> http://www.opengis.net/gml/_Feature
>>>> +
>>>> http://nerd.eurecom.fr/ontology#Place
>>>>
>>>> If you have a problem with this plurality, then it might be good to
>>>> include an annotation property its:preferedEntityTypeRef.
>>>> So the data is already there in RDF; the problem is rather to find a way
>>>> to convert it back to ITS.
>>>>
>>>> All the best,
>>>> Sebastian
>>>>
>>>> Thanks,
>>>>
>>>> Felix
>>>>
>>>> 2012/8/9 Felix Sasaki <fsasaki@w3.org>
>>>>
>>>> Thanks for this, Tadej, looks good. There is just one comment I don't see
>>>> reflected:
>>>>
>>>> 7) A question on the data category in general and the "rules" element:
>>>> does it make sense to make some attributes mandatory? Currently, this
>>>> would be valid:
>>>> <its:disambiguation selector="/text/body/p[@id='dublin']"/>
>>>>
>>>> It seems that all metadata items / attributes are still optional. Is there
>>>> a way to be more specific about what must or must not appear together,
>>>> what is optional etc.?
>>>>
>>>> Best,
>>>>
>>>> Felix
>>>>
>>>> 2012/8/9 Tadej Stajner <tadej.stajner@ijs.si>
>>>>
>>>> Hi,
>>>> thanks for the tips. I covered them, and I agree with removing the
>>>> local XPath, since it has very limited use. Here is another version
>>>> incorporating all these comments.
>>>> -- Tadej
>>>>
>>>> On 8/3/2012 1:07 PM, Felix Sasaki wrote:
>>>>
>>>> Hi Tadej, all,
>>>>
>>>> thanks a lot for this. Just a few comments / questions:
>>>>
>>>> 1) About "The information applies to the textual content of the
>>>> element, including child elements and attributes.": wouldn't it make more
>>>> sense to say that this applies only to the content of the element? E.g. if
>>>> you annotate the "span" element in
>>>>
>>>> <p>I have seen <span id="timbl"><span class="firstname">Tim</span> <span
>>>> class="lastname">Berners-Lee</span></span> in the Olympics opening
>>>> ceremony</p>
>>>>
>>>> you want to express disambiguation information about the "span" element
>>>> with the "id" attribute, but not about the "id" attribute or the nested
>>>> span elements.
>>>> So inheritance should probably be: "There is no
>>>> inheritance". What do you think?
>>>>
>>>> 2) About "The Entity data category can be expressed with global rules,
>>>> or locally on an individual element.": This should probably be "The
>>>> Disambiguation data category can be expressed with global rules, or
>>>> locally on an individual element."
>>>>
>>>> 3) About local markup: for other data categories, we don't have the
>>>> "pointer" attributes as local markup, since processing of XPath in local
>>>> markup can be very expensive. So I would propose to drop the local
>>>> pointer attributes here too.
>>>>
>>>> 4) In the table at the end, "Global pointing to existing information"
>>>> should be "yes", I think.
>>>>
>>>> 5) This selector
>>>> <its:disambiguation selector="/text/body/p/#dublin" ...
>>>> in XPath should be
>>>> <its:disambiguation selector="/text/body/p[@id='dublin']" ...
>>>>
>>>> 6) To follow the conventions from other data categories, the
>>>> "its:disambiguation" element should probably be called
>>>> "its:disambiguationRule".
>>>>
>>>> 7) A question on the data category in general and the "rules" element:
>>>> does it make sense to make some attributes mandatory? Currently, this
>>>> would be valid:
>>>> <its:disambiguation selector="/text/body/p[@id='dublin']"/>
>>>>
>>>> 8) A question to the others in this thread (Giuseppe, Pablo, Raphael,
>>>> Sebastian): is this a representation that makes sense to you and that
>>>> your tools could produce?
>>>>
>>>> 9) A question to the MT guys: is the way "entity and disambiguation"
>>>> information is represented here useful for you?
>>>>
>>>> Best,
>>>>
>>>> Felix
>>>>
>>>> 2012/8/3 Tadej Štajner <tadej.stajner@ijs.si>
>>>>
>>>> Hi,
>>>> I incorporated some comments noting that 'entity' still conflated
>>>> several distinct things in the data category proposal.
>>>> Now, we distinguish
>>>> between disambiguation of word sense, ontology concept and entity
>>>> instance.
>>>> Following that, it seems that 'Disambiguation' was the better name for the
>>>> data category.
>>>>
>>>> Thanks for everyone's input!
>>>>
>>>> -- Tadej
>>>>
>>>> On 02. 08. 2012 17:26, Tadej Štajner wrote:
>>>>
>>>> Apologies -- wrong link on the previous mail. This is the relevant one:
>>>> http://www.w3.org/International/multilingualweb/lt/track/actions/181
>>>> -- Tadej
>>>>
>>>> On 02. 08. 2012 17:22, Tadej Štajner wrote:
>>>>
>>>> Dipl. Inf. Sebastian Hellmann
>>>> Department of Computer Science, University of Leipzig
>>>> Events:
>>>> * http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
>>>> * http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
>>>> Projects: http://nlp2rdf.org , http://dbpedia.org
>>>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>>>> Research Group: http://aksw.org
>>>>
>> --
>> Dipl. Inf. Sebastian Hellmann
>> Department of Computer Science, University of Leipzig
>> Events:
>> * http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
>> * http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
>> Projects: http://nlp2rdf.org , http://dbpedia.org
>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>> Research Group: http://aksw.org
>>
>
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Events:
* http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
* http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
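
To make the simplified local markup from Felix's quoted HTML example more concrete, here is a rough sketch of how a consumer might read it, using only Python's standard html.parser module. The names ItsAttributeCollector and SAMPLE are made up for this illustration. The sample is retyped from the quoted example, with the line-wrapped attribute value joined onto one line, and nothing in the code is prescribed by ITS itself.

```python
from html.parser import HTMLParser

# Retyped from the HTML example quoted in the thread (assumption: the wrapped
# attribute value belongs on a single line).
SAMPLE = """<html lang="en">
<head>
<meta charset="utf-8" />
<title>Entity: Local Test</title>
</head>
<body>
<p><span
 its-entity-type-ident-ref="http://nerd.eurecom.fr/ontology#Place"
 its-disambig-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span>
is the <span
 its-disambig-ident-ref="http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3">capital</span>
of Ireland.</p>
</body>
</html>"""


class ItsAttributeCollector(HTMLParser):
    """Hypothetical helper: pair each annotated <span> with its its-* attributes."""

    def __init__(self):
        super().__init__()
        self.annotations = []  # finished (text, attributes) pairs
        self._open = []        # stack of (attributes, text parts) for open spans

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        # Keep only attributes with the its- prefix and a non-empty value.
        its = {name: value.strip() for name, value in attrs
               if name.startswith("its-") and value}
        self._open.append((its, []))

    def handle_data(self, data):
        # Text belongs to every span that is currently open.
        for _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._open:
            its, parts = self._open.pop()
            if its:  # ignore spans that carry no its-* attributes
                self.annotations.append(("".join(parts).strip(), its))


collector = ItsAttributeCollector()
collector.feed(SAMPLE)
collector.close()

for text, its in collector.annotations:
    print(text, its)
```

With the merge suggested above, the same loop would simply see one attribute name instead of two; the collector does not depend on which its-* names are used.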
Received on Monday, 20 August 2012 08:38:01 UTC
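
Point 5 in the quoted thread replaces the selector /text/body/p/#dublin with the XPath /text/body/p[@id='dublin']. The following is a minimal sketch of what the corrected selector matches, using Python's xml.etree.ElementTree. The DOC fragment is a hypothetical document invented to fit the selector, and because ElementTree supports only a subset of XPath and evaluates paths relative to an element, the leading /text step is written as "." on the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped to fit the selector discussed in the thread.
DOC = """<text>
  <body>
    <p id="dublin">Dublin is the capital of Ireland.</p>
    <p id="other">Another paragraph.</p>
  </body>
</text>"""

root = ET.fromstring(DOC)  # root is the <text> element

# Equivalent of /text/body/p[@id='dublin'], expressed relative to <text>.
matches = root.findall("./body/p[@id='dublin']")

for p in matches:
    print(p.get("id"), "->", p.text)
```

Only the paragraph whose id attribute equals "dublin" is selected; the "#dublin" form is CSS-style fragment syntax rather than an XPath predicate.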