- From: Felix Sasaki <fsasaki@w3.org>
- Date: Mon, 20 Aug 2012 09:44:18 +0200
- To: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- Cc: Tadej Stajner <tadej.stajner@ijs.si>, "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, "raphael.troncy@eurecom.fr" <raphael.troncy@eurecom.fr>, "pablomendes@gmail.com" <pablomendes@gmail.com>, "Giuseppe.Rizzo@eurecom.fr" <Giuseppe.Rizzo@eurecom.fr>
- Message-ID: <CAL58czowqvdwjHN+GTjZQ8RS5ac+NzyOB0=tuBNnKGaUytmNQg@mail.gmail.com>
Hi Sebastian, all,

thanks, Sebastian. From what you say in the wiki and in the previous mail, I think one could simplify things a lot. The HTML example from Tadej *could* look like this:

<html lang="en">
<head>
<meta charset="utf-8" />
<title>Entity: Local Test</title>
</head>
<body>
<p><span its-entity-type-ident-ref="http://nerd.eurecom.fr/ontology#Place" its-disambig-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span> is the <span its-disambig-ident-ref="http://www.w3.org/2006/03/wn/wn20/instances/wordsense-capital-noun-3">capital</span> of Ireland.</p>
</body>
</html>

That is, no explicit "resource" references for entity type and disambiguation source, and no disambig-type. Also, I think one could get rid of adding this kind of information via global rules - I really don't see a use case for that.

Tadej, others, thoughts? Maybe Yves, as one of the implementors processing the output, and others have some thoughts too?

Best,

Felix

2012/8/17 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>

> Dear Felix,
> to solve this issue I prepared a page:
> http://wiki.nlp2rdf.org/wiki/DBpedia_Spotlight
> It is a rough draft, so there are still many mistakes. Once it is mature,
> I will send it to the DBpedia Spotlight and Apache Stanbol lists to get
> their feedback.
> Note that I don't have a problem with these properties as XML attributes,
> where they can naturally occur only once and encoding an implicit
> dependency (attribute referring to another attribute) is unproblematic. They
> are, however, difficult to handle in RDF, even when declaring them
> functional.
> I will report back if there is any news.
>
> All the best,
> Sebastian
>
> On 14.08.2012 21:34, Felix Sasaki wrote:
>
>> Hi Sebastian, all,
>>
>> August is taking its toll ... I am wondering if there are any thoughts on
>> Sebastian's mail below.
>> It seems that some of the proposed ITS attributes
>> are not needed, but I don't have the competence to evaluate this. Thoughts
>> from others? Sebastian, could you confirm that the output mentioned in
>> this other thread
>>
>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0168.html
>>
>> is correct for NIF? I then would create a test case for our test suite,
>> see
>>
>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt-tests/2012Aug/0003.html
>>
>> Thanks,
>>
>> Felix
>>
>> On Thursday, 9 August 2012, Sebastian Hellmann wrote:
>>
>>> Hi Felix,
>>> below is mostly my opinion on this. There is nothing wrong with including
>>> these properties, but they might not make sense in RDF. If you think that
>>> there are people who would really use these properties in RDF, then go
>>> ahead and include them. Personally, *I* wouldn't know what *I* could use
>>> them for. More comments inline.
>>>
>>> On 09.08.2012 15:20, Felix Sasaki wrote:
>>>
>>>> its:entityTypeSourceRef
>>>
>>> I really do not find this property helpful.
>>> Do you see any sense in saying that http://dbpedia.org/resource/Dublin is
>>> from http://dbpedia.org ? In the linked data world,
>>> http://dbpedia.org/resource/Dublin comes from
>>> http://dbpedia.org/resource/Dublin.
>>> So you might specify a way to convert that to ITS, but we might not need
>>> an RDF property for this.
>>>
>>>> its:disambigType
>>>> "(http://www.w3.org/2005/11/its/lexicalConcept|http://www.w3.org/2005/11/its/ontologyConcept|http://www.w3.org/2005/11/its/entity)"
>>>
>>> I am unsure about this one.
>>>
>>> its:entityTypeRef
>>> is already rdf:type, so it would be a duplicate to have its:entityTypeRef
>>> in RDF. For http://dbpedia.org/resource/Dublin, its:entityTypeRef would be
>>> one of:
>>> http://dbpedia.org/ontology/PopulatedPlace
>>> http://dbpedia.org/ontology/Settlement
>>> http://umbel.org/umbel/rc/PopulatedPlace
>>> http://dbpedia.org/ontology/Place
>>> http://umbel.org/umbel/rc/Village
>>> http://umbel.org/umbel/rc/Location_Underspecified
>>> http://schema.org/Place
>>> http://www.w3.org/2002/07/owl#Thing
>>> http://www.opengis.net/gml/_Feature
>>> +
>>> http://nerd.eurecom.fr/ontology#Place
>>>
>>> If you have a problem with this plurality, then it might be good to
>>> include an annotation property its:preferedEntityTypeRef.
>>> So the data is there already in RDF; the problem is rather to find a way
>>> to convert it back to ITS.
>>>
>>> All the best,
>>> Sebastian
>>>
>>> Thanks,
>>>
>>> Felix
>>>
>>> 2012/8/9 Felix Sasaki <fsasaki@w3.org>
>>>
>>> Thanks for this, Tadej, looks good. There is just one comment I don't see
>>> reflected:
>>>
>>> 7) A question on the data category in general and the "rules" element:
>>> does it make sense to make some attributes mandatory? Currently, this
>>> would be valid:
>>> <its:disambiguation selector="/text/body/p[@id='dublin']"/>
>>>
>>> It seems that all metadata items / attributes are still optional. Is there
>>> a way to be more specific about what must or must not appear together,
>>> what is optional etc?
>>>
>>> Best,
>>>
>>> Felix
>>>
>>> 2012/8/9 Tadej Stajner <tadej.stajner@ijs.si>
>>>
>>> Hi,
>>> thanks for the tips. I covered them, and I agree with removing the
>>> local XPath, since it has very limited use. Here is another incorporating
>>> all these comments.
>>> -- Tadej
>>>
>>> On 8/3/2012 1:07 PM, Felix Sasaki wrote:
>>>
>>> Hi Tadej, all,
>>>
>>> thanks a lot for this.
>>> Just a few comments / questions:
>>>
>>> 1) About "The information applies to the textual content of the
>>> element, including child elements and attributes.": wouldn't it make more
>>> sense to say that this applies to only the content of the element? E.g. if
>>> you annotate the "span" element in
>>>
>>> <p>I have seen <span id="timbl"><span class="firstname">Tim</span>
>>> <span class="lastname">Berners-Lee</span></span> in the olympics opening
>>> ceremony</p>
>>>
>>> You want to express disambiguation information about the "span" element
>>> with the "id" attribute, but not about the "id" attribute or the nested
>>> span elements. So inheritance probably should be: "There is no
>>> inheritance". What do you think?
>>>
>>> 2) About "The Entity data category can be expressed with global rules,
>>> or locally on an individual element.": This should probably be "The
>>> Disambiguation data category can be expressed with global rules, or
>>> locally on an individual element."
>>>
>>> 3) About local markup: for other data categories, we don't have the
>>> "pointer" attributes as local markup, since processing of XPath in local
>>> markup can be very expensive. So I would propose to drop the local
>>> pointer attributes here too.
>>>
>>> 4) In the table at the end, "Global pointing to existing information"
>>> should be "yes" I think.
>>>
>>> 5) This selector
>>> <its:disambiguation selector="/text/body/p/#dublin" ...
>>> In XPath should be
>>> <its:disambiguation selector="/text/body/p[@id='dublin']"
>>>
>>> 6) To follow the conventions from other data categories, the
>>> "its:disambiguation" element should probably be called
>>> "its:disambiguationRule".
>>>
>>> 7) A question on the data category in general and the "rules" element:
>>> does it make sense to make some attributes mandatory?
>>> Currently, this would be valid:
>>> <its:disambiguation selector="/text/body/p[@id='dublin']"/>
>>>
>>> 8) A question to the others in this thread (Giuseppe, Pablo, Raphael,
>>> Sebastian): is this a representation that makes sense to you and that
>>> your tools could produce?
>>>
>>> 9) A question to the MT guys: is the way "entity and disambiguation"
>>> information is represented here useful for you?
>>>
>>> Best,
>>>
>>> Felix
>>>
>>> 2012/8/3 Tadej Štajner <tadej.stajner@ijs.si>
>>>
>>> Hi,
>>> I incorporated some comments that 'entity' was still conflated from
>>> several distinct things in the data category proposal. Now, we
>>> distinguish between disambiguation of word sense, ontology concept and
>>> entity instance. Following that, it seems that 'Disambiguation' was the
>>> better name for the data category.
>>>
>>> Thanks for everyone's input!
>>>
>>> -- Tadej
>>>
>>> On 02. 08. 2012 17:26, Tadej Štajner wrote:
>>>
>>> Apologies -- wrong link on the previous mail. This is the relevant one:
>>> http://www.w3.org/International/multilingualweb/lt/track/actions/181
>>>
>>> -- Tadej
>>>
>>> On 02. 08. 2012 17:22, Tadej Štajner wrote:
>>>
>>> Dipl. Inf. Sebastian Hellmann
>>> Department of Computer Science, University of Leipzig
>>> Events:
>>> * http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
>>> * http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
>>> Projects: http://nlp2rdf.org , http://dbpedia.org
>>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>>> Research Group: http://aksw.org
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Events:
> * http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
> * http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
> Projects: http://nlp2rdf.org , http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org

--
Felix Sasaki
DFKI / W3C Fellow
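
For implementors wondering what processing the simplified local markup might involve, here is a rough sketch in Python. It is not taken from any existing ITS tool. The class name ItsSpanCollector and the function collect_its_annotations are made up for illustration, and the attribute names are only the ones proposed in this message (nothing here is final). The URIs are written with "http://" and "wordsense", which is an assumption about the intended spelling. The sketch collects the text of each span together with its its-* attributes, and it attaches nothing to nested spans that carry no its-* attributes of their own, in line with the "no inheritance" suggestion in the quoted thread.

```python
from html.parser import HTMLParser

# Example document modelled on the simplified markup proposed in this message.
EXAMPLE = """<html lang="en">
<head><meta charset="utf-8" /><title>Entity: Local Test</title></head>
<body>
<p><span
  its-entity-type-ident-ref="http://nerd.eurecom.fr/ontology#Place"
  its-disambig-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span>
is the <span
  its-disambig-ident-ref="http://www.w3.org/2006/03/wn/wn20/instances/wordsense-capital-noun-3">capital</span>
of Ireland.</p>
</body>
</html>"""


class ItsSpanCollector(HTMLParser):
    """Hypothetical helper: gathers (text, its-* attributes) for each span."""

    def __init__(self):
        super().__init__()
        self.annotations = []  # finished (text, attrs) pairs, in closing order
        self._open = []        # stack of spans that are currently open

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        # Keep only the local ITS attributes; strip stray whitespace in values.
        its = {name: (value or "").strip()
               for name, value in attrs if name.startswith("its-")}
        self._open.append({"attrs": its, "text": []})

    def handle_data(self, data):
        # Text belongs to every span that is open at this point.
        for entry in self._open:
            entry["text"].append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._open:
            entry = self._open.pop()
            # Spans without its-* attributes produce no annotation.
            if entry["attrs"]:
                self.annotations.append(("".join(entry["text"]), entry["attrs"]))


def collect_its_annotations(html):
    """Return a list of (span text, dict of its-* attributes)."""
    collector = ItsSpanCollector()
    collector.feed(html)
    collector.close()
    return collector.annotations


if __name__ == "__main__":
    for text, attrs in collect_its_annotations(EXAMPLE):
        print(text, attrs)
```

Only the standard library is used, so the sketch stays independent of any particular ITS implementation. A real consumer would presumably also have to handle global rules, if those are kept.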
Received on Monday, 20 August 2012 07:44:44 UTC