
Re: [all] Call for consensus on disambiguation - feedback integrated [ACTION-181]

From: Felix Sasaki <fsasaki@w3.org>
Date: Fri, 3 Aug 2012 13:07:56 +0200
Message-ID: <CAL58czqPkQzh7_AXdf65KnKiV5RqBD=5A8rhhS86A-NS=qDwkA@mail.gmail.com>
To: Tadej Štajner <tadej.stajner@ijs.si>
Cc: public-multilingualweb-lt@w3.org, raphael.troncy@eurecom.fr, pablomendes@gmail.com, Giuseppe.Rizzo@eurecom.fr, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>

Hi Tadej, all,

thanks a lot for this. Just a few comments / questions:

1) About "The information applies to the textual content of the element,
including child elements and attributes.": wouldn't it make more sense to
say that this applies to only the content of the element? E.g. if you
annotate the "span" element in

<p>I have seen <span id="timbl"><span class="firstname">Tim</span> <span
class="lastname">Berners-Lee</span></span> at the Olympics opening
ceremony</p>

You want to express disambiguation information about the "span" element
with the "id" attribute, but not about the "id" attribute or the nested
span elements. So inheritance probably should be: "There is no
inheritance". What do you think?
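
To make the "no inheritance" reading concrete, here is a minimal sketch in
Python. It is not part of the proposal: the helper name
annotate_without_inheritance and the value "entity:Tim_Berners-Lee" are
hypothetical, and ElementTree is used only as a convenient XML parser. The
annotation is attached to the selected "span" alone; its text content is still
read as a whole, but the nested spans carry no annotation of their own.

```python
import xml.etree.ElementTree as ET

# The paragraph from the example above, as well-formed XML.
DOC = (
    '<p>I have seen <span id="timbl">'
    '<span class="firstname">Tim</span> '
    '<span class="lastname">Berners-Lee</span>'
    '</span> at the Olympics opening ceremony</p>'
)


def annotate_without_inheritance(root, path, value):
    """Map every element to `value` if it is selected by `path`, else to None.

    Descendants of a selected element are deliberately NOT annotated,
    which is what "There is no inheritance" would mean in practice.
    """
    selected = set(root.findall(path))
    return {el: (value if el in selected else None) for el in root.iter()}


root = ET.fromstring(DOC)

# Hypothetical disambiguation value, used only for illustration.
annotations = annotate_without_inheritance(
    root, ".//span[@id='timbl']", "entity:Tim_Berners-Lee"
)

outer = root.find(".//span[@id='timbl']")

# The annotated text is the full textual content of the selected element.
annotated_text = "".join(outer.itertext())
```

With this reading, a consumer looks up the annotation on the outer "span" and
never on the "firstname" or "lastname" spans individually.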


2) About "The Entity data category can be expressed with global rules, or
locally on an individual element.": This should probably be "The
Disambiguation data category can be expressed with global rules, or locally
on an individual element."

3) About local markup: for other data categories, we don't have the
"pointer" attributes as local markup, since processing of XPath in local
markup can be very expensive. So I would propose to drop the local pointer
attributes here too.

4) In the table at the end, "Global pointing to existing information"
should be "yes" I think.

5) This selector
<its:disambiguation selector="/text/body/p/#dublin" ...
is not valid XPath. It should be
<its:disambiguation selector="/text/body/p[@id='dublin']" ...
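
As a quick check of the predicate form, the following Python sketch selects
the paragraph by its "id" attribute. The sample document is made up for
illustration; only the /text/body/p shape comes from the selector. Note that
ElementTree implements just a subset of XPath and expects a path relative to
the root element, so the absolute path is rewritten as "./body/p[...]" here.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only its /text/body/p structure mirrors the selector.
DOC = """<text>
  <body>
    <p id="galway">Galway is on the west coast.</p>
    <p id="dublin">Dublin is the capital of Ireland.</p>
  </body>
</text>"""

root = ET.fromstring(DOC)  # root is the <text> element

# "/text/body/p[@id='dublin']" expressed relative to the <text> root.
matches = root.findall("./body/p[@id='dublin']")

# Text of every paragraph the selector picked out.
selected_text = [p.text for p in matches]
```

Only the paragraph whose "id" is "dublin" is selected; the other paragraph is
left alone.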

6) To follow the conventions from other data categories, the
"its:disambiguation" element should probably be called
"its:disambiguationRule".

7) A question on the data category in general and the "rules" element: does
it make sense to make some attributes mandatory? Currently, this would be
valid:
<its:disambiguation selector="/text/body/p[@id='dublin']"/>
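
One way a checker could flag such rules is sketched below in Python. This is
only an illustration under stated assumptions: the function name
rules_without_information is hypothetical, the "its:rules" wrapper is added so
the fragment parses, and the check simply asks whether a rule has any
attribute besides "selector" rather than naming specific mandatory attributes,
since that choice is still open.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# The rule from above, wrapped in an its:rules element (an assumption made
# here so that the namespace prefix is declared and the fragment parses).
RULES = f"""<its:rules xmlns:its="{ITS_NS}" version="2.0">
  <its:disambiguation selector="/text/body/p[@id='dublin']"/>
</its:rules>"""


def rules_without_information(rules_xml):
    """Return the selectors of rules that carry no attribute except 'selector'."""
    root = ET.fromstring(rules_xml)
    empty = []
    for rule in root.findall(f"{{{ITS_NS}}}disambiguation"):
        extra = [name for name in rule.attrib if name != "selector"]
        if not extra:
            empty.append(rule.get("selector"))
    return empty


empty_selectors = rules_without_information(RULES)
```

Here the single rule is reported, because it selects a node but says nothing
about it.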

8) A question to the others in this thread (Giuseppe, Pablo, Raphael,
Sebastian): is this a representation that makes sense to you and that your
tools could produce?

9) A question to the MT guys: is the way "entity and disambiguation"
information is represented here useful for you?

Best,

Felix

2012/8/3 Tadej Štajner <tadej.stajner@ijs.si>

> Hi,
> I incorporated some comments that 'entity' still conflated several
> distinct things in the data category proposal. Now, we distinguish
> between disambiguation of word sense, ontology concept and entity instance.
> Following that, it seems that 'Disambiguation' was the better name for the
> data category.
>
> Thanks for everyone's input!
>
> -- Tadej
>
> On 02. 08. 2012 17:26, Tadej Štajner wrote:
>
>> Apologies -- wrong link on the previous mail. This is the relevant one:
>> http://www.w3.org/International/multilingualweb/lt/track/actions/181
>> -- Tadej
>>
>> On 02. 08. 2012 17:22, Tadej Štajner wrote:
>>
>>> Hi, all,
>>> this is the integration of the feedback points from the last call on the
>>> Entity data category and subsequently on the mailing list. I cleaned up and
>>> defined the terms, so it better fits both use cases, lexical as well as
>>> conceptual disambiguation, and introduced XPath variants of the attributes
>>> since they were used in the examples, but not defined anywhere.
>>>
>>> I'd ask anyone who's interested in taking another look. Otherwise, I
>>> think we can move forward.
>>>
>>> -- Tadej
>>>
>>> Related:
>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0280.html
>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0288.html
>>> http://www.w3.org/International/multilingualweb/lt/track/issues/181
>>>
>>>
>>>
>>> On 26. 07. 2012 15:47, Tadej Štajner wrote:
>>>
>>>> Hi all,
>>>> (cc-ing additional people who may be interested in this),
>>>>
>>>> this may be relevant at today's call. Here's a summary and integration
>>>> of what was going on around the named entity and disambiguation data
>>>> categories, along with usage in RDFa Lite.
>>>>
>>>> -- Tadej
>>>>
>>>> Related in https://www.w3.org/International/multilingualweb/lt/track/:
>>>> [ISSUE-2]
>>>> [ISSUE-18]
>>>> [ISSUE-29]
>>>> [ISSUE-35]
>>>> [ACTION-164]
>>>>
>>>>
>>>
>>
>


-- 
Felix Sasaki
DFKI / W3C Fellow
Received on Friday, 3 August 2012 11:08:29 UTC
