
Re: Issue-67 - Term + Disambiguation

From: Felix Sasaki <fsasaki@w3.org>
Date: Wed, 30 Jan 2013 20:30:50 +0100
Message-ID: <510974EA.7080606@w3.org>
To: public-multilingualweb-lt@w3.org
Hi Yves, Tadej, all,


On 30.01.13 19:56, Tadej Stajner wrote:
> Hi, Yves,
> no, this doesn't mean that we're only supporting named entities from now 
> on. We still allow people to annotate with whatever type of text 
> analysis they choose to do (lexical, ontology, etc.), but we don't 
> care about the level of the analysis. They're now all various values 
> of its:tanIdent*. In a sense, we aren't reducing functionality, but 
> we're "out-of-scoping" the feature of knowing the disambiguation level.
>
> For example, before:
> That was a
> <span its:disambigGranularity="lexicalConcept" 
> its:disambigSource="Wordnet3.1" 
> its:disambigIdent="*good%3:00:01::*">good</span>
> <span its:disambigGranularity="ontologyConcept" 
> its:disambigIdentRef="http://sw.opencyc.org/2012/05/10/concept/en/Game">game</span> 
> against
> <span its:disambigGranularity="entity" 
> its:disambigIdentRef="http://dbpedia.org/resource/Real_Madrid">Madrid</span>!
>
> is now proposed to be:
> That was a
> <span its:tanSource="Wordnet3.1" 
> its:tanIdent="*good%3:00:01::*">good</span>
> <span 
> its:tanIdentRef="http://sw.opencyc.org/2012/05/10/concept/en/Game">game</span> 
>
> against
> <span its:tanIdentRef="http://dbpedia.org/resource/Real_Madrid">Madrid</span>!
>
> I'm not entirely sure if this is 'major' or not.
> -- Tadej
>
> On 1/30/2013 7:16 PM, Yves Savourel wrote:
>> Hi all,
>>
>>> - we moved issue-69 disambiguation vs. term forward.
>>> My understanding from the conclusion on the call was:
>>> * people would agree with dropping "granularity" or "qualifier"
>>> from the data category
>>> * people would agree with re-naming attributes and the data
>>> category: to use "tan" instead of "disambig", e.g.
>>> "tan-ident-ref" instead of "disambig-ident-ref". E.g. instead of
>> I wasn't sure I understood correctly during the call and was waiting to see the summary.
>>
>> So we would go back to the simple 'named entity' requirement we had originally?
>> Dropping completely lexical and ontology concepts.
>>
>> I'm curious to see how we'll sell that as a non-substantive change: we're removing features. (I'm not against, just pointing that out).

We are removing an attribute and renaming others. Sure, this is more of a 
borderline case than the others we have (e.g. the regex change). But so far 
it seems we don't have implementations "doing" anything with the 
attribute. That was basically the issue with the levels: nobody had a 
consumption scenario for them. I saw that Yves created a representation of 
the disambiguation output in XLIFF - but my guess is that dropping the 
level and renaming the attributes wouldn't change anything with regard to 
further consumption - no?

With that argument I think the removal can be justified as not needing 
another last call draft. But let's see what others think.
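
To make the "remove one attribute, rename the others" point concrete, here is 
a minimal sketch of how existing markup could be migrated. It assumes the 
attribute names from Tadej's before/after example (its:disambig* becoming 
its:tan*), which are only the current proposal, and it uses a plain regular 
expression rather than a real XML/HTML parser. The names RENAMES and migrate 
are hypothetical, chosen just for this illustration.

```python
import re

# Hypothetical mapping, taken from the before/after example in this thread.
# These names reflect the proposal under discussion, not a final spec.
RENAMES = {
    "its:disambigSource": "its:tanSource",
    "its:disambigIdent": "its:tanIdent",
    "its:disambigIdentRef": "its:tanIdentRef",
}

# One its:disambig* attribute with a double-quoted value, including the
# whitespace in front of it (which may be a line break).
ATTR = re.compile(r'\s+(its:disambig\w+)="([^"]*)"')


def migrate(markup: str) -> str:
    """Drop its:disambigGranularity and rename the other disambig attributes."""

    def swap(match: re.Match) -> str:
        name, value = match.group(1), match.group(2)
        if name == "its:disambigGranularity":
            return ""  # the level is dropped, not renamed
        # Unknown its:disambig* names are kept unchanged.
        return f' {RENAMES.get(name, name)}="{value}"'

    return ATTR.sub(swap, markup)


before = (
    '<span its:disambigGranularity="entity" '
    'its:disambigIdentRef="http://dbpedia.org/resource/Real_Madrid">Madrid</span>'
)
after = migrate(before)
```

Since the identifier and source values pass through untouched, a consumer that 
only reads those values would see the same data after the change.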

>>
>>
>>> * Steps needed anyway for resolving issue-67 are: re-writing
>>> the now "tan" section (previously "disambig"), and potentially
>>> rewriting / merging "Terminology". Opinions on these topics or
>>> volunteers, please step up.
>> It seems the direction we are taking is to reduce to one the types of data the 'disambig/tan' data category can annotate. Merging Terminology would be equivalent to going back to having different types of data annotated by the same data category.

With regard to the types, see Tadej's explanation - we don't merge 
types, we drop them, since it is hard to foresee interop with them (and 
nobody consumed them anyway).

I am keen to see whether people still want to merge terminology then - my 
guess is that with dropping the levels and renaming to "tan" we might be 
done. But let's see.

>>   Then how do we justify dropping lexical and ontology concepts? (especially since there was no comment requesting to drop them).

In the long threads on issue-67, several people brought up the topic of 
dropping - and Christian as the originator of issue-67 sees the dropping 
proposal as a step forward for resolving the issue. So I think we can 
argue that this goes in the right direction.

I hope these explanations help.

Best,

Felix
Received on Wednesday, 30 January 2013 19:31:15 UTC
