
Re: Issue-67 - Term + Disambiguation

From: Tadej Stajner <tadej.stajner@ijs.si>
Date: Wed, 30 Jan 2013 19:56:02 +0100
Message-ID: <51096CC2.8010308@ijs.si>
To: public-multilingualweb-lt@w3.org
Hi, Yves,
no, this doesn't mean that we're only supporting named entities from now
on. We still allow people to annotate with whatever type of text
analysis they choose to do (lexical, ontology, etc.), but we don't care
about the level of the analysis. The results are now all just various
values of its:tanIdent*. In a sense, we aren't reducing functionality, but
we're "out-of-scoping" the feature of knowing the disambiguation level.

For example, before:
That was a
<span its:disambigGranularity="lexicalConcept" 
its:disambigSource="Wordnet3.1" 
its:disambigIdent="*good%3:00:01::*">good</span>
<span its:disambigGranularity="ontologyConcept" 
its:disambigIdentRef="http://sw.opencyc.org/2012/05/10/concept/en/Game">game</span> 
against
<span its:disambigGranularity="entity" 
its:disambigIdentRef="http://dbpedia.org/resource/Real_Madrid">Madrid</span>!

is now proposed to be:
That was a
<span its:tanSource="Wordnet3.1" its:tanIdent="*good%3:00:01::*">good</span>
<span its:tanIdentRef="http://sw.opencyc.org/2012/05/10/concept/en/Game">game</span>
against
<span its:tanIdentRef="http://dbpedia.org/resource/Real_Madrid">Madrid</span>!
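
To make the mapping concrete, here is a rough Python sketch of how existing
markup could be rewritten under the proposal. It is only an illustration, not
part of any spec or tool: the function name migrate is hypothetical, it assumes
attributes are written with the its: prefix and double-quoted values, and it
assumes every its:disambig* attribute other than its:disambigGranularity is
simply renamed to the matching its:tan* name.

```python
import re

# Matches one its:disambig* attribute together with the whitespace before it.
# Assumption: values are double-quoted and contain no embedded double quotes.
_DISAMBIG_ATTR = re.compile(r'\s+(its:disambig\w+)="([^"]*)"')


def migrate(markup: str) -> str:
    """Rewrite its:disambig* attributes to the proposed its:tan* names.

    its:disambigGranularity is dropped entirely; every other its:disambig*
    attribute keeps its value and only swaps the "disambig" prefix for "tan".
    """
    def swap(match: re.Match) -> str:
        name, value = match.group(1), match.group(2)
        if name == "its:disambigGranularity":
            return ""  # the proposal removes the granularity information
        new_name = name.replace("its:disambig", "its:tan", 1)
        return f' {new_name}="{value}"'

    return _DISAMBIG_ATTR.sub(swap, markup)


if __name__ == "__main__":
    before = (
        '<span its:disambigGranularity="entity" '
        'its:disambigIdentRef="http://dbpedia.org/resource/Real_Madrid">'
        'Madrid</span>'
    )
    print(migrate(before))
```

A regular expression is enough for a one-off illustration like this; real
content would be safer to rewrite with an HTML or XML parser.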

I'm not entirely sure if this is 'major' or not.
-- Tadej

On 1/30/2013 7:16 PM, Yves Savourel wrote:
> Hi all,
>
>> - we moved issue-69 disambiguation vs. term forward.
>> My understanding from the conclusion on the call was:
>> * people would agree with dropping "granularity" or "qualifier"
>> from the data category
>> * people would agree with re-naming attributes and the data
>> category: to use "tan" instead of "disambig", e.g.
>> "tan-ident-ref" instead of "disambig-ident-ref". E.g. instead of
> I wasn't sure I understood correctly during the call and was waiting to see the summary.
>
> So we would go back to the simple 'named entity' requirement we had originally,
> completely dropping lexical and ontology concepts?
>
> I'm curious to see how we'll sell that as a non-substantive change: we're removing features. (I'm not against, just pointing that out).
>
>
>> * Steps needed anyway for resolving issue-67 are: re-writing
>> the now "tan" section (previously "disambig"), and potentially
>> rewriting / merging "Terminology". Opinions on these topics or
>> volunteers, please step up.
> It seems the direction we are taking is to reduce to one the types of data the 'disambig/tan' data category can annotate. Merging Terminology would be equivalent to going back to having different types of data annotated by the same data category. Then how do we justify dropping lexical and ontology concepts? (Especially since there was no comment requesting that we drop them.)
>
> cheers,
> -yves
>
>
>
Received on Wednesday, 30 January 2013 18:56:34 UTC
