
Re: Issue-68 (not 67) - Term + Disambiguation

From: Felix Sasaki <fsasaki@w3.org>
Date: Wed, 30 Jan 2013 22:14:59 +0100
Message-ID: <51098D53.6080906@w3.org>
To: public-multilingualweb-lt@w3.org
On 30.01.13 21:32, Mārcis Pinnis wrote:
>
> Hi Felix, all,
>
> I think that Issue-67 (and Issue-69 in the minutes) is meant to be 
> Issue-68, right?
>
> There is some numbering confusion right now.
>
> Issue-67 says: „ISSUE-67: Change definition of regular expression for 
> allowed characters”
>
> Issue-68 says: „ISSUE-68: Disambiguation (and term)”
>

Sorry for the confusion, Mārcis - yes, this thread and the minutes
http://www.w3.org/2013/01/30-mlw-lt-minutes.html#item05
are about issue-68.

Best,

Felix

> Issue-69 says: „ISSUE-69: recursive nesting of external rules”
>
> Best regards,
>
> Mārcis ;o)
>
> *From:*Felix Sasaki [mailto:fsasaki@w3.org]
> *Sent:* Wednesday, January 30, 2013 9:31 PM
> *To:* public-multilingualweb-lt@w3.org
> *Subject:* Re: Issue-67 - Term + Disambiguation
>
> Hi Yves, Tadej, all,
>
>
> On 30.01.13 19:56, Tadej Stajner wrote:
>
>     Hi, Yves,
>     no, this doesn't mean that we're only supporting named entities
>     from now on. We still allow people to annotate with whatever type
>     of text analysis they choose to do (lexical, ontology, etc.), but
>     we don't care about the level of the analysis. They're now all
>     various values of its:tanIdent*. In a sense, we aren't reducing
>     functionality, but we're "out-of-scoping" the feature of knowing
>     the disambiguation level.
>
>     For example, before:
>     That was a
>     <span its:disambigGranularity="lexicalConcept"
>     its:disambigSource="Wordnet3.1"
>     its:disambigIdent="good%3:00:01::">good</span>
>     <span its:disambigGranularity="ontologyConcept"
>     its:disambigIdentRef="http://sw.opencyc.org/2012/05/10/concept/en/Game">game</span>
>     against
>     <span its:disambigGranularity="entity"
>     its:disambigIdentRef="http://dbpedia.org/resource/Real_Madrid">Madrid</span>!
>
>     is now proposed to be:
>     That was a
>     <span its:tanSource="Wordnet3.1"
>     its:tanIdent="good%3:00:01::">good</span>
>     <span
>     its:tanIdentRef="http://sw.opencyc.org/2012/05/10/concept/en/Game">game</span>
>     against
>     <span its:tanIdentRef="http://dbpedia.org/resource/Real_Madrid">Madrid</span>!
>
>     I'm not entirely sure if this is 'major' or not.
>     -- Tadej
>
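
The before/after markup above amounts to a mechanical rewrite: drop the granularity attribute and rename the remaining disambig attributes to tan. A minimal sketch of that rewrite follows, assuming the attribute names from the example in this thread (the proposal was still under discussion, so they may not be final). The names migrate, RENAMES and DROPPED are hypothetical, and the regex approach only suits simple, double-quoted attributes like the ones in the example.

```python
import re

# Assumed mapping, taken from the before/after example in this thread.
RENAMES = {
    "disambigSource": "tanSource",
    "disambigIdent": "tanIdent",
    "disambigIdentRef": "tanIdentRef",
}
# Attribute the proposal removes entirely.
DROPPED = "disambigGranularity"

# Longest names first so "disambigIdentRef" is tried before "disambigIdent".
_NAMES = "|".join(sorted(RENAMES, key=len, reverse=True))
_RENAME_RE = re.compile(r"its:(" + _NAMES + r")(?==)")
_DROP_RE = re.compile(r"\s+its:" + DROPPED + r'="[^"]*"')


def migrate(markup: str) -> str:
    """Rewrite its:disambig* attributes to the proposed its:tan* names."""
    # Remove the granularity attribute together with its leading whitespace.
    markup = _DROP_RE.sub("", markup)
    # Rename the attributes that survive the change.
    return _RENAME_RE.sub(lambda m: "its:" + RENAMES[m.group(1)], markup)


if __name__ == "__main__":
    old = ('<span its:disambigGranularity="entity" '
           'its:disambigIdentRef="http://dbpedia.org/resource/Real_Madrid">'
           'Madrid</span>')
    print(migrate(old))
```

Nothing in the rewrite needs the granularity value, which fits the point that no consumer was using the level.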
>     On 1/30/2013 7:16 PM, Yves Savourel wrote:
>
>         Hi all,
>
>             - we moved issue-69 disambiguation vs. term forward.
>             My understanding from the conclusion on the call was:
>             * people would agree with dropping "granularity" or "qualifier"
>             from the data category
>             * people would agree with re-naming attributes and the data
>             category: to use "tan" instead of "disambig", e.g.
>             "tan-ident-ref" instead of "disambig-ident-ref". E.g. instead of
>
>         I wasn't sure I understood correctly during the call and was waiting to see the summary.
>
>         So we would go back to the simple 'named entity' requirement we had originally?
>         Dropping lexical and ontology concepts completely.
>
>         I'm curious to see how we'll sell that as a non-substantive change: we're removing features. (I'm not against it, just pointing that out.)
>
>
> We are removing an attribute and renaming others. Sure, this is a 
> borderline case more than the others we have (e.g. regex change). But 
> it seems so far we don't have implementations "doing" anything with 
> the attribute. That was basically the issue with the levels: nobody 
> had a consumption scenario for it. I saw that Yves created a 
> representation of the disambiguation output in XLIFF - but my guess is 
> that dropping the level and renaming the attributes wouldn't change 
> anything with regard to further consumption - no?
>
> With that reasoning I think the removal can be argued as not 
> needing another last call draft. But let's see what others think.
>
>
>             * Steps needed anyway for resolving issue-67 are: re-writing
>             the now "tan" section (previously "disambig"), and potentially
>             rewriting / merging "Terminology". Opinions on these topics or
>             volunteers, please step up.
>
>         It seems the direction we are taking is to reduce the types of data the 'disambig/tan' data category can annotate to one. Merging Terminology would be equivalent to going back to having different types of data annotated by the same data category.
>
>
> With regard to the types, see Tadej's explanation - we don't merge 
> types, we drop them, since it is hard to foresee interop with them 
> (and nobody consumed them anyway).
>
> I am keen to see if people still want to merge terminology then - my 
> guess is that with dropping the levels and renaming to "tan" we might 
> be done. But let's see.
>
>
>           Then how do we justify dropping lexical and ontology concepts? (especially since there was no comment requesting to drop them).
>
>
> In the long threads on issue-67, several people brought up the topic 
> of dropping - and Christian as the originator of issue-67 sees the 
> dropping proposal as a step forward for resolving the issue. So I 
> think we can argue that this goes in the right direction.
>
> Hope that these explanations helped?
>
> Best,
>
> Felix
>
Received on Wednesday, 30 January 2013 21:15:23 UTC
