- From: Felix Sasaki <fsasaki@w3.org>
- Date: Wed, 30 Jan 2013 22:14:59 +0100
- To: public-multilingualweb-lt@w3.org
- Message-ID: <51098D53.6080906@w3.org>
On 30.01.13 at 21:32, Mārcis Pinnis wrote:
>
> Hi Felix, all,
>
> I think that by Issue-67 (and Issue-69 in the minutes) the Issue-68 is
> meant, right?!
>
> There is some numbering confusion right now.
>
> Issue-67 says: „ISSUE-67: Change definition of regular expression for
> allowed characters”
>
> Issue-68 says: „ISSUE-68: Disambiguation (and term)”
>

Sorry for the confusion, Mārcis - yes, this thread and the minutes
http://www.w3.org/2013/01/30-mlw-lt-minutes.html#item05
are about issue-68.

Best,

Felix

> Issue-69 says: „ISSUE-69: recursive nesting of external rules”
>
> Best regards,
>
> Mārcis ;o)
>
> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
> *Sent:* Wednesday, January 30, 2013 9:31 PM
> *To:* public-multilingualweb-lt@w3.org
> *Subject:* Re: Issue-67 - Term + Disambiguation
>
> Hi Yves, Tadej, all,
>
> On 30.01.13 at 19:56, Tadej Stajner wrote:
>
> > Hi, Yves,
> > no, this doesn't mean that we're only supporting named entity from
> > now on. We still allow people to annotate with whatever type of
> > text analysis they choose to do (lexical, ontology, etc.), but we
> > don't care about the level of the analysis. They're now all
> > various values of its:tanIdent*. In a sense, we aren't reducing
> > functionality, but we're "out-of-scoping" the feature of knowing
> > the disambiguation level.
> >
> > For example, before:
> > That was a
> > <span its:disambigGranularity="lexicalConcept"
> > its:disambigSource="Wordnet3.1"
> > its:disambigIdent="*good%3:00:01::*">good</span>
> > <span its:disambigGranularity="ontologyConcept"
> > its:disambigIdentRef="http://sw.opencyc.org/2012/05/10/concept/en/Game">game</span>
> > against
> > <span its:disambigGranularity="entity"
> > its:disambigIdentRef="http://dbpedia.org/resource/Real_Madrid">Madrid</span>!
> >
> > is now proposed to be:
> > That was a
> > <span its:tanSource="Wordnet3.1"
> > its:tanIdent="*good%3:00:01::*">good</span>
> > <span
> > its:tanIdentRef="http://sw.opencyc.org/2012/05/10/concept/en/Game">game</span>
> > against
> > <span its:tanIdentRef="http://dbpedia.org/resource/Real_Madrid">Madrid</span>!
> >
> > I'm not entirely sure if this is 'major' or not.
> > -- Tadej
> >
> > On 1/30/2013 7:16 PM, Yves Savourel wrote:
> >
> > > Hi all,
> > >
> > > > - we moved issue-69 disambiguation vs. term forward.
> > > > My understanding from the conclusion on the call was:
> > > > * people would agree with dropping "granularity" or "qualifier"
> > > > from the data category
> > > > * people would agree with re-naming attributes and the data
> > > > category: to use "tan" instead of "disambig", e.g.
> > > > "tan-ident-ref" instead of "disambig-ident-ref". E.g. instead of
> > >
> > > I wasn't sure I understood correctly during the call and was waiting to see the summary.
> > >
> > > So we would go back to the simple 'named entity' requirement we had originally?
> > > Dropping completely lexical and ontology concepts.
> > >
> > > I'm curious to see how we'll sell that as a non-substantive change: we're removing features. (I'm not against, just pointing that out).
>
> We are removing an attribute and renaming others. Sure, this is a
> borderline case more than the others we have (e.g. regex change). But
> it seems so far we don't have implementations "doing" anything with
> the attribute. That was basically the issue with the levels: nobody
> had a consumption scenario for it. I saw that Yves created a
> representation of the disambiguation output in XLIFF - but my guess is
> that dropping the level and renaming the attributes wouldn't change
> anything with regard to further consumption - no?
>
> With that argumentation I think the removal can be argued as not
> needing another last call draft. But let's see what others think.
>
> > > > * Steps needed anyway for resolving issue-67 are: re-writing
> > > > the now "tan" section (previously "disambig"), and potentially
> > > > rewriting / merging "Terminology". Opinions on these topics or
> > > > volunteers, please step up.
> > >
> > > It seems the direction we are taking is to reduce to one the types of data the 'disambig/tan' data category can annotate. Merging Terminology would be equivalent to going back to having different types of data annotated by the same data category.
>
> With regard to the types, see Tadej's explanation - we don't merge
> types, we drop them, since it is hard to foresee interop with them
> (and nobody consumed them anyway).
>
> I am keen to see if people still want to merge terminology then - my
> guess is that with dropping the levels and renaming to "tan" we might
> be done. But let's see.
>
> > > Then how do we justify dropping lexical and ontology concepts? (especially since there was no comment requesting to drop them).
>
> In the long threads on issue-67, several people brought up the topic
> of dropping - and Christian as the originator of issue-67 sees the
> dropping proposal as a step forward for resolving the issue. So I
> think we can argue that this goes in the right direction.
>
> Hope that these explanations helped?
>
> Best,
>
> Felix
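
For readers who want to see the proposed renaming concretely: the change in Tadej's before/after example above (drop its:disambigGranularity, rename the remaining its:disambig* attributes to its:tan*) could be sketched roughly as follows. This is only an illustration under assumptions, not part of the proposal or of any ITS tooling: the function name migrate_attributes and the dictionary representation of a span's attributes are hypothetical, and only the attribute names themselves come from the thread.

```python
# Hypothetical sketch: map the "disambig" attributes from the thread to the
# proposed "tan" names. Attributes are modelled as a plain dict of
# name -> value; no real ITS library is used here.

# Renames taken from the before/after example in the thread.
RENAMES = {
    "its:disambigSource": "its:tanSource",
    "its:disambigIdent": "its:tanIdent",
    "its:disambigIdentRef": "its:tanIdentRef",
}

# The level/granularity attribute is dropped entirely under the proposal.
DROPPED = {"its:disambigGranularity"}


def migrate_attributes(attrs):
    """Return a new dict with dropped attributes removed and the rest renamed."""
    migrated = {}
    for name, value in attrs.items():
        if name in DROPPED:
            continue  # the proposal removes the granularity level
        # Attributes outside the rename table are passed through unchanged.
        migrated[RENAMES.get(name, name)] = value
    return migrated


if __name__ == "__main__":
    before = {
        "its:disambigGranularity": "entity",
        "its:disambigIdentRef": "http://dbpedia.org/resource/Real_Madrid",
    }
    print(migrate_attributes(before))
```

Note that the sketch loses the granularity value without replacement, which is exactly the point Yves raises: a consumer can no longer tell a lexical concept from an ontology concept or an entity except by inspecting the identifier itself.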
Received on Wednesday, 30 January 2013 21:15:23 UTC