
RE: [ACTION-80] consider consolidation of mtDisambiguationData, namedEntity, terminology and textAnalyticsAnnotation

From: Thomas Ruedesheim <thomas.ruedesheim@lucysoftware.com>
Date: Thu, 10 May 2012 13:38:50 +0200
Message-ID: <D0689FBE85FD1246A4EE317903897919F44C7E@team.lucysoftware.com>
To: "Tadej Stajner" <tadej.stajner@ijs.si>
Cc: <public-multilingualweb-lt@w3.org>
 
Hi Tadej, hi all,

You are apparently right: these data categories are strongly
interrelated. In our opinion, 'textAnalyticsAnnotation' is the umbrella
for the remaining categories in the Terminology section. We would
suggest dropping it in favour of the others.

I would rename 'mtDisambiguation' as 'disambiguation', because its usage
might not be MT specific. As Pedro already said, this tag may add some
info to the more general 'domain' category without proposing concrete
target terms. Its only attribute could be:
  'semantic selector': a URI pointing into a common ontology.

Both 'namedEntity' and 'terminology' categories seem to be clear (see
below).

Best,
Thomas

-----Original Message-----
From: Tadej Stajner [mailto:tadej.stajner@ijs.si] 
Sent: Wednesday, 9 May 2012 19:50
To: public-multilingualweb-lt@w3.org
Subject: [ACTION-80] consider consolidation of mtDisambiguationData,
namedEntity, terminology and textAnalyticsAnnotation

Hi, all,

this question is mostly directed to people working in MT with regard to
disambiguation.

Since we came to a conclusion that there is strong overlap between the
following data categories, we're consolidating them:
mtDisambiguationData
namedEntity
terminology
textAnalyticsAnnotation

First of all, there is an obvious common part to the first three. Let's
call it the 'concept mention' recipe. It's meant to represent that some
fragment of text is lexicalizing (mentioning) some concept with a URI.

namedEntity has the following specifics:
- type of entity (pointing to a URI, describing that type)
- alternative labels (names in different languages)

terminology has the following specifics:
- terminology lexicon
- alternative labels

mtDisambiguation also has the concept URI, but additionally defines
- 'disambiguation data'
- 'semantic selector'
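
To make the overlap concrete, here is a rough sketch of the 'concept
mention' recipe and the three categories built on it. All class and field
names are illustrative assumptions, not taken from any draft, and the
example.org URIs are placeholders.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List, Optional


@dataclass
class ConceptMention:
    """Common part: a text fragment that mentions a concept with a URI."""
    text: str
    concept_uri: str


@dataclass
class NamedEntity(ConceptMention):
    # Hypothetical field names for the namedEntity specifics.
    entity_type_uri: Optional[str] = None
    alt_labels: Dict[str, str] = field(default_factory=dict)  # language -> name


@dataclass
class Term(ConceptMention):
    # Hypothetical field names for the terminology specifics.
    lexicon: Optional[str] = None
    alt_labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class MtDisambiguation(ConceptMention):
    # Hypothetical field names for the mtDisambiguation specifics.
    disambiguation_data: Optional[str] = None
    semantic_selector: Optional[str] = None


def specifics(mention: ConceptMention) -> List[str]:
    """Return the field names a category adds on top of the common part."""
    common = {f.name for f in fields(ConceptMention)}
    return sorted(f.name for f in fields(mention) if f.name not in common)


entity = NamedEntity(
    text="Dublin",
    concept_uri="http://example.org/concept/Dublin",   # placeholder URI
    entity_type_uri="http://example.org/type/City",    # placeholder URI
)
print(specifics(entity))
print(specifics(MtDisambiguation(text="bank", concept_uri="http://example.org/concept/bank")))
```

Seen this way, the three categories differ only in the handful of fields
that specifics() lists for each of them.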

The open question is: do these two additional attributes bring any
additional information if we already have the fragment disambiguated with
the URI?

  If not, is there anything else in mtDisambiguation that could not be
covered by the namedEntity and terminology categories?

thanks for the input,
-- Tadej
Received on Thursday, 10 May 2012 17:40:31 UTC
