
Re: Links vs. identifiers (Re: [ACTION-94]: go and find examples of concept ontology (semantic features of terms as opposed to domain type ontologies))

From: Tadej Stajner <tadej.stajner@ijs.si>
Date: Mon, 18 Jun 2012 13:15:07 +0200
Message-ID: <4FDF0DBB.1090205@ijs.si>
To: public-multilingualweb-lt@w3.org
Hi,
  just to follow up on this. In my examples, I made the distinction 
between different pointer attributes (its-entity-concept-ref vs. 
its-meaning-ref vs. its-term-ref, ...) simply because it meant writing 
one attribute instead of two (<span its-entity its-concept-ref="">, 
<span its-term its-concept-ref="">, ...), and I assumed that defining 
the definitive set of these relationships is our job. Is there a 
compelling case for keeping this open? I personally like the "rel" 
value directory idea that HTML5 has, but going that way could be less 
thorough in terms of validation - we couldn't even assume that the 
its-concept-ref values dereference anywhere.
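The validation trade-off can be sketched as follows (the registry contents and the function name are hypothetical, for illustration only): with a closed, WG-defined set a validator can check values against a known list, while with an open rel-style directory the best it can do is warn.

```python
# Hypothetical, WG-maintained list of reference bases. With an open,
# rel-style directory no such closed list exists at validation time,
# so a validator cannot assume the values dereference anywhere.
KNOWN_REF_BASES = (
    "http://wordnet.princeton.edu",
    "http://www.sfs.uni-tuebingen.de/lsd/index.shtml",
)

def check_concept_ref(value):
    """Return 'ok' if the value starts with a registered base,
    'warn' otherwise."""
    if any(value.startswith(base) for base in KNOWN_REF_BASES):
        return "ok"
    return "warn"
```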

-- Tadej



On 6/12/2012 1:24 AM, David Lewis wrote:
> Hi Felix,
> Concerning the w3c hosting the semantic resources index.
>
> While I can see that it helps with uptake, the scale of the maintenance 
> task depends on the number of resources involved, which might scale 
> with domains and languages, plus the tests we'd need to apply to 
> ensure each resource is suitable. I don't know how much of a handle we 
> have on these currently.
>
> The other issue is that we can't mandate using this index as the only 
> approach, since parties that want to exchange references to private 
> semantic resources would be unable to adopt this approach.
>
> Cheers,
> Dave
>
> On 9 Jun 2012, at 05:31, Felix Sasaki <fsasaki@w3.org 
> <mailto:fsasaki@w3.org>> wrote:
>
>> Hi Dave,
>>
>> with your proposal we run into two issues.
>>
>> First, who decides what people will provide at these URIs? Will 
>> people trust
>> http://www.sfs.uni-tuebingen.de/lsd/index.shtml
>> to provide guidance about getting a machine-readable form of the 
>> German wordnet?
>>
>> Second, and more difficult, people will have a hard time agreeing on 
>> this list of values
>> "onto-concept | sem-net-node | terminology-entry | equiv-translation"
>> as you mention yourself.
>>
>> I would rather propose the following approach, first given as markup 
>> that we'd define:
>>
>> <span its-entity 
>> entityref="http://www.w3.org/2012/semantic-resources/" 
>> its-selector="gn-synset_loschen_3">löschen</span>
>>
>>
>> At
>>
>> http://www.w3.org/2012/semantic-resources/
>>
>> we would have a table with two columns:
>>
>> 1) Name of the semantic resource and info on how to get it, including 
>> licensing info that we need to make people aware of
>>
>> 2) Prefix to be used in the "selector" attribute, e.g. "gn" for GermaNet
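The lookup an implementor would perform against such a two-column table could be sketched like this (the registry contents and the helper name are hypothetical illustrations, not the actual table):

```python
# Hypothetical local mirror of the proposed table at
# http://www.w3.org/2012/semantic-resources/ : prefix -> resource info.
SEMANTIC_RESOURCES = {
    "gn": {
        "name": "GermaNet",
        "info": "http://www.sfs.uni-tuebingen.de/lsd/index.shtml",
    },
}

def resolve_selector(selector):
    """Split a selector like 'gn-synset_loschen_3' into the registered
    resource entry and the resource-local identifier."""
    prefix, _, local_id = selector.partition("-")
    entry = SEMANTIC_RESOURCES.get(prefix)
    if entry is None:
        raise KeyError("unknown semantic-resource prefix: %r" % prefix)
    return entry, local_id

entry, local_id = resolve_selector("gn-synset_loschen_3")
print(entry["name"], local_id)  # GermaNet synset_loschen_3
```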
>>
>>
>>
>> The benefit of this proposal is that implementors (and we ourselves) 
>> don't have to decide about values like
>> onto-concept | sem-net-node | terminology-entry | equiv-translation
>> Our working group would maintain
>>
>> http://www.w3.org/2012/semantic-resources/
>>
>> and add entries for new resources - *if* there is an agreement with 
>> the host of the resource. I know that this puts a burden on us, 
>> namely to talk directly to the hosts. But we need to do that anyway, 
>> otherwise no implementor will be able to make use of the information.
>>
>> WRT "hosting by W3C or others", see the approach of link types in HTML5:
>>
>> I) You have hard wired link types at 
>> http://dev.w3.org/html5/spec/links.html#linkTypes
>> II) You have extensions at 
>> http://microformats.org/wiki/existing-rel-values#HTML5_link_type_extensions
>>
>> The point is that only with I) do you achieve broad adoption, while 
>> II) is driven by a community process: see at
>> http://microformats.org/wiki/existing-rel-values#HTML5_link_type_extensions
>> "Before you register a new value .... Entries lacking any of the 
>> above required data will likely be removed"
>>
>> Again, without us engaging the hosts of the resources directly and 
>> trying to achieve I) as much as possible and II) with a well-defined 
>> process, a link to external information is pretty likely to be useless.
>>
>> A question on ISOCAT: at
>> http://www.isocat.org/interface/index.html
>> I see a browsable structure of categories, but no dump to download 
>> them, and no URI to identify ISOCAT in general. I may just be missing 
>> that, but could you point me (or other potential implementors) to it? 
>> For GermaNet or wordnet that information is easy to find, btw.
>>
>> Thanks,
>>
>> Felix
>>
>> 2012/6/9 Dave Lewis <dave.lewis@cs.tcd.ie <mailto:dave.lewis@cs.tcd.ie>>
>>
>>     Hi Felix,
>>     I think option 'a' makes the most sense. If language resource
>>     providers want their language resources to be accessed over the
>>     web, then they should be well motivated to provide stable URIs.
>>     There seem to be plenty who are - like the wordnet site you cite,
>>     or ISOCAT - and this is inherent in providers taking the semweb
>>     ontology route.
>>
>>     If they aren't willing to provide stable URIs, I'm not sure it
>>     should be the W3C's job to compensate for this. I'm not clear why
>>     this was done in the Unicode codepoint collation case - perhaps
>>     they were so key that the W3C made this a special case?
>>
>>     I've two other questions we can follow up on next week:
>>     1) if there is a stable URI for the particular resource item, do
>>     we need a separate attribute for the resource and then a
>>     selector for the item if it is only ever a fragment ID? Would a
>>     single fragment URI not suffice?
>>
>>     2) I like Pedro's categorisation of different resource types. But
>>     as pointed out in the thread, this still isn't sufficient by
>>     itself to enable a client to understand how to interpret the
>>     resource - in the general case that requires some detailed
>>     knowledge of the resource schema. So does it make sense to
>>     hardwire this into an attribute name? Might it be better to have
>>     it as the value of an attribute like resource type? e.g.
>>     its-referenced-resource-type : onto-concept | sem-net-node |
>>     terminology-entry | equiv-translation
>>
>>     Given we are not sure this is the right enumeration, at least
>>     this way we could specify these as non-normative values that
>>     could be added to later.
>>
>>     The ideal would be if referenced resources also offered a URL to
>>     a standardised resource metadata record, such as the META-SHARE
>>     metadata model, which contained sufficient knowledge for a
>>     client to interpret the fragment URI (or URI and selector) correctly.
>>
>>     There will be many of the right people in Dublin to have a good
>>     discussion on this.
>>
>>     cheers,
>>     Dave
>>
>>     On 08/06/2012 16:39, Felix Sasaki wrote:
>>>     Hi Pedro all,
>>>
>>>     2012/6/8 Pedro L. Díez Orzas <pedro.diez@linguaserve.com
>>>     <mailto:pedro.diez@linguaserve.com>>
>>>
>>>         Dear Tadej, Felix, Yves, Dave, all,
>>>
>>>         I checked with some experts, who told me the following:
>>>
>>>         It would be great if links to wordnet could be included in
>>>         the annotations. The best thing to do would be to use the
>>>         linked open data versions of wordnet:
>>>
>>>         http://thedatahub.org/dataset/vu-wordnet
>>>
>>>         It has URIs for synsets (actually sense meanings, but I
>>>         convinced them they need to shift to synset IDs, which they
>>>         will do in the near future). English synsets are good for
>>>         any language, since the other languages link to English
>>>         (still as an Inter-Lingual Index). Eventually, other
>>>         wordnets will also be published as linked open data.
>>>
>>>         Another thing is domain tags. WordNet Domains tags are used
>>>         here (Dewey system). Since they are linked to the English
>>>         WordNet, they are linked to any synset in any language that
>>>         links to English. That will be a very useful semantic tag
>>>         for translation as well.
>>>
>>>         I think this is the right way to reinforce the connection
>>>         between MLW-LT and linked open data. I hope it helps.
>>>
>>>
>>>
>>>     The above is great. I just want to make sure that we are in sync
>>>     on one aspect: we need sustainable *identifiers* for the
>>>     resources you mentioned. Let me try to make the difference clear
>>>     with the "codepoint based collation" example below:
>>>
>>>     - An application that wants to use code point based collation
>>>     needs the data tables for that
>>>     - http://www.w3.org/2005/xpath-functions/collation/codepoint/ is
>>>     not a way to download the data tables, but to identify that kind
>>>     of collation
>>>
>>>     Take as an example related to our area the way wordnet is used
>>>     in this XQuery processor
>>>     http://cf.zorba-xquery.com.s3.amazonaws.com/doc/zorba-2.0/zorba/html/ft_thesaurus.html
>>>
>>>     [
>>>
>>>     let $x := <msg>affluent man</msg>
>>>
>>>     return $x contains text "wealthy"
>>>
>>>     using thesaurus at "http://wordnet.princeton.edu"
>>>
>>>     ]
>>>
>>>
>>>     The statement
>>>     using thesaurus at "http://wordnet.princeton.edu"
>>>     does not mean that the thesaurus is downloaded from
>>>     the wordnet site at Princeton. It just means that the XQuery
>>>     processor invokes its cached version of wordnet, which is
>>>     identified by http://wordnet.princeton.edu
>>>
>>>
>>>     For our scenarios, I assume processing steps like this:
>>>
>>>     1) Automatic annotation leading to e.g. this
>>>
>>>     <span
>>>     its-disambiguation its-semantic-network-ref="http://www.sfs.uni-tuebingen.de/lsd/index.shtml" its-selector="#synset_loschen_3">löschen</span>
>>>
>>>     2) An application that knows where to find the resource
>>>     identified by
>>>
>>>     http://www.sfs.uni-tuebingen.de/lsd/index.shtml
>>>
>>>     can cache that resource and use it e.g. for improving MT or
>>>     other (localization) workflows.
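The identify-then-cache behaviour in step 2) might be sketched like this (the class name and sample data are hypothetical; the point is only that the URI serves as a lookup key, never as a download location):

```python
# Sketch of step 2): the URI is treated purely as an identifier; the
# application maps known identifiers to locally cached copies of the
# resource, obtained through whatever channel the provider offers.
class ResourceCache:
    def __init__(self):
        self._local_copies = {}  # identifier URI -> local data

    def register(self, identifier, data):
        """Associate a locally obtained resource with its identifier URI."""
        self._local_copies[identifier] = data

    def lookup(self, identifier):
        """Return the cached resource; never dereferences the URI itself."""
        return self._local_copies.get(identifier)

cache = ResourceCache()
# Hypothetical cached content for the GermaNet identifier.
cache.register("http://www.sfs.uni-tuebingen.de/lsd/index.shtml",
               {"synset_loschen_3": ["delete", "erase", "extinguish"]})
senses = cache.lookup("http://www.sfs.uni-tuebingen.de/lsd/index.shtml")
```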
>>>
>>>
>>>     The conclusion from this is that we need to ask the providers
>>>     of the resources for one of the following:
>>>
>>>     a) a stable URI for identification; resolving that URI should
>>>     give implementors of 2) the information they need for caching
>>>     the resource in an implementation specific manner.
>>>
>>>     b) that they allow W3C to provide the URI, as in the collation
>>>     example: it is the W3C that hosts
>>>     http://www.w3.org/2005/xpath-functions/collation/codepoint/ ,
>>>     not the Unicode Consortium, which provides the codepoint list.
>>>
>>>     Which of a) or b) do people prefer?
>>>
>>>     Best,
>>>
>>>     Felix
>>>
>>>         Best,
>>>
>>>         Pedro
>>>
>>>         ------------------------------------------------------------------------
>>>
>>>         *De:*Dave Lewis [mailto:dave.lewis@cs.tcd.ie
>>>         <mailto:dave.lewis@cs.tcd.ie>]
>>>         *Enviado el:* jueves, 07 de junio de 2012 23:58
>>>         *Para:* public-multilingualweb-lt@w3.org
>>>         <mailto:public-multilingualweb-lt@w3.org>
>>>
>>>
>>>         *Asunto:* Re: [ACTION-94]: go and find examples of concept
>>>         ontology (semantic features of terms as opposed to domain
>>>         type ontologies)
>>>
>>>         Hi Tadej,
>>>         I spoke to some people from ISOCAT at LREC. They operate
>>>         persistent URL for their platform, so with an example
>>>         perhaps we could add that to the list?
>>>
>>>         cheers,
>>>         Dave
>>>
>>>         On 07/06/2012 15:19, Felix Sasaki wrote:
>>>
>>>         2012/6/7 Tadej Stajner <tadej.stajner@ijs.si
>>>         <mailto:tadej.stajner@ijs.si>>
>>>
>>>         Hi Felix,
>>>         as far as I'm aware, URIs only exist for the English
>>>         wordnet. Maybe prefixing with a # was not the best stylistic
>>>         choice here, but yes, what I meant to convey is that the
>>>         value is a local identifier, valid within a particular
>>>         semantic network.
>>>
>>>         In the ideal scenario, these selectors would be
>>>         dereferenceable and verifiable via URIs for arbitrary
>>>         wordnets and terminology lexicons and their entries.
>>>
>>>         OK - the main point would be that they are dereferenceable
>>>         and verifiable. In practice you will not achieve that for
>>>         arbitrary wordnets, but you can achieve it for a subset,
>>>         if the related "players" agree. In the "collation" example
>>>         mentioned before, the identifier for the Unicode code point
>>>         based collation
>>>         http://www.w3.org/2005/xpath-functions/collation/codepoint/ was
>>>         the lowest common denominator; in addition to that, everybody
>>>         is free to have other URIs for arbitrary collations. I would
>>>         hope that we could end up with such a list (hopefully longer
>>>         than one entry) for the semantic networks too.
>>>
>>>         Felix
>>>
>>>             Do we have any people involved in developing semantic
>>>             networks or term lexicons on this list? The compromise
>>>             would be to allow some limited classes of non-URI local
>>>             selectors, like synset IDs for wordnets and term IDs
>>>             for TBX lexicons.
>>>
>>>             -- Tadej
>>>
>>>
>>>
>>>             On 6/7/2012 3:44 PM, Felix Sasaki wrote:
>>>
>>>             Thanks, Tadej.
>>>
>>>             The value of the its-selector attribute looks like a
>>>             document-internal link. But it is probably an identifier
>>>             for the synset in the given semantic network, no?
>>>
>>>             About 1) and 2): is your made-up example then the output
>>>             of the text annotation use case? I am asking since you
>>>             say "2) markup in raw ITS", so I'm not sure.
>>>
>>>             Also, it seems that an implementation needs to "know"
>>>             about the resources that are identified
>>>             via its-semantic-network-ref. This is really an
>>>             identifier, like
>>>
>>>             http://www.w3.org/2005/xpath-functions/collation/codepoint/
>>>
>>>             is an identifier for a Unicode code point collation; it
>>>             doesn't give you the collation data, but creating an
>>>             implementation that "understands" the identifier
>>>             probably means caching the collation data. The same
>>>             would be true for the semantic network.
>>>
>>>             This leads to the next question: can we engage the
>>>             developers of the semantic network (or other
>>>             disambiguation related) resources to come up with stable
>>>             URIs for these? It would be great to list these URIs in
>>>             our specification and say "this is how you identify the
>>>             English wordnet etc.", for scenarios like the collation
>>>             data mentioned above.
>>>
>>>             Felix
>>>
>>>             2012/6/7 Tadej Štajner <tadej.stajner@ijs.si
>>>             <mailto:tadej.stajner@ijs.si>>
>>>
>>>             Hi,
>>>
>>>             I agree with Pedro on the questions. Automatic word
>>>             sense disambiguation is in practice still not perfect,
>>>             so some semi-automatic user interfaces make a lot of
>>>             sense. Here is how I think this could look in a
>>>             made-up example, answering Felix's 1) and 2):
>>>
>>>             1) HTML+ITS:
>>>             <span its-disambiguation
>>>             its-semantic-network-ref="http://www.sfs.uni-tuebingen.de/lsd/index.shtml"
>>>             its-selector="#synset_loschen_3">löschen</span>
>>>
>>>             2) Markup in raw ITS:
>>>             <its:disambiguation
>>>             semanticNetworkRef="http://www.sfs.uni-tuebingen.de/lsd/index.shtml"
>>>             selector="#synset_loschen_3">löschen</its:disambiguation>
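On the consumer side, pulling these two attributes out of the HTML+ITS form could look roughly like this (the parser class is an illustrative sketch using Python's stdlib, not part of any proposal):

```python
from html.parser import HTMLParser

class ITSDisambiguationParser(HTMLParser):
    """Collect (semantic-network-ref, selector, text) triples from
    spans carrying the its-disambiguation attribute."""
    def __init__(self):
        super().__init__()
        self.annotations = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and "its-disambiguation" in attrs:
            self._current = [attrs.get("its-semantic-network-ref"),
                             attrs.get("its-selector"), ""]

    def handle_data(self, data):
        if self._current is not None:
            self._current[2] += data  # accumulate the annotated text

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            self.annotations.append(tuple(self._current))
            self._current = None

parser = ITSDisambiguationParser()
parser.feed('<span its-disambiguation '
            'its-semantic-network-ref='
            '"http://www.sfs.uni-tuebingen.de/lsd/index.shtml" '
            'its-selector="#synset_loschen_3">löschen</span>')
```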
>>>
>>>             -- Tadej
>>>
>>>
>>>
>>>
>>>             On 04. 06. 2012 13:53, Pedro
>>>             L. Díez Orzas wrote:
>>>
>>>             Dear Felix,
>>>
>>>             Thank you very much. Probably Tadej can prepare the use
>>>             cases you mention, with the consolidated data category.
>>>             About questions 3 and 4, I can tell you the following:
>>>
>>>             3) Would it be produced also by an automatic text
>>>             annotation tool?
>>>
>>>             For the pointers to the three kinds of information
>>>             referred to (concepts in an ontology, meanings in a
>>>             lexical DB, and terms in terminological resources), I
>>>             think semi-automatic annotation tools would be possible,
>>>             that is, annotations proposed by the tool and confirmed
>>>             by the user.
>>>
>>>             Fully automatic text annotation would need a more
>>>             sophisticated "semantic calculus", and most such
>>>             approaches are still under research, as far as I know.
>>>             Maybe, in these cases, it should be combined with
>>>             textAnalysisAnnotation, specifying in *Annotation
>>>             agent* - and *Confidence score* - which system produced
>>>             the annotation and with which reliability.
>>>
>>>             4) Would 1-2 be consumed by an MT tool, or by other tools?
>>>
>>>             These can basically be consumed by language processing
>>>             tools, like MT, and other language technology that
>>>             needs content or semantic info, for instance text
>>>             analytics, semantic search, etc. In localization
>>>             chains, this information can also be used by automatic
>>>             or semi-automatic processes (like selecting
>>>             dictionaries for translations, or selecting
>>>             translators/revisers by subject area).
>>>
>>>             It could also be used by humans for translation or
>>>             post-editing in case of ambiguity or lack of context in
>>>             the content, but mostly by automatic systems.
>>>
>>>             I hope this helps.
>>>
>>>             Pedro
>>>
>>>             ------------------------------------------------------------------------
>>>
>>>             *De:*Felix Sasaki [mailto:fsasaki@w3.org]
>>>             *Enviado el:* sábado, 02 de junio de 2012 14:13
>>>             *Para:* Tadej Stajner; pedro.diez
>>>             *CC:* public-multilingualweb-lt@w3.org
>>>             <mailto:public-multilingualweb-lt@w3.org>
>>>             *Asunto:* Re: [ACTION-94]: go and find examples of
>>>             concept ontology (semantic features of terms as opposed
>>>             to domain type ontologies)
>>>
>>>             Hi Tadej, Pedro, all,
>>>
>>>             this looks like a great chain of producing and consuming
>>>             metadata.
>>>
>>>             Apologies if this was explained during last week's call
>>>             or before, but can you clarify the following a bit:
>>>
>>>             1) How would the actual HTML markup produced in the
>>>             original text annotation use case look like?
>>>
>>>             2) How would the markup in this use case look like?
>>>
>>>             3) Would it be produced also by an automatic text
>>>             annotation tool?
>>>
>>>             4) Would 1-2 be consumed by an MT tool, or by other tools?
>>>
>>>             Thanks again,
>>>
>>>             Felix
>>>
>>>             2012/5/31 Tadej Stajner <tadej.stajner@ijs.si
>>>             <mailto:tadej.stajner@ijs.si>>
>>>
>>>             Hi Pedro,
>>>             thanks for the excellent explanation. If I understand
>>>             you correctly, a sufficient example for this use case
>>>             would be the annotation of individual words with the
>>>             synset URI of the appropriate wordnet? If so, then I
>>>             believe this route can be practical - given the
>>>             available tools, I think linking to the synset is a
>>>             more practical idea than expressing semantic features
>>>             of the word.
>>>
>>>             Enrycher can do automatic all-words disambiguation into
>>>             the English wordnet, whereas we don't have anything
>>>             specific in place for semantic features (which I suspect
>>>             also holds for other text analytics providers).
>>>
>>>             I'm also in favor of prescribing wordnets for individual
>>>             languages as valid selector domains, as you suggest in
>>>             option 1). That would make validation easier, since we
>>>             have a known domain.
>>>
>>>             @All: Can we come up with a second implementation for
>>>             this use case, preferably a consumer?
>>>
>>>             -- Tadej
>>>
>>>
>>>
>>>
>>>             On 5/29/2012 2:00 PM, Pedro L. Díez Orzas wrote:
>>>
>>>             Dear all,
>>>
>>>             Sorry for the delay. I tried to contact some people I
>>>             think can contribute to this, but they are not available
>>>             these weeks.
>>>
>>>             Before providing an example so we can all consider
>>>             whether it is worthwhile to keep the "semantic selector"
>>>             attribute in the consolidation of "Disambiguation", I
>>>             would like to make a couple of considerations:
>>>
>>>              1. We will probably not have any implementation in the
>>>                 short term, but there are, for example, a few
>>>                 semantic networks available on the web (see
>>>                 http://www.globalwordnet.org/gwa/wordnet_table.html)
>>>                 that could be mapped using semantic selectors. See
>>>                 online, for example, the famous
>>>                 http://wordnetweb.princeton.edu/perl/webwn.
>>>              2. The W3C SKOS (Simple Knowledge Organization System)
>>>                 working group may be dealing with similar things.
>>>
>>>             The "semantic selector" allows finer lexical
>>>             distinctions (simple words or multi-words) than a
>>>             "domain" or an ontology like NERD. Also, the denotation
>>>             is different from the "concept reference", above all
>>>             for parts of speech like verbs.
>>>
>>>             Within the same domain, referring to very similar
>>>             concepts, languages have semantic differences. Depending
>>>             on the semantic theory used, each tries to capture
>>>             these differences by means of different systems
>>>             (semantic features, semantic primitives, semantic nodes
>>>             (in semantic networks), other semantic representations).
>>>             An example could be the German verb "löschen", which in
>>>             different contexts can take different meanings, which
>>>             one can try to capture using different selectors, with
>>>             the different systems:
>>>
>>>             löschen -> clear      (some bits)
>>>                     -> delete     (files)
>>>                     -> cancel     (programs)
>>>                     -> erase      (a scratchpad)
>>>                     -> extinguish (a fire)
>>>
>>>             Other possible translations of the verb "löschen" are:
>>>
>>>             delete      löschen, streichen, tilgen, ausstreichen, herausstreichen
>>>             clear       löschen, klären, klarmachen, leeren, räumen, säubern
>>>             erase       löschen, auslöschen, tilgen, ausradieren, radieren, abwischen
>>>             extinguish  löschen, auslöschen, zerstören
>>>             quench      löschen, stillen, abschrecken, dämpfen
>>>             put out     löschen, bringen, ausmachen, ausschalten, treiben, verstimmen
>>>             unload      entladen, abladen, ausladen, löschen, abstoßen, abwälzen
>>>             discharge   entladen, erfüllen, entlassen, entlasten, löschen, ausstoßen
>>>             wipe out    auslöschen, löschen, ausrotten, tilgen, zunichte machen, auswischen
>>>             slake       stillen, löschen
>>>             close       schließen, verschließen, abschließen, sperren, zumachen, löschen
>>>             blot        löschen, abtupfen, klecksen, beklecksen, sich unmöglich machen, sich verderben
>>>             turn off    ausschalten, abbiegen, abstellen, abdrehen, einbiegen, löschen
>>>             blow out    auspusten, löschen, aufblasen, aufblähen, aufbauschen, platzen
>>>             zap         abknallen, düsen, umschalten, löschen, töten, kaputtmachen
>>>             redeem      einlösen, erlösen, zurückkaufen, tilgen, retten, löschen
>>>             pay off     auszahlen, bezahlen, tilgen, abzahlen, abbezahlen, löschen
>>>             switch out  löschen
>>>             unship      ausladen, entladen, abnehmen, löschen
>>>             souse       eintauchen, durchtränken, löschen, nass machen
>>>             rub off     abreiben, abgehen, abwetzen, ausradieren, abscheuern, löschen
>>>             strike off  löschen
>>>             land        landen, an Land gehen, kriegen, an Land ziehen, aufsetzen, löschen
>>>
>>>             According to this, the consolidation of the
>>>             disambiguation/namedEntity data categories under
>>>             "Terminology"
>>>             http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#disambiguation
>>>             could be the following. It is meant to cover operational
>>>             URI or XPath pointers to the three currently most
>>>             important kinds of semantic resources: conceptual
>>>             (ontologies), semantic (semantic networks or lexical
>>>             databases) and terminological (glossaries and
>>>             terminological resources), where ontologies are used
>>>             for both the general lexicon and terminology, semantic
>>>             networks represent the general vocabulary (lexicon),
>>>             and terminological resources the specialized vocabulary.
>>>
>>>             *disambiguation*
>>>
>>>             Includes data to be used by MT systems in disambiguating
>>>             difficult content
>>>
>>>             *Data model*
>>>
>>>               * concept reference: points to a *concept in an
>>>                 ontology* that this fragment of text represents. May
>>>                 be a URI or an XPath pointer.
>>>               * semantic selector: points to a *meaning in a
>>>                 semantic network* that this fragment of text
>>>                 represents. May be a URI or an XPath pointer.
>>>               * terminology reference: points to a *term in a
>>>                 terminological resource* that this fragment of text
>>>                 represents. May be a URI or an XPath pointer.
>>>               * equivalent translation: expressions of that concept
>>>                 in other languages, for example for training MT systems
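Picking up the pointer names floated at the top of this thread (its-entity-concept-ref, its-meaning-ref, its-term-ref - candidate names only, not agreed syntax, and all URIs here are made up), the four fields might serialize as something like:

```html
<!-- Hypothetical serialization; attribute names and URIs are
     illustrative only, not agreed ITS syntax. -->
<span its-disambiguation
      its-entity-concept-ref="http://example.org/ontology#Deletion"
      its-meaning-ref="http://www.sfs.uni-tuebingen.de/lsd/index.shtml#synset_loschen_3"
      its-term-ref="http://example.org/termbase#loeschen"
      its-equiv-translation="en:delete, en:erase">löschen</span>
```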
>>>
>>>             Also, I would keep *textAnalysisAnnotation*, since its
>>>             purpose is quite different.
>>>
>>>             Anyway, if we decide not to include "semantic selector"
>>>             now, maybe it can go into future versions or be treated
>>>             in liaison with other groups.
>>>
>>>             I hope it helps,
>>>
>>>             Pedro
>>>
>>>             __________________________________
>>>
>>>             Pedro L. Díez Orzas
>>>             Presidente Ejecutivo/CEO
>>>             Linguaserve Internacionalización de Servicios, S.A.
>>>             Tel.: +34 91 761 64 60  Fax: +34 91 542 89 28
>>>             E-mail: pedro.diez@linguaserve.com
>>>             www.linguaserve.com
>>>
>>>
>>>             ____________________________________
>>>
>>>
>>>
>>>             -- 
>>>             Felix Sasaki
>>>
>>>             DFKI / W3C Fellow
>>>
>>>
>>
>>
>>
>>
>>
Received on Monday, 18 June 2012 11:15:44 UTC
