
Re: [ISSUE-109]: disambiguation ITS 2.0 requirements w.r.t Indian [Indic] languages [ACTION-418]

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Mon, 11 Feb 2013 15:00:55 +0000
Message-ID: <511907A7.3040709@cs.tcd.ie>
To: Somnath Chandra <schandra@deity.gov.in>
CC: slata@mit.gov.in, public-multilingualweb-lt-comments@w3.org
Hi Somnath,
Thank you for your response. Your input into best practices would be 
warmly welcomed by the group. I think this would split into two potential 
parts:
1) producing a best practice for NIF usage in the context of typical 
workflows that would use ITS
2) if during (1) the expressiveness of NIF was found wanting, you may 
want to engage directly with the NIF community.

I'd ask Felix and Sebastian Hellmann to also comment on the best route to 
advancing ITS 2.0 best practice in this area. Felix, we don't currently 
have a stub for best practice in relation to NIF on the wiki, so should 
we start one?

Regards,
Dave

On 07/02/2013 10:06, Somnath Chandra wrote:
> Hello Lewis,
> Thanks a lot for your feedback. We have studied the NIF encoding, 
> and the Indian-language requirements for hierarchical annotation need 
> to be incorporated in detail in NIF.
>
> As defined in NIF Version 1.0 
> (http://nlp2rdf.org/nif-1-0#toc-part-of-speech-tags), part-of-speech 
> tags should make use of the Ontologies of Linguistic Annotation (OLiA). 
> OLiA connects local annotation tag sets with a global reference 
> ontology. It therefore allows the specific part-of-speech tag to be 
> kept at a fine granularity while also providing a coarse-grained 
> reference model.
>
> OLiA defines OLiA Annotation Models for morphology, morphosyntax and 
> syntax for multiple languages.
>
> However, there are three multilingual annotation models for 
> morphological, morphosyntactic and syntactic annotation of Indian 
> languages, i.e. the L-POSTS tagset (Baskaran et al., 2008), AnnCorra 
> (Bharati et al., 2006), and the IIIT tagset (IIT, 2007).
>
> We are in the process of defining a common POS tagset for Indic 
> languages, based on W3C Internationalization best practices. The 
> draft standard has been developed and is undergoing testing and 
> evaluation. Once finalized, this national standard would replace the 
> above three POS tagsets and may be incorporated in NIF.
>
> We would certainly participate actively in developing the best 
> practices for use of ITS with external NIF models with the W3C team.
>
>      With regards,
>
> Dr. Somnath Chandra
> Scientist-E & Dy. Country Manager W3C India
> Dept. of Electronics & Information Technology
> Ministry of Communications & Information Technology
> Govt. of India
> Tel:+91-11-24364744,24301856
> Fax: +91-11-24363099
> e-mail :schandra@mit.gov.in
>
> On 02/04/13, *Dave Lewis *<dave.lewis@cs.tcd.ie> wrote:
>>
>> Hi Somnath,
>> I wanted to follow up on this comment also. Do you have any comments 
>> on our response? Was it satisfactory? Unless we hear from you to the 
>> contrary, we will assume you are satisfied and aim to close ISSUE-109 
>> on the 11th February.
>>
>> Kind Regards,
>> Dave
>>
>> On 28/01/2013 00:46, Dave Lewis wrote:
>>> Hi Somnath,
>>> I wanted to update you on the status of ISSUE-109, related to 
>>> disambiguation.
>>>
>>> We discussed this at the WG face to face meeting last week, see:
>>> http://www.w3.org/2013/01/23-mlw-lt-minutes.html#item37.
>>>
>>> The consensus was that hierarchical annotation for disambiguation 
>>> was difficult to achieve technically. As you point out, the ITS 
>>> override rules mean that any hierarchical annotation has to be 
>>> supported explicitly with special attributes. However, doing this in 
>>> a generic way is difficult, as we may also need to support multiple 
>>> different annotations of the same text, and therefore map 
>>> sub-annotations to specific parent ones.
>>>
>>> You may have seen that there has been extensive discussion on 
>>> potentially merging the terminology and disambiguation data categories:
>>> http://www.w3.org/2013/01/24-mlw-lt-minutes.html#item03
>>> and
>>> https://www.w3.org/International/multilingualweb/lt/track/issues/68
>>>
>>> At the meeting we asked the experts involved in considering 
>>> technical solutions to this to also address your hierarchical 
>>> annotation requirement, but this yielded no usable technical solution.
>>>
>>> We therefore propose to reject this suggested change.
>>>
>>> We would, however, point out that the external NIF encoding (see 
>>> http://nlp2rdf.org/) would be better suited to capturing such 
>>> hierarchical annotations. We would therefore welcome your input in 
>>> formulating best practice for the use of ITS with external NIF models.
>>>
>>> Please let us know if you are satisfied with this response.
>>>
>>> I look forward to hearing from you.
>>> Regards,
>>> Dave Lewis
>>>
>>>
>>> On 21/01/2013 15:02, Dave Lewis wrote:
>>>> Hi,
>>>> To speed the resolution of the different issues in your original 
>>>> post, I've restricted ISSUE-84 to comments about the translate data 
>>>> category and raised two new issues:
>>>> ISSUE-108: locNote ITS 2.0 requirements w.r.t Indian [Indic] languages
>>>> ISSUE-109: disambiguation ITS 2.0 requirements w.r.t Indian [Indic] 
>>>> languages
>>>>
>>>> Regards,
>>>> Dave
>>>>
>>>> On 18/01/2013 12:46, Dr. David Filip wrote:
>>>>> Hi all, this comment is now associated with Issue-84
>>>>> Rgds
>>>>> dF
>>>>>
>>>>> Dr. David Filip
>>>>> =======================
>>>>> LRC | CNGL | LT-Web | CSIS
>>>>> University of Limerick, Ireland
>>>>> telephone: +353-6120-2781
>>>>> *cellphone: +353-86-0222-158*
>>>>> facsimile: +353-6120-2734
>>>>> mailto: david.filip@ul.ie <mailto:david.filip@ul.ie>
>>>>>
>>>>>
>>>>> On Wed, Jan 16, 2013 at 8:52 AM, Felix Sasaki <fsasaki@w3.org 
>>>>> <mailto:fsasaki@w3.org>> wrote:
>>>>>
>>>>>     Forwarded on behalf of Somnath Chandra (by permission), with
>>>>>     CC to Somnath and Svaran Lata. The comments are not yet in
>>>>>     tracker. See also new comments (also not in tracker yet) on
>>>>>     the www-international list at
>>>>>     http://lists.w3.org/Archives/Public/www-international/2013JanMar/0065.html
>>>>>
>>>>>     If you have input for replying to the comments, please provide
>>>>>     it on our comments list (but feel free to put others in CC to
>>>>>     speed up the process).
>>>>>
>>>>>     Best,
>>>>>
>>>>>     Felix
>>>>>
>>>>>
>>>>>     -------- Original Message --------
>>>>>     Subject: Fwd: ITS 2.0 requirements w.r.t Indian languages
>>>>>     Date: Wed, 16 Jan 2013 13:46:52 +0530
>>>>>     From: Somnath Chandra <schandra@deity.gov.in>
>>>>>     <mailto:schandra@deity.gov.in>
>>>>>     To: Felix Sasaki <fsasaki@w3.org> <mailto:fsasaki@w3.org>
>>>>>     CC: slata <slata@mit.gov.in> <mailto:slata@mit.gov.in>
>>>>>
>>>>>
>>>>>
>>>>>     Dear Dr. Felix Sasaki,
>>>>>     W3C India has compiled the Indic Languages requirements for
>>>>>     ITS 2.0. Kindly find enclosed the draft document developed for
>>>>>     the purpose.
>>>>>     Submitted for kind perusal. Please feel free to contact me for
>>>>>     any further clarifications / discussions.
>>>>>     With best regards,
>>>>>     Somnath , W3C India
>>>>>     Dr. Somnath Chandra
>>>>>     Joint Director and Dy. Country Manager , W3C India
>>>>>     Dept. of Electronics & Information Technology
>>>>>     Ministry of Communications & Information Technology
>>>>>     Govt. of India
>>>>>     Tel:+91-11-24364744,24301811 <tel:+91-11-24364744,24301811>
>>>>>     Fax: +91-11-24363099 <tel:%2B91-11-24363099>
>>>>>     e-mail :schandra@mit.gov.in <mailto:schandra@mit.gov.in>
>>>>>     -------- Original Message --------
>>>>>     From: *Prashant Verma *<vermaprashant1@gmail.com>
>>>>>     <mailto:vermaprashant1@gmail.com>
>>>>>     Date: Jan 10, 2013 2:20:32 PM
>>>>>     Subject: ITS 2.0 requirements w.r.t Indian languages
>>>>>     To: schandra@mit.gov.in <mailto:schandra@mit.gov.in>
>>>>>
>>>>>
>>>>>     -- 
>>>>>
>>>>>     Prashant Verma I  Sr. Software Engineer
>>>>>     W3C India
>>>>>     New Delhi
>>>>>     Cell : +91-8800521042 <tel:%2B91-8800521042>
>>>>>     Website : http://www.w3cindia.in <http://www.w3cindia.in/>
>>>>>     --
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
> --
Received on Monday, 11 February 2013 15:00:32 GMT
