
Re: [ISSUE-109]: disambiguation ITS 2.0 requirements w.r.t Indian [Indic] languages [ACTION-418]

From: Felix Sasaki <fsasaki@w3.org>
Date: Tue, 12 Feb 2013 09:53:42 +0100
Message-ID: <511A0316.80908@w3.org>
To: Somnath Chandra <schandra@deity.gov.in>
CC: Dave Lewis <dave.lewis@cs.tcd.ie>, slata@mit.gov.in, public-multilingualweb-lt-comments@w3.org, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>, Manoj Jain <mjain@deity.gov.in>
Thanks, Somnath. FYI, I have subscribed you to the 
public-multilingualweb-lt-comments list, so that your mails get through. 
At the moment I cannot give more direction than to suggest that you start 
writing :) Would it be OK for you to join the working group, to make it 
easier to move things forward?

Best,

Felix

On 12.02.13 07:02, Somnath Chandra wrote:
> Hello Felix, Dave and Others,
> Thanks for your very encouraging response. Your ideas are appropriate 
> for taking up the matter in a time-bound way. We have already mobilized 
> the language technology researchers in India and will develop the 
> best practices document quickly, in active collaboration 
> with you all. While developing the best practices, we shall also 
> capture the linguistic nuances of the different languages (22 
> constitutionally recognized Indian languages) and their requirements.
> Looking forward to your further direction.
> With best regards,
> Somnath
> On 02/12/13, Felix Sasaki <fsasaki@w3.org> wrote:
>>
>> Hi Dave, all,
>>
>> On 11.02.13 16:00, Dave Lewis wrote:
>>> Hi Somnath,
>>> thank you for your response. Your input into best practices would 
>>> be warmly welcomed by the group. I think this would split into two 
>>> potential parts:
>>> 1) producing a best practice for NIF usage in the context of typical 
>>> workflows that would use ITS
>>> 2) if during (1) the expressiveness of NIF was found wanting, you 
>>> may want to engage directly with the NIF community.
>>>
>>> I'd ask Felix and Sebastian Hellmann to also comment on the best 
>>> route to advancing ITS 2.0 best practice in this area. Felix, we 
>>> don't currently have a stub for best practice in relation to NIF on 
>>> the wiki, so should we start one?
>>
>> Mostly for Somnath et al.: As discussed in the group call today, that 
>> is just a question of manpower. At
>> http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page#Draft_documents_and_time_line
>> we have linked to potential "best practices" documents; see
>> http://www.w3.org/International/multilingualweb/lt/wiki/Best_Practice_Documents
>> Sure, we can add a NIF document there. We just need a volunteer to 
>> write it. So Somnath, if that is of interest to you, let us know. It 
>> might also make sense to involve Sebastian Hellmann, so I am putting 
>> him in CC here.
>>
>> Best,
>>
>> Felix
>>
>>>
>>> Regards,
>>> Dave
>>>
>>> On 07/02/2013 10:06, Somnath Chandra wrote:
>>>> Hello Lewis,
>>>> Thanks a lot for your feedback. We have studied the NIF encoding, 
>>>> and the Indian language requirements for hierarchical annotation 
>>>> need to be incorporated in detail in NIF.
>>>>
>>>> As defined in the NIF specification 
>>>> (http://nlp2rdf.org/nif-1-0#toc-part-of-speech-tags), part-of-speech 
>>>> tags should make use of the Ontologies of Linguistic Annotation 
>>>> (OLiA). OLiA connects local annotation tag sets with a global 
>>>> reference ontology. It therefore allows the specific part-of-speech 
>>>> tag to be kept at a fine granularity while also providing a 
>>>> coarse-grained reference model.
>>>>
>>>> OLiA defines OLiA Annotation Models for morphology, morphosyntax 
>>>> and syntax for multiple languages.
>>>>
>>>> However, there are three multilingual annotation models for 
>>>> morphological, morphosyntactic and syntactic annotation of Indian 
>>>> languages, i.e. the L-POSTS tagset (Baskaran et al., 2008), 
>>>> AnnCorra (Bharati et al., 2006) and the IIIT tagset (IIT, 2007).
>>>>
>>>> We are in the process of defining a common POS tagset for Indic 
>>>> languages, based on W3C Internationalization best practices. The 
>>>> draft standard has been developed and is undergoing testing 
>>>> and evaluation. Once it is finalized, the above three POS tagsets 
>>>> will be replaced by this national standard, which may be 
>>>> incorporated in NIF.
>>>>
>>>> We will certainly participate actively, together with the W3C 
>>>> team, in developing the best practices for the use of ITS with 
>>>> external NIF models.
>>>>
>>>>      With regards,
>>>>
>>>> Dr. Somnath Chandra
>>>> Scientist-E & Dy. Country Manager W3C India
>>>> Dept. of Electronics & Information Technology
>>>> Ministry of Communications & Information Technology
>>>> Govt. of India
>>>> Tel:+91-11-24364744,24301856
>>>> Fax: +91-11-24363099
>>>> e-mail :schandra@mit.gov.in
>>>>
>>>> On 02/04/13, Dave Lewis <dave.lewis@cs.tcd.ie> wrote:
>>>>>
>>>>> Hi Somnath,
>>>>> I wanted to follow up on this comment also. Do you have any 
>>>>> comments on our response? Was it satisfactory? Unless we hear 
>>>>> from you to the contrary, we will assume you are satisfied and 
>>>>> aim to close ISSUE-109 on the 11th February.
>>>>>
>>>>> Kind Regards,
>>>>> Dave
>>>>>
>>>>> On 28/01/2013 00:46, Dave Lewis wrote:
>>>>>> Hi Somnath,
>>>>>> I wanted to update you on the status of ISSUE-109, related to 
>>>>>> disambiguation.
>>>>>>
>>>>>> We discussed this at the WG face to face meeting last week, see:
>>>>>> http://www.w3.org/2013/01/23-mlw-lt-minutes.html#item37.
>>>>>>
>>>>>> The consensus was that hierarchical annotation for disambiguation 
>>>>>> is difficult to achieve technically. As you point out, the ITS 
>>>>>> override rules mean that any hierarchical annotation has to be 
>>>>>> supported explicitly with special attributes. However, doing this 
>>>>>> in a generic way is difficult, as we may also need to support 
>>>>>> multiple different annotations of the same text and would therefore 
>>>>>> have to map sub-annotations to specific parent ones.
>>>>>>
>>>>>> You may have seen that there has been extensive discussion on 
>>>>>> potentially merging the terminology and disambiguation data 
>>>>>> categories:
>>>>>> http://www.w3.org/2013/01/24-mlw-lt-minutes.html#item03
>>>>>> and
>>>>>> https://www.w3.org/International/multilingualweb/lt/track/issues/68
>>>>>>
>>>>>> At the meeting we asked the experts involved in considering 
>>>>>> technical solutions to this to also address your hierarchical 
>>>>>> annotation requirement, but this yielded no usable technical 
>>>>>> solution.
>>>>>>
>>>>>> We therefore propose to reject this suggested change.
>>>>>>
>>>>>> We would, however, point out that the external NIF encoding (see 
>>>>>> http://nlp2rdf.org/) would be better suited to capturing such 
>>>>>> hierarchical annotations. We would therefore welcome your input in 
>>>>>> formulating best practice for the use of ITS with external NIF 
>>>>>> models.
>>>>>>
>>>>>> Please let us know if you are satisfied with this response.
>>>>>>
>>>>>> I look forward to hearing from you.
>>>>>> Regards,
>>>>>> Dave Lewis
>>>>>>
>>>>>>
>>>>>> On 21/01/2013 15:02, Dave Lewis wrote:
>>>>>>> Hi,
>>>>>>> To speed the resolution of the different issues in your original 
>>>>>>> post I've restricted ISSUE-84 to comments about the translate 
>>>>>>> data category and raised two new issues:
>>>>>>> ISSUE-108: locNote ITS 2.0 requirements w.r.t Indian [Indic] 
>>>>>>> languages
>>>>>>> ISSUE-109: disambiguation ITS 2.0 requirements w.r.t Indian 
>>>>>>> [Indic] languages
>>>>>>>
>>>>>>> Regards,
>>>>>>> Dave
>>>>>>>
>>>>>>> On 18/01/2013 12:46, Dr. David Filip wrote:
>>>>>>>> Hi all, this comment is now associated with Issue-84
>>>>>>>> Rgds
>>>>>>>> dF
>>>>>>>>
>>>>>>>> Dr. David Filip
>>>>>>>> =======================
>>>>>>>> LRC | CNGL | LT-Web | CSIS
>>>>>>>> University of Limerick, Ireland
>>>>>>>> telephone: +353-6120-2781
>>>>>>>> *cellphone: +353-86-0222-158*
>>>>>>>> facsimile: +353-6120-2734
>>>>>>>> mailto: david.filip@ul.ie
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jan 16, 2013 at 8:52 AM, Felix Sasaki <fsasaki@w3.org> wrote:
>>>>>>>>
>>>>>>>>     Forwarded on behalf of Somnath Chandra (by permission),
>>>>>>>>     with CC to Somnath and Svaran Lata. The comments are not yet
>>>>>>>>     in tracker. See also new comments (also not in tracker yet)
>>>>>>>>     on the www-international list at
>>>>>>>>     http://lists.w3.org/Archives/Public/www-international/2013JanMar/0065.html
>>>>>>>>
>>>>>>>>     If you have input for replying to the comments, please
>>>>>>>>     provide it on our comments list (but feel free to put
>>>>>>>>     others in CC to speed up the process).
>>>>>>>>
>>>>>>>>     Best,
>>>>>>>>
>>>>>>>>     Felix
>>>>>>>>
>>>>>>>>
>>>>>>>>     -------- Original Message --------
>>>>>>>>     Subject: 	Fwd: ITS 2.0 requirements w.r.t Indian languages
>>>>>>>>     Date: 	Wed, 16 Jan 2013 13:46:52 +0530
>>>>>>>>     From: 	Somnath Chandra <schandra@deity.gov.in>
>>>>>>>>     To: 	Felix Sasaki <fsasaki@w3.org>
>>>>>>>>     CC: 	slata <slata@mit.gov.in>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>     Dear Dr. Felix Sasaki,
>>>>>>>>     W3C India has compiled the Indic Languages requirements for
>>>>>>>>     ITS 2.0. Kindly find enclosed the draft document developed
>>>>>>>>     for the purpose.
>>>>>>>>     Submitted for kind perusal. Please feel free to contact me
>>>>>>>>     for any further clarifications / discussions.
>>>>>>>>     With best regards,
>>>>>>>>     Somnath , W3C India
>>>>>>>>     Dr. Somnath Chandra
>>>>>>>>     Joint Director and Dy. Country Manager , W3C India
>>>>>>>>     Dept. of Electronics & Information Technology
>>>>>>>>     Ministry of Communications & Information Technology
>>>>>>>>     Govt. of India
>>>>>>>>     Tel:+91-11-24364744,24301811
>>>>>>>>     Fax: +91-11-24363099
>>>>>>>>     e-mail :schandra@mit.gov.in
>>>>>>>>     -------- Original Message --------
>>>>>>>>     From: Prashant Verma <vermaprashant1@gmail.com>
>>>>>>>>     Date: Jan 10, 2013 2:20:32 PM
>>>>>>>>     Subject: ITS 2.0 requirements w.r.t Indian languages
>>>>>>>>     To: schandra@mit.gov.in
>>>>>>>>
>>>>>>>>
>>>>>>>>     -- 
>>>>>>>>
>>>>>>>>     Prashant Verma I  Sr. Software Engineer
>>>>>>>>     W3C India
>>>>>>>>     New Delhi
>>>>>>>>     Cell : +91-8800521042
>>>>>>>>     Website : http://www.w3cindia.in
>>>>>>>>     --
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>> --
>>>
>>
> --
> Dr. Somnath Chandra
> Scientist-E
> Dept. of Electronics & Information Technology
> Ministry of Communications & Information Technology
> Govt. of India
> Tel:+91-11-24364744,24301856
> Fax: +91-11-24363099
> e-mail :schandra@mit.gov.in
Received on Tuesday, 12 February 2013 08:54:13 GMT
