Re: [ISSUE-109]: disambiguation ITS 2.0 requirements w.r.t Indian [Indic] languages [ACTION-418]

From: Felix Sasaki <fsasaki@w3.org>
Date: Mon, 11 Feb 2013 21:39:28 +0100
Message-ID: <51195700.9020806@w3.org>
To: Dave Lewis <dave.lewis@cs.tcd.ie>
CC: Somnath Chandra <schandra@deity.gov.in>, slata@mit.gov.in, public-multilingualweb-lt-comments@w3.org, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
Hi Dave, all,

On 11.02.13 16:00, Dave Lewis wrote:
> Hi Somnath,
> thank you for your response. Your input into best practices would be 
> warmly welcomed by the group. I think this would split into two 
> potential parts:
> 1) producing a best practice for NIF usage in the context of typical 
> workflows that would use ITS
> 2) if during (1) the expressiveness of NIF was found wanting, you may 
> want to engage directly with the NIF community.
>
> I'd ask Felix and Sebastian Hellmann to also comment on the best route 
> to advancing ITS 2.0 best practice in this area. Felix, we don't 
> currently have a stub for best practice in relation to NIF on the 
> wiki, so should we start one?

Mostly for Somnath et al.: As discussed in the group call today, that is 
just a question of manpower. At
http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page#Draft_documents_and_time_line
we have linked to potential "best practices" documents, see
http://www.w3.org/International/multilingualweb/lt/wiki/Best_Practice_Documents
Sure, we can add a NIF document there. We just need a volunteer to write 
it. So Somnath, if that is of interest to you, let us know. It might 
also make sense to involve Sebastian Hellmann, so I am putting him in CC.

Best,

Felix

>
> Regards,
> Dave
>
> On 07/02/2013 10:06, Somnath Chandra wrote:
>> Hello Lewis,
>> Thanks a lot for your feedback. We have studied the NIF encoding, 
>> and the Indian language requirements for hierarchical annotation need 
>> to be incorporated in detail in NIF.
>>
>> As defined in NIF Version 
>> (http://nlp2rdf.org/nif-1-0#toc-part-of-speech-tags), part-of-speech 
>> tags should make use of the Ontologies of Linguistic Annotations 
>> (OLiA). OLiA connects local annotation tag sets with a global 
>> reference ontology. It therefore allows the specific part-of-speech 
>> tag to be kept at a fine granularity while at the same time providing 
>> a coarse-grained reference model.
>>
>> OLiA defines OLiA Annotation Models for multilingual morphology, 
>> morphosyntax and syntax.
>>
>> However, there are three multilingual annotation models for 
>> morphological, morphosyntactic and syntactic annotation of Indian 
>> languages, i.e. the L-POSTS tagset (Baskaran et al., 2008), 
>> AnnCorra (Bharati et al., 2006) and the IIIT tagset (IIT, 2007).
>>
>> We are in the process of defining a common POS tagset for Indic 
>> languages, based on W3C Internationalization best practices. The 
>> draft standard has been developed and is being tested and 
>> evaluated. Once finalized, the above three POS tagsets would be 
>> replaced by this national standard, which may be incorporated in NIF.
>>
>> We would definitely participate actively in developing the best 
>> practices for the use of ITS with external NIF models with the W3C team.
>>
>> With regards,
>>
>> Dr. Somnath Chandra
>> Scientist-E & Dy. Country Manager W3C India
>> Dept. of Electronics & Information Technology
>> Ministry of Communications & Information Technology
>> Govt. of India
>> Tel:+91-11-24364744,24301856
>> Fax: +91-11-24363099
>> e-mail :schandra@mit.gov.in
>>
>> On 02/04/13, Dave Lewis <dave.lewis@cs.tcd.ie> wrote:
>>>
>>> Hi Somnath,
>>> I wanted to follow up on this comment also. Do you have any comments 
>>> on our response? Was it satisfactory? Unless we hear from you to the 
>>> contrary, we will assume you are satisfied and aim to close ISSUE-109 
>>> on the 11th February.
>>>
>>> Kind Regards,
>>> Dave
>>>
>>> On 28/01/2013 00:46, Dave Lewis wrote:
>>>> Hi Somnath,
>>>> I wanted to update you on the status of ISSUE-109, related to 
>>>> disambiguation.
>>>>
>>>> We discussed this at the WG face to face meeting last week, see:
>>>> http://www.w3.org/2013/01/23-mlw-lt-minutes.html#item37.
>>>>
>>>> The consensus was that hierarchical annotation for disambiguation 
>>>> was difficult to achieve technically. As you point out, the ITS 
>>>> override rules mean that any hierarchical annotation has to be 
>>>> supported explicitly with special attributes. However, doing this 
>>>> in a generic way is difficult, as we may also need to support 
>>>> multiple different annotations of the same text, and therefore map 
>>>> sub-annotations to specific parent ones.
>>>>
>>>> You may have seen that there has been extensive discussion on 
>>>> potentially merging the terminology and disambiguation data categories:
>>>> http://www.w3.org/2013/01/24-mlw-lt-minutes.html#item03
>>>> and
>>>> https://www.w3.org/International/multilingualweb/lt/track/issues/68
>>>>
>>>> At the meeting we asked the experts involved in considering 
>>>> technical solutions to this to also address your hierarchical 
>>>> annotation requirement, but this yielded no usable technical solution.
>>>>
>>>> We therefore propose to reject this suggested change.
>>>>
>>>> We would, however, point out that the external NIF encoding (see 
>>>> http://nlp2rdf.org/) would be better suited to capturing such 
>>>> hierarchical annotations. We would therefore welcome your input in 
>>>> formulating best practice for the use of ITS with external NIF models.
>>>>
>>>> Please let us know if you are satisfied with this response.
>>>>
>>>> I look forward to hearing from you.
>>>> Regards,
>>>> Dave Lewis
>>>>
>>>>
>>>> On 21/01/2013 15:02, Dave Lewis wrote:
>>>>> Hi,
>>>>> To speed the resolution of the different issues in your original 
>>>>> post, I've restricted ISSUE-84 to comments about the translate data 
>>>>> category and raised two new issues:
>>>>> ISSUE-108: locNote ITS 2.0 requirements w.r.t Indian [Indic] languages
>>>>> ISSUE-109: disambiguation ITS 2.0 requirements w.r.t Indian 
>>>>> [Indic] languages
>>>>>
>>>>> Regards,
>>>>> Dave
>>>>>
>>>>> On 18/01/2013 12:46, Dr. David Filip wrote:
>>>>>> Hi all, this comment is now associated with Issue-84
>>>>>> Rgds
>>>>>> dF
>>>>>>
>>>>>> Dr. David Filip
>>>>>> =======================
>>>>>> LRC | CNGL | LT-Web | CSIS
>>>>>> University of Limerick, Ireland
>>>>>> telephone: +353-6120-2781
>>>>>> *cellphone: +353-86-0222-158*
>>>>>> facsimile: +353-6120-2734
>>>>>> mailto: david.filip@ul.ie
>>>>>>
>>>>>>
>>>>>> On Wed, Jan 16, 2013 at 8:52 AM, Felix Sasaki <fsasaki@w3.org> wrote:
>>>>>>
>>>>>>     Forwarded on behalf of Somnath Chandra (by permission), with
>>>>>>     CC to Somnath and Svaran Lata. The comments are not yet in
>>>>>>     tracker. See also new comments (also not in tracker yet) on
>>>>>>     the www-international list at
>>>>>>     http://lists.w3.org/Archives/Public/www-international/2013JanMar/0065.html
>>>>>>
>>>>>>     If you have input for replying to the comments, please
>>>>>>     provide it on our comments list (but feel free to put others
>>>>>>     in CC to speed up the process).
>>>>>>
>>>>>>     Best,
>>>>>>
>>>>>>     Felix
>>>>>>
>>>>>>
>>>>>>     -------- Original Message --------
>>>>>>     Subject: Fwd: ITS 2.0 requirements w.r.t Indian languages
>>>>>>     Date: Wed, 16 Jan 2013 13:46:52 +0530
>>>>>>     From: Somnath Chandra <schandra@deity.gov.in>
>>>>>>     To: Felix Sasaki <fsasaki@w3.org>
>>>>>>     CC: slata <slata@mit.gov.in>
>>>>>>
>>>>>>
>>>>>>
>>>>>>     Dear Dr. Felix Sasaki,
>>>>>>     W3C India has compiled the Indic Languages requirements for
>>>>>>     ITS 2.0. Kindly find enclosed the draft document developed
>>>>>>     for the purpose.
>>>>>>     Submitted for kind perusal. Please feel free to contact me
>>>>>>     for any further clarifications / discussions.
>>>>>>     With best regards,
>>>>>>     Somnath, W3C India
>>>>>>     Dr. Somnath Chandra
>>>>>>     Joint Director and Dy. Country Manager, W3C India
>>>>>>     Dept. of Electronics & Information Technology
>>>>>>     Ministry of Communications & Information Technology
>>>>>>     Govt. of India
>>>>>>     Tel: +91-11-24364744,24301811
>>>>>>     Fax: +91-11-24363099
>>>>>>     e-mail: schandra@mit.gov.in
>>>>>>     -------- Original Message --------
>>>>>>     From: Prashant Verma <vermaprashant1@gmail.com>
>>>>>>     Date: Jan 10, 2013 2:20:32 PM
>>>>>>     Subject: ITS 2.0 requirements w.r.t Indian languages
>>>>>>     To: schandra@mit.gov.in
>>>>>>
>>>>>>
>>>>>>     -- 
>>>>>>
>>>>>>     Prashant Verma I  Sr. Software Engineer
>>>>>>     W3C India
>>>>>>     New Delhi
>>>>>>     Cell: +91-8800521042
>>>>>>     Website: http://www.w3cindia.in
>>>>>>     --
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>> --
>
Received on Monday, 11 February 2013 20:39:59 GMT
