
[ISSUE-108] locNote ITS 2.0 requirements w.r.t Indian [Indic] languages

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Wed, 23 Jan 2013 02:11:09 +0000
Message-ID: <50FF46BD.8030707@cs.tcd.ie>
To: Felix Sasaki <fsasaki@w3.org>
CC: "public-multilingualweb-lt-comments@w3.org" <public-multilingualweb-lt-comments@w3.org>, Somnath Chandra <schandra@deity.gov.in>, slata <slata@mit.gov.in>
Hi Somnath,
In relation to Localization Note, you indicate that correct translation
of some Indic languages is sensitive to part-of-speech annotation, and
you suggest some encodings that could be used with Localization Note to
provide such annotation.

My personal feeling is that adding such an encoding to the format of
this data category would place a large implementation burden on many
potential adopters who would not require it. However, we have seen with
ITS 1.0 that companies have used the value of this attribute to encode
their own name:value formats.
e.g.
<span its-loc-note="pos:N_NNN">????</span>

Would this address your comment?

This would not require a change to the specification, but could be
captured in a separate best practice document, perhaps one specifically
targeting the use of ITS for Indic languages more generally, if you were
interested in contributing to this.
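
As a rough illustration of how a consuming tool might read the name:value
convention mentioned above, here is a minimal sketch in Python. It assumes
the note value is split at the first colon; that rule, along with the names
LocNoteParser and extract_loc_notes, is hypothetical and not defined by
ITS 2.0. Notes without a colon are kept as plain free-text notes.

```python
from html.parser import HTMLParser


class LocNoteParser(HTMLParser):
    """Collects its-loc-note attribute values found on start tags."""

    def __init__(self):
        super().__init__()
        self.notes = []  # list of (tag, name, value) tuples

    def handle_starttag(self, tag, attrs):
        for attr_name, attr_value in attrs:
            if attr_name != "its-loc-note" or not attr_value:
                continue
            # Assumed convention: "name:value", split at the first colon.
            name, sep, value = attr_value.partition(":")
            if sep:
                self.notes.append((tag, name.strip(), value.strip()))
            else:
                # No colon: treat as an ordinary free-text note.
                self.notes.append((tag, None, attr_value))


def extract_loc_notes(html):
    """Return every localization note in an HTML fragment."""
    parser = LocNoteParser()
    parser.feed(html)
    parser.close()
    return parser.notes


if __name__ == "__main__":
    sample = '<p>A <span its-loc-note="pos:N_NNN">word</span> here.</p>'
    for tag, name, value in extract_loc_notes(sample):
        print(tag, name, value)
```

Because the attribute value stays an opaque string as far as ITS is
concerned, tools that do not know the convention would simply present
"pos:N_NNN" to the translator as an ordinary note.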

Two other possibilities may exist.
1) An entirely new part-of-speech data category. This is currently
outside the scope of ITS 2.0, but we could collect requirements, as we
are starting to record needs that are not covered by ITS 2.0 for future
activities.

2) I believe the NLP Interchange Format (NIF) can encode details such
as POS. ITS 2.0 has a mapping to NIF, so perhaps this POS information
could usefully be recorded externally to the document using NIF within
the context of this mapping. But I'd ask the NIF experts, Felix and
Sebastien, to comment on this possibility.

Please let us know what you think.
kind regards,

Dave Lewis

On 16/01/2013 08:52, Felix Sasaki wrote:
> Forwarded on behalf of Somnath Chandra (by permission), with CC to 
> Somnath and Svaran Lata. The comments are not yet in tracker. See also 
> new comments (also not in tracker yet) on the www-international list at
> http://lists.w3.org/Archives/Public/www-international/2013JanMar/0065.html
>
> If you have input for replying to the comments, please provide it on 
> our comments list (but feel free to put others in CC to speed up the 
> process).
>
> Best,
>
> Felix
>
>
> -------- Original Message --------
> Subject: 	Fwd: ITS 2.0 requirements w.r.t Indian languages
> Date: 	Wed, 16 Jan 2013 13:46:52 +0530
> From: 	Somnath Chandra <schandra@deity.gov.in>
> To: 	Felix Sasaki <fsasaki@w3.org>
> CC: 	slata <slata@mit.gov.in>
>
>
>
> Dear Dr. Felix Sasaki,
> W3C India has compiled the Indic Languages requirements for ITS 2.0. 
> Kindly find enclosed the draft document developed for the purpose.
> Submitted for kind perusal. Please feel free to contact me for any 
> further clarifications / discussions.
> With best regards,
> Somnath , W3C India
> Dr. Somnath Chandra
> Joint Director and Dy. Country Manager , W3C India
> Dept. of Electronics & Information Technology
> Ministry of Communications & Information Technology
> Govt. of India
> Tel:+91-11-24364744,24301811
> Fax: +91-11-24363099
> e-mail :schandra@mit.gov.in
> -------- Original Message --------
> From: Prashant Verma <vermaprashant1@gmail.com>
> Date: Jan 10, 2013 2:20:32 PM
> Subject: ITS 2.0 requirements w.r.t Indian languages
> To: schandra@mit.gov.in
>
>
> -- 
>
> Prashant Verma | Sr. Software Engineer
> W3C India
> New Delhi
> Cell : +91-8800521042
> Website : http://www.w3cindia.in
> --
>
>
Received on Wednesday, 23 January 2013 02:11:48 UTC
