- From: Somnath Chandra <schandra@deity.gov.in>
- Date: Tue, 24 Jun 2014 14:15:58 +0530
- To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, indic <public-i18n-indic@w3.org>
- Cc: slata <slata@mit.gov.in>, Manoj Jain <mjain@deity.gov.in>, prashant verma <vermaprashant1@gmail.com>
- Message-id: <fb1cbf12107f5.53a9881e@nic.in>
Thanks Martin. Will look into it.
Regards,
Somnath
On 06/24/14 01:54 PM, "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:
>
> Hello Somnath,
>
> On 2014/06/24 13:47, Somnath Chandra wrote:
> >Dear All,
> >
> >Please find below the revised definition of the Indic syllable, as circulated in the appended mail of June 17, 2014. The definition is generic in nature, to suit most Indian languages [11 languages tested]. Please send your feedback towards finalization.
> >
> >With regards,
> >Somnath
> >
> >-------- Original Message --------
> >From: Swaran Lata <slata@deity.gov.in>
> >Date: Jun 17, 2014 5:20:20 PM
> >Subject: ABNF definition of Indic syllable
> >To: public-i18n-indic@w3.org
> >Cc: Somnath Chandra <schandra@mit.gov.in>, Manoj Jain <mjain@mit.gov.in>
> >
> >
> >Dear All,
> >
> >
> >The definition of the Indic syllable has been revised as follows:
> >
> >V[m] | {CH}C[v][m] | CH
> >
> >
> >
> >
> >The linguistic definition of the Indic syllable has been mapped to ABNF (Augmented Backus–Naur Form) for the purposes of text segmentation, line breaking, drop letters, letter spacing in horizontal text, and vertical text representation. The definition is elaborated below, taking Hindi as an example.
> >
> >
> >
> >
> >The definition is a combination of three rules:
> >
> >
> >
> >
> >Rule 1 : V[m]
> >
> >Rule 2 : {CH}C[v][m]
> >
> >Rule 3 : CH (This rule is applicable only at the end of the word)
>
> In European languages, as far as I know, a final consonant would be considered part of the preceding syllable, not a syllable on its own.
> As an example, "cat" would be considered as one syllable, not two syllables ("ca" and "t").
>
> Would one really put a word-final consonant on a new line in Indic languages?
>
> Just wondering.
>
> Regards, Martin.
>
>
> >V (upper case) is a complete vowel
> >
> >m is a modifier (Anusvara/Visarga/Chandrabindu)
> >
> >C is a consonant as per the Unicode definition, which may or may not include a nukta
> >
> >v (lower case) is any dependent vowel or vowel sign (mātrā)
> >
> >H is halant / virama
> >
> >| is a rule separator
> >
> >[ ] - The enclosed item is optional
> >
> >{ } - The enclosed item or items occur zero or more times
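
A minimal sketch of how the three rules and the symbol legend above might be turned into a segmenter, written in Python for Devanagari only. The code point ranges chosen for V, C, v and m, the names `SYLLABLE` and `syllables`, and the reading of "end of the word" as "not followed by another Devanagari letter or sign" are assumptions made for illustration; they are not taken from the proposal or from its test suite.

```python
import re

# Devanagari code point classes (assumed, simplified ranges for illustration).
V = "[\u0904-\u0914\u0960\u0961]"              # complete (independent) vowels
C = "(?:[\u0915-\u0939\u0958-\u095F]\u093C?)"  # consonant, optionally followed by nukta
v = "[\u093E-\u094C\u0962\u0963]"              # dependent vowel signs
m = "[\u0901-\u0903]"                          # chandrabindu, anusvara, visarga
H = "\u094D"                                   # halant / virama

SYLLABLE = re.compile(
    f"{V}{m}?"                         # Rule 1: V[m]
    f"|(?:{C}{H})*{C}{v}?{m}?(?!{H})"  # Rule 2: {CH}C[v][m]
    f"|{C}{H}(?![\u0900-\u0963])"      # Rule 3: CH, only at the end of a word
)

def syllables(text):
    """Split text into syllables; unmatched code points are emitted one by one."""
    out = []
    pos = 0
    while pos < len(text):
        match = SYLLABLE.match(text, pos)
        end = match.end() if match else pos + 1  # fallback for anything outside the rules
        out.append(text[pos:end])
        pos = end
    return out

print(syllables("स्वागतम्"))  # ['स्वा', 'ग', 'त', 'म्']
```

The `(?!H)` guard at the end of Rule 2 is a design choice of this sketch: it keeps Rule 2 from matching a bare consonant and leaving its halant behind, so a word-final consonant + halant falls through to Rule 3.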
> >
> >
> >
> >
> >Examples:
> >
> >Rule 1 : V[m]
> >
> >
> >
> >Sl. No. | Examples | Definition
> >1. | अ, ई, उ | V (Vowel) is a syllable
> >2. | अं, उँ, आः | V + Modifier is a syllable
> >
> >
> >
> >
> >
> >
> >Rule 2 : {CH}C[v][m]
> >
> >
> >
> >Sl. No. | Examples | Definition
> >1. | र, क, ज, ल, म | A consonant is a syllable
> >2. | प्प, क्ख, च्त, ज्ज्व, त्क्ल, त्स्न | Zero or more consonant + virama sequences followed by a consonant is a syllable
> >3. | र्त, र्त्स, र्त्स्न, र्त्स्न्य, फ़्क़ | Zero or more consonant (+ nukta) + virama sequences followed by a consonant is a syllable
> >4. | र्ता, र्त्स्न्या, फ़्जी, क्या | Zero or more consonant (+ nukta) + virāma sequences followed by a consonant (+ nukta) followed by a vowel sign is a syllable
> >5. | तः, स्तं, स्त्रँ, स्तः, फ़्ज़ँ | Zero or more consonant (+ nukta) + virāma sequences followed by a consonant (+ nukta) followed by a modifier is a syllable
> >6. | र्त्स्न्या: त्स्न्युं, त्स्न्युँ, फ़्ज़ें, हिं | Zero or more consonant (+ nukta) + virāma sequences followed by a consonant (+ nukta) followed by a vowel sign and a modifier is a syllable
> >7. | स्थि, ज्जि, ख्वा | Zero or more consonant + halant sequences followed by a consonant followed by a vowel sign is a syllable
> >
> >
> >
> >
> >Rule 3 : CH
> >
> >त्, व्, म्, भ् etc. are syllables in Hindi only at the end of a word
> >
> >Examples of combination of the rules :
> >
> >1. स्वागतम् - CHCv + C + C + CH has the following syllables:
> >
> >स्वा | CHCv
> >ग | C
> >त | C
> >म् | CH
> >
> >
> >
> >
> >2. भरतनाट्यम - C + C + C + Cv + CHC + C
> >
> >भ | C
> >र | C
> >त | C
> >ना | Cv
> >ट्य | CHC
> >म | C
> >
> >
> >
> >
> >3. सद्बुद्धि - C + CHCv + CHCv
> >
> >स | C
> >द्बु | CHCv
> >द्धि | CHCv
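
The decompositions in the three worked examples above can be reproduced by first mapping each code point to its symbol and then applying the rules to the resulting symbol string. The Python sketch below does this for a single Devanagari word. The function names `label` and `decompose`, the code point ranges, and the handling of nukta (folded into the preceding consonant) are illustrative assumptions, not part of the proposal.

```python
import re

def label(ch):
    """Map one Devanagari code point to a symbol of the definition (assumed ranges)."""
    cp = ord(ch)
    if 0x0904 <= cp <= 0x0914:
        return "V"  # complete (independent) vowel
    if 0x0915 <= cp <= 0x0939 or 0x0958 <= cp <= 0x095F:
        return "C"  # consonant
    if 0x093E <= cp <= 0x094C:
        return "v"  # dependent vowel sign
    if 0x0901 <= cp <= 0x0903:
        return "m"  # chandrabindu, anusvara, visarga
    if cp == 0x094D:
        return "H"  # halant / virama
    if cp == 0x093C:
        return "n"  # nukta, treated as part of the preceding consonant
    return "?"      # anything outside the definition

# The three rules, written over symbols instead of code points.
# "Cn?" is a consonant with an optional nukta; "$" restricts Rule 3 to the end of the word.
RULES = re.compile(r"Vm?|(?:Cn?H)*Cn?v?m?(?!H)|Cn?H$")

def decompose(word):
    """Return (syllable, pattern) pairs for one word."""
    labels = "".join(label(ch) for ch in word)
    parts = []
    pos = 0
    while pos < len(word):
        match = RULES.match(labels, pos)
        end = match.end() if match else pos + 1  # fallback: one code point
        parts.append((word[pos:end], labels[pos:end].replace("n", "")))
        pos = end
    return parts

for word in ("स्वागतम्", "भरतनाट्यम", "सद्बुद्धि"):
    print(word, "-", " + ".join(pattern for _, pattern in decompose(word)))
```

Working on the symbol string keeps the rules readable in exactly the notation used in the mail, at the cost of handling one script at a time; other scripts would need their own `label` function.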
> >
> >
> >
> >
> >The proposed definition is generic in nature and has already been tested for 11 Indian languages, i.e., Hindi, Marathi, Bengali, Nepali, Tamil, Telugu, Kannada, Gujarati, Punjabi, Oriya and Malayalam. A new rule for the occurrence of CH (consonant + halant) at the end of a word has been introduced. The test suite is available at http://w3cindia.in/syllable-generator.aspx. Testing of the remaining languages is underway.
> >
> > I request you to kindly give your valuable feedback.
> >
> >
> >
> >
> >regards,
> >
>
>
--
Dr. Somnath Chandra
Scientist-E
Dept. of Electronics & Information Technology
Ministry of Communications & Information Technology
Govt. of India
Tel:+91-11-24364744,24301856
Fax: +91-11-24363099
e-mail :schandra@mit.gov.in
Received on Tuesday, 24 June 2014 08:47:03 UTC