- From: Somnath Chandra <schandra@deity.gov.in>
- Date: Tue, 24 Jun 2014 14:15:58 +0530
- To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, indic <public-i18n-indic@w3.org>
- Cc: slata <slata@mit.gov.in>, Manoj Jain <mjain@deity.gov.in>, prashant verma <vermaprashant1@gmail.com>
- Message-id: <fb1cbf12107f5.53a9881e@nic.in>
Thanks Martin. Will look into it.

Regards,
Somnath

On 06/24/14 01:54 PM, "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:
> Hello Somnath,
>
> On 2014/06/24 13:47, Somnath Chandra wrote:
> > Dear All,
> >
> > Please find the revised definition of Indic Syllable as per the appended mail, which was circulated on June 17, 2014. The definition is generic in nature, to suit most Indian languages [11 languages tested]. Please send your feedback towards finalization.
> >
> > With regards,
> > Somnath
> >
> > -------- Original Message --------
> > From: Swaran Lata <slata@deity.gov.in>
> > Date: Jun 17, 2014 5:20:20 PM
> > Subject: ABNF defintion of Indic syllable
> > To: public-i18n-indic@w3.org
> > Cc: Somnath Chandra <schandra@mit.gov.in>, Manoj Jain <mjain@mit.gov.in>
> >
> > Dear All,
> >
> > The definition of Indic syllable has been revised as under:
> >
> > V[m] | {CH}C[v][m] | CH
> >
> > The linguistic definition of Indic syllable has been mapped to ABNF (Augmented Backus–Naur Form) for the purpose of text segmentation, line breaking, drop letter, and letter spacing in horizontal and vertical text representation. The definition has been elaborated taking Hindi as an example.
> >
> > The definition is a combination of 3 rules:
> >
> > Rule 1 : V[m]
> > Rule 2 : {CH}C[v][m]
> > Rule 3 : CH (This rule is applicable only at the end of the word)
>
> In European languages, as far as I know, a final consonant would be considered part of the preceding syllable, not a syllable on its own.
> As an example, "cat" would be considered as one syllable, not two syllables ("ca" and "t").
>
> Would one really put a word-final consonant on a new line in Indic languages?
>
> Just wondering.
>
> Regards, Martin.
>
> > V (upper case) is a complete vowel
> > m is a modifier (Anusvara/Visarga/Chandrabindu)
> > C is a consonant as per the Unicode definition, which may or may not include nukta
> > v (lower case) is any dependent vowel or vowel sign (mātrā)
> > H is halant / virama
> > | is a rule separator
> > [ ] - The enclosed item is optional
> > { } - The enclosed item/items occur once or are repeated multiple times
> >
> > Examples:
> >
> > Rule 1 : V[m]
> >
> > Sl. No. | Examples | Definition
> > 1. | अ, ई, उ | V (Vowel) is a syllable
> > 2. | अं, उँ, आः | V + Modifier is a syllable
> >
> > Rule 2 : {CH}C[v][m]
> >
> > Sl. No. | Examples | Definition
> > 1. | र, क, ज, ल, म | Consonant is a syllable
> > 2. | प्प, क्ख, च्त, ज्ज्व, त्क्ल, त्स्न | Zero or more Consonant + Virama sequences followed by a consonant is a syllable
> > 3. | र्त, र्त्स, र्त्स्न, र्त्स्न्य, फ़्क़ | Zero or more Consonant (+ Nukta) + Virama sequences followed by a consonant is a syllable
> > 4. | र्ता, र्त्स्न्या, फ़्जी, क्या | Zero or more Consonant (+ Nukta) + Virama sequences followed by a consonant (+ Nukta) followed by a vowel sign is a syllable
> > 5. | तः, स्तं, स्त्रँ, स्तः, फ़्ज़ँ | Zero or more Consonant (+ Nukta) + Virama sequences followed by a consonant (+ Nukta) followed by a modifier is a syllable
> > 6. | र्त्स्न्या: त्स्न्युं, त्स्न्युँ, फ़्ज़ें, हिं | Zero or more Consonant (+ Nukta) + Virama sequences followed by a consonant (+ Nukta) followed by a vowel sign and a modifier is a syllable
> > 7. | स्थि, ज्जि, ख्वा | Zero or more Consonant + Halant sequences followed by a consonant followed by a vowel sign is a syllable
> >
> > Rule 3 : CH
> >
> > त्, व्, म्, झ् etc. are syllables in Hindi only at the end of the word.
> >
> > Examples of combination of the rules:
> >
> > 1. स्वागतम् - CHCv + C + C + CH has the following syllables:
> >
> > स्वा | CHCv
> > ग | C
> > त | C
> > म् | CH
> >
> > 2. भरतनाट्यम - C + C + C + Cv + CHC + C
> >
> > भ | C
> > र | C
> > त | C
> > ना | Cv
> > ट्य | CHC
> > म | C
> >
> > 3. सद्बुद्धि - C + CHCv + CHCv
> >
> > स | C
> > द्बु | CHCv
> > द्धि | CHCv
> >
> > The proposed definition is generic in nature and has already been tested for 11 Indian languages, i.e. Hindi, Marathi, Bengali, Nepali, Tamil, Telugu, Kannada, Gujarati, Punjabi, Oriya & Malayalam. The new rule for CH (Consonant + Halant) occurrence at the end of the word has been introduced. The test suite is available at http://w3cindia.in/syllable-generator.aspx. The testing of the remaining languages is underway.
> >
> > I request you to kindly give your valuable feedback.
> >
> > regards,

--
Dr. Somnath Chandra
Scientist-E
Dept. of Electronics & Information Technology
Ministry of Communications & Information Technology
Govt. of India
Tel:+91-11-24364744,24301856
Fax: +91-11-24363099
e-mail :schandra@mit.gov.in
Received on Tuesday, 24 June 2014 08:47:03 UTC
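
The three rules quoted in this message, V[m] | {CH}C[v][m] | CH, can be tried out with a small regular-expression sketch. This is only an illustration for Devanagari, not the test suite mentioned in the mail. The code point ranges, the names SYLLABLE and syllables, and the reading of {CH} as "zero or more" (which is how the Rule 2 examples use it) are all assumptions made here.

```python
import re

# Illustrative Devanagari character classes (assumed ranges, not from the proposal).
V = r"[\u0904-\u0914]"                             # independent vowels
C = r"(?:[\u0915-\u0939]\u093C?|[\u0958-\u095F])"  # consonant, optionally with nukta
v = r"[\u093E-\u094C]"                             # dependent vowel signs (matras)
m = r"[\u0901-\u0903]"                             # chandrabindu, anusvara, visarga
H = r"\u094D"                                      # halant / virama

SYLLABLE = re.compile(
    rf"{V}{m}?"                                    # Rule 1: V[m]
    # Rule 2: {CH}C[v][m]; the lookahead keeps the last C from being
    # split off from a following nukta or halant.
    rf"|(?:{C}{H})*{C}(?![\u093C\u094D]){v}?{m}?"
    # Rule 3: CH, accepted only when no Devanagari character follows.
    rf"|{C}{H}(?![\u0900-\u097F])"
)


def syllables(word: str) -> list[str]:
    """Split a Devanagari word into syllables per the three rules (sketch)."""
    return SYLLABLE.findall(word)


print(syllables("स्वागतम्"))    # ['स्वा', 'ग', 'त', 'म्']
print(syllables("भरतनाट्यम"))   # ['भ', 'र', 'त', 'ना', 'ट्य', 'म']
print(syllables("सद्बुद्धि"))    # ['स', 'द्बु', 'द्धि']
```

The three calls reproduce the combined examples from the mail. Characters outside the assumed ranges are simply skipped by findall, so a real implementation would need fuller character classes for each script.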