Re: Fwd: ABNF definition of Indic syllable

Thanks, Martin. I will look into it.
Regards,
Somnath 
 
On 06/24/14 01:54 PM, "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:
> 
> Hello Somnath,
> 
> On 2014/06/24 13:47, Somnath Chandra wrote:
> >Dear All,
> >
> >Please find the revised definition of the Indic syllable in the appended mail, which was circulated on June 17, 2014. The definition is generic enough to suit most Indian languages (11 languages tested so far). Please send your feedback so that it can be finalized.
> >
> >With regards,
> >Somnath
> >
> >-------- Original Message --------
> >From: Swaran Lata <slata@deity.gov.in>
> >Date: Jun 17, 2014 5:20:20 PM
> >Subject: ABNF definition of Indic syllable
> >To: public-i18n-indic@w3.org
> >Cc: Somnath Chandra <schandra@mit.gov.in>, Manoj Jain <mjain@mit.gov.in>
> >
> >
> >Dear All,
> >
> >
> >The definition of the Indic syllable has been revised as follows:
> >
> >V[m] |{CH}C[v][m]|CH
> >
> >
> >
> >
> >The linguistic definition of the Indic syllable has been mapped to ABNF (Augmented Backus–Naur Form) for the purposes of text segmentation, line breaking, drop letters, letter spacing in horizontal text, and vertical text representation. The definition is elaborated below, taking Hindi as an example.
> >
> >
> >
> >
> >The definition is a combination of 3 rules:
> >
> >
> >
> >
> >Rule 1 : V[m]
> >
> >Rule 2 : {CH}C[v][m]
> >
> >Rule 3 : CH  (This rule is applicable only at the end of the word)
> 
> In European languages, as far as I know, a final consonant would be considered part of the preceding syllable, not a syllable on its own.
> As an example, "cat" would be considered as one syllable, not two syllables ("ca" and "t").
> 
> Would one really put a word-final consonant on a new line in Indic languages?
> 
> Just wondering.
> 
> Regards,   Martin.
> 
> 
> >V (upper case) is a complete vowel
> >
> >m is a modifier (Anusvara/Visarga/Chandrabindu)
> >
> >C is a consonant as per the Unicode definition, which may or may not include a nukta
> >
> >v (lower case) is any dependent vowel or vowel sign (mātrā)
> >
> >H is halant / virama
> >
> >| is a rule separator
> >
> >[ ] - the enclosed items are optional
> >
> >{ } - the enclosed items occur zero or more times (Rule 2 also covers a bare consonant, so the CH sequence may be absent)
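
As a rough illustration of how the three rules and the notation above might translate into a matcher, here is a minimal Python sketch for Devanagari (Hindi) only. The code point ranges chosen for V, m, C, v and H, the names SYLLABLE and syllables, and the reading of "end of the word" as "not followed by another Devanagari character" are all assumptions made for this sketch. They are not part of the circulated definition or of the test suite.

```python
import re

# Assumed Devanagari character classes (illustrative, not from the original mail).
V = "[\u0905-\u0914]"          # complete (independent) vowels
M = "[\u0901-\u0903]"          # chandrabindu, anusvara, visarga
NUKTA = "\u093C"
H = "\u094D"                   # halant / virama
DV = "[\u093E-\u094C]"         # dependent vowel signs (matras)
# A consonant, either followed by an optional nukta or precomposed with one.
C = f"(?:[\u0915-\u0939]{NUKTA}?|[\u0958-\u095F])"

SYLLABLE = re.compile(
    f"{V}{M}?"                                   # Rule 1: V[m]
    f"|(?:{C}{H})*{C}(?![{NUKTA}{H}]){DV}?{M}?"  # Rule 2: {CH}C[v][m]
    f"|{C}{H}(?![\u0900-\u097F])"                # Rule 3: CH, word-final only
)

def syllables(word):
    """Split a Devanagari word into syllables according to Rules 1-3."""
    return [m.group(0) for m in SYLLABLE.finditer(word)]

print(syllables("स्वागतम्"))  # ['स्वा', 'ग', 'त', 'म्']
```

The negative lookahead after the last consonant of Rule 2 keeps that rule from claiming a consonant that is itself followed by a halant, so a trailing consonant + halant falls through to Rule 3. Characters that fit none of the rules are simply skipped by finditer.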
> >
> >
> >
> >
> >Examples:
> >
> >Rule 1 : V[m]
> >
> >
> >
> >Sl. No. | Examples | Definition
> >1. | अ, ई, उ | V (vowel) is a syllable
> >2. | अं, उँ, आः | V + modifier is a syllable
> >
> >
> >
> >
> >
> >
> >Rule 2 : {CH}C[v][m]
> >
> >
> >
> >Sl. No. | Examples | Definition
> >1. | र, क, ज, ल, म | A consonant is a syllable
> >2. | प्प, क्ख, च्त, ज्ज्व, त्क्ल, त्स्न | Zero or more consonant + virama sequences followed by a consonant is a syllable
> >3. | र्त, र्त्स, र्त्स्न, र्त्स्न्य, फ़्क़ | Zero or more consonant (+ nukta) + virama sequences followed by a consonant (+ nukta) is a syllable
> >4. | र्ता, र्त्स्न्या, फ़्जी, क्या | Zero or more consonant (+ nukta) + virama sequences followed by a consonant (+ nukta) followed by a vowel sign is a syllable
> >5. | तः, स्तं, स्त्रँ, स्तः, फ़्ज़ँ | Zero or more consonant (+ nukta) + virama sequences followed by a consonant (+ nukta) followed by a modifier is a syllable
> >6. | र्त्स्न्याः, त्स्न्युं, त्स्न्युँ, फ़्ज़ें, हिं | Zero or more consonant (+ nukta) + virama sequences followed by a consonant (+ nukta) followed by a vowel sign and a modifier is a syllable
> >7. | स्थि, ज्जि, ख्वा | Zero or more consonant + halant sequences followed by a consonant followed by a vowel sign is a syllable
> >
> >
> >
> >
> >Rule 3 : CH
> >
> >त्, व्, म्, भ् etc. are syllables in Hindi only at the end of a word.
> >
> >Examples combining the rules:
> >
> >1. स्वागतम् - CHCv + C + C + CH has the following syllables:
> >
> >स्वा | CHCv
> >ग | C
> >त | C
> >म् | CH
> >
> >
> >
> >
> >2. भरतनाट्यम - C + C + C + Cv + CHC + C
> >
> >भ | C
> >र | C
> >त | C
> >ना | Cv
> >ट्य | CHC
> >म | C
> >
> >
> >
> >
> >3. सद्बुद्धि - C + CHCv + CHCv
> >
> >स | C
> >द्बु | CHCv
> >द्धि | CHCv
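
The pattern strings in these worked examples (for instance CHCv + C + C + CH) can be derived mechanically by labelling each character with its symbol and matching the three rules over the label string. A minimal Python sketch for Devanagari only follows. The helper names label, decompose and shape, and the code point ranges, are hypothetical and chosen purely for illustration. A nukta is labelled N internally and folded into the preceding C, in line with the statement that C may or may not include a nukta.

```python
import re

def label(ch):
    """Map one Devanagari character to a symbol of the notation (assumed ranges)."""
    cp = ord(ch)
    if 0x0905 <= cp <= 0x0914:
        return "V"   # complete (independent) vowel
    if 0x0901 <= cp <= 0x0903:
        return "m"   # chandrabindu, anusvara, visarga
    if 0x0915 <= cp <= 0x0939 or 0x0958 <= cp <= 0x095F:
        return "C"   # consonant
    if 0x093E <= cp <= 0x094C:
        return "v"   # dependent vowel sign
    if cp == 0x094D:
        return "H"   # halant / virama
    if cp == 0x093C:
        return "N"   # nukta, treated as part of the preceding C
    return "?"       # anything else is left unmatched

# Rules 1-3 over the label string; "$" restricts Rule 3 to the end of the word.
RULES = re.compile(r"Vm?|(?:CN?H)*CN?(?![NH])v?m?|CN?H$")

def decompose(word):
    """Return (syllable, pattern) pairs for a single word."""
    labels = "".join(label(ch) for ch in word)
    return [(word[m.start():m.end()], m.group(0).replace("N", ""))
            for m in RULES.finditer(labels)]

def shape(word):
    """Join the per-syllable patterns in the style used in the examples."""
    return " + ".join(pattern for _, pattern in decompose(word))

print(shape("सद्बुद्धि"))  # C + CHCv + CHCv
```

Because the label string has exactly one symbol per character, the match offsets can be used directly to slice the original word.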
> >
> >
> >
> >
> >The proposed definition is generic in nature and has already been tested for 11 Indian languages, i.e. Hindi, Marathi, Bengali, Nepali, Tamil, Telugu, Kannada, Gujarati, Punjabi, Oriya and Malayalam. A new rule for CH (consonant + halant) occurring at the end of a word has been introduced. The test suite is available at http://w3cindia.in/syllable-generator.aspx. Testing of the remaining languages is underway.
> >
> >I request you to kindly give your valuable feedback.
> >
> >
> >
> >
> >regards,
> >
> 
> 
-- 

Dr. Somnath Chandra
Scientist-E
Dept. of Electronics & Information Technology
Ministry of Communications & Information Technology
Govt. of India
Tel:+91-11-24364744,24301856
Fax: +91-11-24363099
e-mail :schandra@mit.gov.in

Received on Tuesday, 24 June 2014 08:47:03 UTC