- From: Surupendu Gangopadhyay <surupendu.g@gmail.com>
- Date: Tue, 25 Jul 2023 12:42:12 +0530
- To: undisclosed-recipients:;
- Message-ID: <CA+k+8boNRka6R_Nyxo3u3NXYWELudJMpc5oqA_+B-kdyxuDX0Q@mail.gmail.com>
Apologies for multiple posting

***********************************

------------------------------------------------
*Machine Translation for Indian Languages (MTIL) 2023*
------------------------------------------------

We invite all IR and NLP researchers and enthusiasts to participate in the MTIL track (https://mtilfire.github.io/mtil/2023/), held in conjunction with the Forum for Information Retrieval Evaluation (FIRE) 2023 (http://fire.irsi.res.in/).

Indian languages have many linguistic complexities. Though some Indian languages share syntactic similarities, some possess intricate morphological structures. At the same time, some Indian languages are low-resource. Machine translation models should therefore address these unique challenges in translating between Indian languages.

The MTIL track consists of two tasks:

1. *General Translation Task (Task 1):* Participants should build a machine translation model to translate sentences for the following language pairs:
   1. Hindi-Gujarati
   2. Hindi-Kannada
   3. Kannada-Hindi
   4. Hindi-Odia
   5. Odia-Hindi
   6. Hindi-Punjabi
   7. Punjabi-Hindi
   8. Hindi-Sindhi
   9. Urdu-Kashmiri
   10. Telugu-Hindi
   11. Hindi-Telugu
   12. Urdu-Hindi
   13. Hindi-Urdu

2. *Domain Specific Translation Task (Task 2):* Participants will build machine translation models for the Governance and Healthcare domains.
   1. Healthcare:
      a. Hindi-Gujarati
      b. Kannada-Hindi
      c. Hindi-Odia
      d. Odia-Hindi
      e. Hindi-Punjabi
      f. Kannada-Hindi
   2. Governance:
      a. Hindi-Gujarati
      b. Kannada-Hindi
      c. Hindi-Odia
      d. Odia-Hindi
      e. Hindi-Punjabi
      f. Kannada-Hindi

*Dataset:* The primary source of parallel language pairs is the Bharat Parallel Corpus Collection (BPCC), released by AI4Bharat (https://ai4bharat.iitm.ac.in/bpcc). Participants are encouraged to add datasets of their choice, including parallel corpora and monolingual datasets, to train their models.
More information on registration and participation in the track can be found here: https://mtilfire.github.io/mtil/2023/

This track is organised in association with BHASHINI (https://bhashini.gov.in/).

*Organisers*

- Prasenjit Majumder, DAIICT Gandhinagar, India and TCG CREST, Kolkata, India
- Arafat Ahsan, IIIT-Hyderabad, India
- Asif Ekbal, IIT-Patna, India
- Saran Pandian, DAIICT Gandhinagar, India
- Ramakrishna Appicharla, IIT-Patna, India
- Surupendu Gangopadhyay, DAIICT Gandhinagar, India
- Ganesh Epili, DAIICT Gandhinagar, India
- Dreamy Pujara, DAIICT Gandhinagar, India
- Misha Patel, DAIICT Gandhinagar, India
- Aayushi Patel, DAIICT Gandhinagar, India
- Bhargav Dave, DAIICT Gandhinagar, India
- Mukesh Jha, DAIICT Gandhinagar, India
Received on Tuesday, 25 July 2023 07:12:29 UTC