Pratyush Kumar

IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages

Mar 11, 2024
Mohammed Safi Ur Rahman Khan, Priyam Mehta, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Suriyaprasaad G, Varun Balan G, Sparsh Jain, Anoop Kunchukuttan, Pratyush Kumar, Raj Dabre, Mitesh M. Khapra

Via arXiv

IndicVoices: Towards building an Inclusive Multilingual Speech Dataset for Indian Languages

Mar 04, 2024
Tahir Javed, Janki Atul Nawale, Eldho Ittan George, Sakshi Joshi, Kaushal Santosh Bhogale, Deovrat Mehendale, Ishvinder Virender Sethi, Aparna Ananthanarayanan, Hafsah Faquih, Pratiti Palit, Sneha Ravishankar, Saranya Sukumaran, Tripura Panchagnula, Sunjay Murali, Kunal Sharad Gandhi, Ambujavalli R, Manickam K M, C Venkata Vaijayanthi, Krishnan Srinivasa Raghavan Karunganni, Pratyush Kumar, Mitesh M Khapra

Via arXiv

DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization

Dec 20, 2023
Rahul Chand, Yashoteja Prabhu, Pratyush Kumar

Via arXiv

IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages

May 25, 2023
AI4Bharat, Jay Gala, Pranjal A. Chitale, Raghavan AK, Sumanth Doddapaneni, Varun Gumma, Aswanth Kumar, Janki Nawale, Anupama Sujatha, Ratish Puduppully, Vivek Raghavan, Pratyush Kumar, Mitesh M. Khapra, Raj Dabre, Anoop Kunchukuttan

Via arXiv

Svarah: Evaluating English ASR Systems on Indian Accents

May 25, 2023
Tahir Javed, Sakshi Joshi, Vignesh Nagarajan, Sai Sundaresan, Janki Nawale, Abhigyan Raman, Kaushal Bhogale, Pratyush Kumar, Mitesh M. Khapra

Via arXiv

Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR

May 24, 2023
Kaushal Santosh Bhogale, Sai Sundaresan, Abhigyan Raman, Tahir Javed, Mitesh M. Khapra, Pratyush Kumar

Via arXiv

Large Language Models Humanize Technology

May 09, 2023
Pratyush Kumar

Via arXiv

An Empirical Study of Leveraging Knowledge Distillation for Compressing Multilingual Neural Machine Translation Models

Apr 19, 2023
Varun Gumma, Raj Dabre, Pratyush Kumar

Via arXiv

IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation metrics for Indian Languages

Dec 20, 2022
Ananya B. Sai, Vignesh Nagarajan, Tanay Dixit, Raj Dabre, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

Via arXiv

Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages

Dec 20, 2022
Arnav Mhaske, Harshit Kedia, Sumanth Doddapaneni, Mitesh M. Khapra, Pratyush Kumar, Rudra Murthy V, Anoop Kunchukuttan

Via arXiv