Hi! I am Kabir Ahuja. I am a PhD student at the University of Washington (UW), advised by Prof. Yulia Tsvetkov. I am broadly interested in Natural Language Processing (NLP), with a focus on understanding and improving the capabilities of language models.

Before joining UW, I spent two wonderful years at Microsoft Research India (MSRI) as a Research Fellow, working with Dr. Sunayana Sitaram, Dr. Monojit Choudhury, and Dr. Navin Goyal. At MSRI, my research focused primarily on cross-lingual transfer in pretrained multilingual language models and on understanding the mechanisms behind in-context learning in transformers. In the past, I have also worked with Dr. Navin Goyal on analyzing the computational capabilities of Transformers and recurrent neural networks by studying their behavior on several formal languages, and with Professor Partha Talukdar on controlled text generation for syntactic paraphrasing.

Please feel free to reach out to me by email if you have any questions about my research. I am also happy to mentor students looking to start their research journey in NLP.

Publications

In-Context Learning through the Bayesian Prism
Kabir Ahuja*, Madhur Panwar*, Navin Goyal
Under Review
preprint|

MEGA: Multilingual Evaluation of Generative AI
Kabir Ahuja, Harshita Diddee, Rishav Hada, Millicent Ochieng, Krithika Ramesh, Prachi Jain, Akshay Nambi, Tanuja Ganu, Sameer Segal, Mohamed Ahmed, Kalika Bali, Sunayana Sitaram
EMNLP 2023 | The 2023 Conference on Empirical Methods in Natural Language Processing
paper| code|

On the Calibration of Massively Multilingual Language Models
Kabir Ahuja, Sunayana Sitaram, Sandipan Dandapat, Monojit Choudhury
EMNLP 2022 | The 2022 Conference on Empirical Methods in Natural Language Processing
preprint| code|

Global Readiness of Language Technology for Healthcare: What would it Take to Combat the Next Pandemic?
Ishani Mondal*, Kabir Ahuja*, Mohit Jain, Jacki O'Neill, Kalika Bali, Monojit Choudhury
COLING 2022 | The 29th International Conference on Computational Linguistics
pdf| abstract| cite|

On the Economics of Multilingual Few-shot Learning: Modeling the Cost-Performance Trade-offs of Machine Translated and Manual Data
Kabir Ahuja, Monojit Choudhury, Sandipan Dandapat
NAACL 2022 | 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics
pdf| abstract| code| cite|

Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models
Kabir Ahuja*, Shanu Kumar*, Sandipan Dandapat, Monojit Choudhury
ACL 2022 | 60th Annual Meeting of the Association for Computational Linguistics
pdf| abstract| cite|

Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages
Kabir Ahuja, Sandipan Dandapat, Sunayana Sitaram, Monojit Choudhury
NLP Power!, ACL 2022 | The First Workshop on Efficient Benchmarking in NLP
pdf| abstract| cite|

Learning to Optimize Molecular Geometries Using Reinforcement Learning
Kabir Ahuja, William H Green, Yi-Pei Li
JCTC | Journal of Chemical Theory and Computation (2021) 17:818-825 [Impact Factor: 6.006]
abstract | cite|

On the Practical Ability of Recurrent Neural Networks to Recognize Hierarchical Languages
Satwik Bhattamishra, Kabir Ahuja, Navin Goyal
COLING 2020 | Proceedings of the 28th International Conference on Computational Linguistics | [Recipient of the Best Short Paper Award]
pdf| abstract| code| cite|

On the Ability of Self-Attention Networks to Recognize Counter Languages
Satwik Bhattamishra, Kabir Ahuja, Navin Goyal
EMNLP 2020 | The 2020 Conference on Empirical Methods in Natural Language Processing
pdf| abstract| code| cite|

Syntax-Guided Controlled Generation of Paraphrases
Ashutosh Kumar, Kabir Ahuja, Raghuram Vadapalli, Partha Talukdar
TACL | Transactions of the Association for Computational Linguistics (2020) 8:330-345
pdf| abstract| code| cite|

Blog Posts

Painless Fine-Tuning of BERT in Pytorch
blog-link| code

How to use Pytorch Dataloaders to work with enormously large text files
blog-link|

BITS Pilani | 2015 - 2019
MIT | F2018
IISc Bangalore | F2019
Udaan | 2020-2021
Microsoft Research | 2021-Present