Maanas K. Sharma
Language Models and Within-Language Discrimination
2024–2025
Electrical Engineering and Computer Science
- AI and Machine Learning
Marzyeh Ghassemi
Generative AI language models (LMs) have exploded in usage but have significant problems with social biases, misinformation, security, and more. This project examines the performance and biases of large LMs on dialects within a specific language. I first seek to establish how within-language discrimination can manifest in real-life use cases, including decision-making scenarios. I then seek to understand where and how these behaviors form in language models, with a view toward possible interventions.
I am excited to continue my research in machine learning and fairness through the SuperUROP program. The class 6.S977, a prior UROP with my supervisor, and two years in the SERC Scholars program have best prepared me for this project. I think this work answers an important question, and I am further motivated by the chance to learn new technical skills, such as writing research codebases, working with language models, and applying interpretability techniques.