Medha Venkatapathy
Optimizing the Muon Optimizer
2025–2026
Physics; Electrical Engineering and Computer Science
- AI and Machine Learning
- Natural Language and Speech Processing
Andreas, Jacob
The Muon optimizer is a recent neural network optimizer, designed for faster training, that combines momentum with an additional orthogonalization step based on the Newton-Schulz iteration. My project focuses on improving post-training methods for large language models: evaluating Muon’s stability and efficiency against standard optimizers, and exploring how it improves fine-tuning performance when applied to hidden-layer weights.
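To make the orthogonalization step concrete, here is a minimal NumPy sketch of a momentum update followed by a Newton-Schulz iteration. It is an illustration under stated assumptions, not the project's implementation: it uses the textbook cubic Newton-Schulz recurrence, and the function names (`newton_schulz_orthogonalize`, `muon_like_step`), the step count, and the learning-rate and momentum values are all hypothetical choices made for the example.

```python
import numpy as np


def newton_schulz_orthogonalize(G, steps=40, eps=1e-12):
    """Approximate the nearest semi-orthogonal matrix to G.

    Uses the classic cubic Newton-Schulz recurrence
    X <- 1.5 * X - 0.5 * X @ X.T @ X, chosen here for clarity.
    The step count is an illustrative assumption.
    """
    # Dividing by the Frobenius norm keeps every singular value at or
    # below 1, inside the range where the recurrence converges.
    X = G / (np.linalg.norm(G) + eps)
    for _ in range(steps):
        # Each pass pushes the singular values of X toward 1 while
        # leaving the singular vectors unchanged.
        X = 1.5 * X - 0.5 * X @ (X.T @ X)
    return X


def muon_like_step(W, G, M, lr=0.02, beta=0.95):
    """One simplified update: accumulate momentum, orthogonalize, step.

    W is a 2-D weight matrix, G its gradient, M the momentum buffer.
    lr and beta are placeholder values, not tuned settings.
    """
    M = beta * M + G
    W = W - lr * newton_schulz_orthogonalize(M)
    return W, M
```

The update applies to 2-D weight matrices, which is why the sketch takes `W`, `G`, and `M` as matrices of the same shape. Because the recurrence only rescales singular values, the result approaches the factor `U @ Vt` from the singular value decomposition of the momentum buffer, without computing that decomposition explicitly.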
I am excited to pursue SuperUROP to deepen my understanding of natural language processing, a field I believe will be important for many years. My background includes linguistics coursework and research with the LIGO group, which prepared me to explore a niche subject over an extended period.
