MIT EECS | Takeda Undergraduate Research and Innovation Scholar
Gradient Descent, Implicit Regularization, and Alignment in Deep Linear Autoencoders
- Artificial Intelligence and Machine Learning
A mystery in deep learning is why overparametrized neural networks are still able to generalize. Recent work aims to explain this via implicit regularization: out of the infinitely many solutions that interpolate the training data, the architecture and optimization method bias training toward simple solutions. Another mystery is why convolutional networks perform well in practice, yet require depth to achieve such performance. I aim to understand these phenomena through theory and experiments in the setting of linear autoencoders. I study the dynamics of gradient descent to better understand which solutions overparametrized networks converge to. I also study how implicit regularization is affected by model architecture, and whether it can explain the success of deep convolutional networks.
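To make the setting concrete, here is a minimal sketch (not the project's actual experiments) of the object under study: a two-layer linear autoencoder trained by full-batch gradient descent on squared reconstruction error. The dimensions, initialization scale, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of a deep (two-layer) linear autoencoder
# x -> W2 @ W1 @ x, trained by full-batch gradient descent.
# All hyperparameters below are assumptions for demonstration only.
rng = np.random.default_rng(0)
d, k, n = 10, 3, 200                      # input dim, bottleneck dim, samples
X = rng.standard_normal((d, n))

W1 = 0.1 * rng.standard_normal((k, d))    # encoder weights (small init)
W2 = 0.1 * rng.standard_normal((d, k))    # decoder weights (small init)
lr = 0.05

def loss(W1, W2):
    """Mean squared reconstruction error (1/2n) * ||W2 W1 X - X||_F^2."""
    R = W2 @ W1 @ X - X
    return 0.5 * np.sum(R ** 2) / n

init_loss = loss(W1, W2)
for _ in range(2000):
    R = (W2 @ W1 @ X - X) / n             # scaled reconstruction residual
    g2 = R @ X.T @ W1.T                   # dL/dW2
    g1 = W2.T @ R @ X.T                   # dL/dW1
    W2 -= lr * g2
    W1 -= lr * g1
final_loss = loss(W1, W2)

print(init_loss, final_loss)
```

Because the bottleneck has rank k < d, the best linear reconstruction keeps the top-k principal subspace of the data; which of the many near-optimal factorizations gradient descent actually selects, and how the trajectory aligns the two layers, is the kind of question the implicit-regularization viewpoint asks.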
“I am participating in SuperUROP to gain experience working on a long-term research project and to learn about the entire timeline, from formulating a problem to presenting results. I’m excited to use concepts I’ve learned in statistics and machine learning classes to both make progress on challenging theoretical problems and design algorithms with real-world applications, learning from other researchers along the way.”