Aaron Alvarado Kristanto Julistiono
MIT Tang Family FinTech Undergraduate Research and Innovation Scholar
Optimization Path for Stochastic Mirror Descent on Attention Mechanisms
- Artificial Intelligence and Machine Learning
Deep learning models have enjoyed enormous success in many tasks, such as natural language processing and computer vision. While deep learning has demonstrably enabled breakthroughs across a wide variety of tasks, we often cannot easily understand what a model is truly learning. To address this, the project investigates how the weight parameters of a deep learning model change throughout training, which can reveal what the model actually learns. Specifically, we investigate the Mirror Descent algorithm, an important generalization of the well-known Gradient Descent algorithm, as applied to the attention model, which has opened up a world of possibilities in many areas of machine learning.
I am currently a junior majoring in Computer Science and Engineering (6-3) and Mathematics (18) at MIT. I am excited to continue this project as a SuperUROP, with the goal of preparing myself for a PhD. Using the knowledge I have gained in classes such as machine learning, statistics, and algorithms, I hope to complete this SuperUROP successfully by the end of the year.