Inimai A. Subramanian

Scholar Title

MIT EECS | Landsman Undergraduate Research and Innovation Scholar

Research Title

Fixed Parameter Sparse Expansion

Cohort

2024–2025

Department

Electrical Engineering and Computer Science

Research Areas
  • AI and Machine Learning
Supervisor

Nir N. Shavit

Abstract

Previous scaling laws for sparsity have shown that, for the same parameter count, larger, sparse models outperform smaller, dense ones. There has also been significant work on neural network growth: using a small, pretrained model to seed the training of a larger network, rather than training the large one from scratch. However, growing a smaller network into a larger one while keeping the parameter count fixed has not previously been explored. We duplicate the neurons in a layer and apply sparse masking so that each original weight appears exactly once across the expanded neurons. This maintains the parameter count while increasing representational capacity through additional neurons, yielding extra performance at a negligible increase in training and inference cost. This contrasts with the two existing avenues: post-training sparsification requires significant additional training compute, and unconstrained network growth increases inference cost. We have shown promising initial results in toy models and are working to bring our findings to more state-of-the-art systems, such as language models.
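To make the weight-level bookkeeping concrete, here is a minimal PyTorch sketch of the expansion step. The function name `sparse_expand` and the random-partition masking strategy are illustrative assumptions, not the project's actual implementation: each neuron (row) of a dense weight matrix is duplicated k times, and every original weight is assigned to exactly one of the k copies, so the nonzero parameter count is unchanged while the neuron count grows k-fold.

```python
import torch

def sparse_expand(weight: torch.Tensor, k: int, seed: int = 0) -> torch.Tensor:
    """Expand a dense (out, in) weight matrix into a sparse (k*out, in) one.

    Each original neuron (row) is duplicated k times, and a random partition
    assigns every original weight to exactly one of the k copies, so the
    number of nonzero parameters stays fixed. (Hypothetical sketch.)
    """
    gen = torch.Generator().manual_seed(seed)
    out_f, in_f = weight.shape
    # For each original weight, pick which of the k copies receives it.
    assignment = torch.randint(0, k, (out_f, in_f), generator=gen)
    expanded = torch.zeros(k * out_f, in_f, dtype=weight.dtype)
    for copy in range(k):
        # Disjoint masks: each weight is nonzero in exactly one copy.
        mask = (assignment == copy).to(weight.dtype)
        expanded[copy * out_f:(copy + 1) * out_f] = weight * mask
    return expanded

# Sanity check: neuron count doubles, nonzero parameter count is preserved.
W = torch.randn(4, 8)
W2 = sparse_expand(W, k=2)
assert W2.shape == (8, 8)
assert torch.count_nonzero(W2) == torch.count_nonzero(W)
assert torch.allclose(W2[:4] + W2[4:], W)  # the copies partition the originals
```

In a full model, the layer that consumes these activations would also need to be widened to accept k times as many neurons; the sketch above shows only the per-layer masking that keeps the parameter count fixed.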

Quote

The SuperUROP program will allow me to deepen my knowledge in machine learning and take ownership of a project. Having built a machine learning tool in an internship and taken related classes, I hope to apply my skills to advanced research, further my understanding of developing intelligent systems, and provide meaningful contributions to the field. The hands-on experience gained will prepare me to tackle future engineering challenges.