
Arul Kolla
Efficient Generative AI Inference via Nonlinear Kernel Approximation
2025–2026
Electrical Engineering and Computer Science
- AI and Machine Learning
Chandrakasan, Anantha P.
Modern accelerators excel at linear operations, but nonlinearities account for a significant share of inference compute in both datacenter and edge deployments. We propose a co-design approach that (1) develops approximation algorithms for key nonlinear functions in generative models and (2) maps these approximations to efficient implementations on server and edge GPUs and on domain-specific accelerators. We will compare families of approximations and study their effects on accuracy, fine-tuning convergence, and robustness under distribution shift. On the hardware side, we will analyze how each method composes with existing compute primitives and propose microarchitectural extensions that reduce latency, area, and energy. Our evaluation combines analytical models with prototype implementations to quantify trade-offs across area, delay, and energy. The outcome is a systematic recipe for algorithm–hardware co-design that closes the nonlinear bottleneck in large-scale AI systems.
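As a purely illustrative sketch of what a hardware-friendly nonlinear approximation can look like, the snippet below fits a low-degree polynomial to GELU on a bounded input range and reports its worst-case error, the kind of accuracy trade-off the proposal would study. The choice of GELU, the [-4, 4] range, and the degree-6 fit are assumptions for this example, not methods specified by the project.

```python
# Illustrative sketch only: replace an exact nonlinearity (GELU) with a
# low-degree polynomial that needs only multiplies and adds, no erf/exp.
# The fitted range, degree, and use of NumPy's polyfit are assumptions.
import math
import numpy as np

def gelu_exact(x):
    """Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))

# Fit a degree-6 polynomial to GELU over a clipped activation range.
xs = np.linspace(-4.0, 4.0, 4001)
coeffs = np.polyfit(xs, gelu_exact(xs), deg=6)

def gelu_poly(x):
    """Polynomial surrogate, evaluable with fused multiply-adds (Horner form).

    A deployed kernel would pass large |x| through directly (GELU(x) ~ x for
    x >> 0 and ~ 0 for x << 0); here we simply clip to the fitted range.
    """
    return np.polyval(coeffs, np.clip(x, -4.0, 4.0))

# Quantify approximation error over the fitted range: worst-case and mean.
err = np.abs(gelu_poly(xs) - gelu_exact(xs))
print(f"max |error| = {err.max():.4e}, mean |error| = {err.mean():.4e}")
```

A real study would sweep the degree, range, and number representation, and measure the downstream effect on model accuracy rather than only pointwise error.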
Through this SuperUROP, I aim to translate theory into working systems by prototyping hardware-friendly approximations for nonlinear layers and validating them end-to-end on real-world models. I’m excited to bring what I learned in classes like Hardware Architecture for Deep Learning (6.5930) to bear on these systems. My goal is to publish practical techniques that accelerate both datacenter and edge inference.