Md Sahil (Sahil) Akhtar

Research Title

Analyzing Corruption Processes and Stability in Discrete Diffusion for Language Modeling

Cohort

2025–2026

Department

Physics; Electrical Engineering and Computer Science

Research Areas
  • Physics
  • Generative AI

Supervisor

Farias, Vivek F.

Abstract

Discrete diffusion models have recently gained traction as a potential alternative to autoregressive approaches for generating discrete data such as text. Their main appeal lies in the ability to generate multiple arbitrarily positioned tokens in parallel and to iteratively refine already generated tokens. However, several foundational aspects remain poorly understood. This project focuses on two core challenges: (1) designing and optimizing corruption kernels that govern the forward noising process in discrete diffusion models, and (2) addressing stability and normalization issues identified in Score Entropy Discrete Diffusion (SEDD). Our aim is to develop a more rigorous theoretical and empirical understanding of these models to improve sample quality and stability in language tasks, paving the way for discrete diffusion to become a viable alternative to today’s state-of-the-art autoregressive language models.

Quote

You are that.
