Anisha Parsan
AF3Design: De Novo Protein Design with AlphaFold3
2025–2026
Electrical Engineering and Computer Science
- AI for Healthcare and Life Sciences
- AI and Machine Learning
Uhler, Caroline
De novo protein design is a central challenge in computational biology, with broad applications in therapeutics, diagnostics, and enzyme engineering. Recent advances have shown that AlphaFold-based confidence metrics, such as ipTM and pTM, correlate strongly with experimental measures of binding affinity and structural stability, motivating their use as surrogate losses in protein optimization frameworks. Building on these insights, new approaches such as BindCraft have introduced additional differentiable proxies (e.g., pLDDT, contact-based losses, radius of gyration loss) that capture structural fidelity and biophysical plausibility.
This project focuses on developing AF3Design, a protein design framework that directly leverages the speed and accuracy of AlphaFold3. A key research direction is to incorporate diverse loss functions into the design loop – including energy-inspired metrics (pTMEnergy, pAEnergy), confidence-based losses (pLDDT), structural regularizers (contact loss, radius of gyration), and potentially novel AF3-specific surrogates – to provide a flexible optimization landscape that balances stability, foldability, and function.
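As a rough illustration of how several loss terms might be combined into one design objective, here is a minimal sketch in plain Python. It is not the AF3Design implementation: the function names, the hinge form of the radius of gyration penalty, the weights, and the example confidence values are all assumptions made for illustration. The radius of gyration itself is the standard root-mean-square distance of the atoms from their centroid.

```python
import math

def radius_of_gyration(coords):
    """Root-mean-square distance of 3D points from their centroid."""
    n = len(coords)
    centroid = [sum(c[i] for c in coords) / n for i in range(3)]
    sq_dist = sum(
        sum((c[i] - centroid[i]) ** 2 for i in range(3)) for c in coords
    )
    return math.sqrt(sq_dist / n)

def rg_loss(coords, target_rg):
    """Hinge penalty (an assumed form): zero at or below the target radius."""
    return max(0.0, radius_of_gyration(coords) - target_rg)

def composite_loss(terms, weights):
    """Weighted sum over named loss terms; lower is better."""
    return sum(weights[name] * value for name, value in terms.items())

# Hypothetical values for a single candidate design.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
terms = {
    "plddt": 1.0 - 0.85,  # confidence scores turned into losses as 1 - score
    "iptm": 1.0 - 0.70,
    "rg": rg_loss(coords, target_rg=1.0),
}
weights = {"plddt": 1.0, "iptm": 2.0, "rg": 0.5}  # illustrative weights only
total = composite_loss(terms, weights)
```

Keeping each term as a named entry makes it straightforward to add or reweight terms (for example a contact loss) without touching the optimizer that consumes `total`.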
In parallel, the project will systematically explore different search algorithms for navigating protein sequence space. Preliminary experiments show that evolutionary search with AF3 can already generate high-confidence nanobody and miniprotein candidates. Building on this, we will investigate reinforcement learning, discrete gradient descent in energy landscapes, and hybrid strategies that integrate ProteinMPNN-based mutation operators or active-learning loops. By benchmarking these algorithms on miniprotein, nanobody, and enzyme design tasks against existing methods (BindCraft, RFantibody, RFdiffusion2), we aim to clarify which combinations of losses and search strategies yield the most efficient and generalizable design pipeline.
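The evolutionary search mentioned above can be sketched as a simple elitist loop over sequences. This is a toy under stated assumptions rather than the project's actual pipeline: `toy_score` is a hypothetical stand-in for a structure-prediction confidence score (it only counts matches to an arbitrary motif), and the single-point mutation operator, population size, and generation count are illustrative choices.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def toy_score(seq, motif="HELLHELLHELL"):
    """Hypothetical stand-in for a model confidence score:
    fraction of positions that match an arbitrary motif."""
    return sum(a == b for a, b in zip(seq, motif)) / len(motif)

def mutate(seq, rng):
    """Replace one randomly chosen position with a random amino acid."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]

def evolve(score_fn, length, pop_size=20, n_keep=5, generations=50, seed=0):
    """Elitist evolutionary search: keep the top sequences, mutate to refill."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        parents = sorted(population, key=score_fn, reverse=True)[:n_keep]
        children = [
            mutate(rng.choice(parents), rng) for _ in range(pop_size - n_keep)
        ]
        population = parents + children
    return max(population, key=score_fn)

best = evolve(toy_score, length=12)
```

Because the top `n_keep` sequences are carried over unchanged, the best score never decreases between generations. In a real setting the scoring call would dominate the cost, so the number of evaluations per generation is the main budget to tune.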
I’m doing SuperUROP because I’m deeply interested in generative approaches in structural biology and genomics, and in how optimization methods can be used to frame these problems in new ways. What excites me most is the potential of deep learning not just to build powerful models, but to ask the right biological questions and to lead to tangible improvements in experimental settings. More broadly, I see this project as a way to explore what research could look like for me long term – whether that means pursuing a PhD or working in industry – and it feels like a strong model for both paths. I would also love to work toward writing a paper by the end of this experience.
