John Yang
Eric and Wendy Schmidt Center Funded Research and Innovation Scholar
Probabilistic Generative Modeling of Protein Structure and Sequence
2022–2023
Electrical Engineering and Computer Science
- Computational Fabrication and Manufacturing
Tommi S. Jaakkola
Protein engineering holds promise for advances in biotechnology that benefit human health, including vaccine and drug development. Existing methods rely on domain knowledge alongside extensive trial and error to engineer proteins with desired functions and properties. Our research will focus on using machine learning to build probabilistic models of protein structure and sequence. We will apply recent advances in deep generative models to capture the co-dependence between sequence and structure, which facilitates downstream applications in motif scaffolding and inverse protein folding. Our aim is to demonstrate, using in-silico metrics, that machine learning can improve the success rate of generating plausible proteins.
SuperUROP gives me the opportunity to channel my passions and coursework in machine learning and biology into a structured research project. As someone considering grad school, I hope to get a taste of the real-world research life cycle through listening to talks from top researchers, writing proposals, and giving poster presentations. Since SuperUROP counts as a class, it’s easier to carve out time for research during the busy semester.