MIT EECS - Wertheimer Undergraduate Research and Innovation Scholar
“Learning Structures Underlying Long-Range Transcriptional Regulation”
David K. Gifford
Understanding transcription factors and their three-dimensional interactions is essential to understanding gene regulation, the central dogma of biology, and cell identity. Transcription factors are proteins that bind to DNA and help regulate transcription, and the goal of this project is to use machine learning to model the interactions among transcription factors. First, we will use the ChIA-PET sequencing technique to detect these interactions. Then, given a DNA sequence, we will build a logistic regression model that predicts interligation reads, or which transcription factors will bind to each other. By training our model on DNA fragments selected by ChIA-PET, and testing on both held-out regions and biological experiments, we can verify the validity of our model.
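As a rough illustration of the modeling step, the sketch below trains a logistic regression classifier on pairs of DNA fragments and scores it on held-out pairs. Everything specific in it is an assumption made for illustration: the 2-mer count features, the function names (`kmer_features`, `pair_features`, `train`, `run_demo`), and especially the data, which is synthetic (GC-rich pairs labeled as interacting, AT-rich pairs as not) and stands in for real ChIA-PET fragments. It is not the project's actual pipeline.

```python
import math
import random
from itertools import product

K = 2  # k-mer length; an illustrative choice, not from the project
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]

def kmer_features(seq):
    """Normalized k-mer frequencies of one DNA fragment."""
    counts = dict.fromkeys(KMERS, 0)
    n = len(seq) - K + 1
    for i in range(n):
        counts[seq[i:i + K]] += 1
    return [counts[k] / max(1, n) for k in KMERS]

def pair_features(seq_a, seq_b):
    """Concatenate the features of the two fragments in a candidate pair."""
    return kmer_features(seq_a) + kmer_features(seq_b)

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train(X, y, lr=0.5, epochs=300):
    """Fit logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of the log loss
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def random_seq(rng, length, gc):
    """Sample a fragment with roughly the given GC fraction (synthetic data)."""
    return "".join(
        rng.choice("GC") if rng.random() < gc else rng.choice("AT")
        for _ in range(length)
    )

def run_demo(seed=0, n_pairs=200, n_train=150):
    """Train on synthetic pairs and return accuracy on the held-out pairs."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_pairs):
        label = rng.randint(0, 1)
        gc = 0.7 if label else 0.3  # placeholder signal, not real biology
        X.append(pair_features(random_seq(rng, 50, gc), random_seq(rng, 50, gc)))
        y.append(label)
    w, b = train(X[:n_train], y[:n_train])
    held_out = zip(X[n_train:], y[n_train:])
    correct = sum((predict_proba(w, b, xi) >= 0.5) == bool(yi) for xi, yi in held_out)
    return correct / (n_pairs - n_train)

if __name__ == "__main__":
    print("held-out accuracy:", run_demo())
```

With real data, the labels would come from ChIA-PET interligation reads rather than a GC-content rule, and the held-out set would be drawn from genomic regions not seen during training.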
I really enjoyed 6.036 this past spring, which inspired me to pursue machine learning further. This past summer I interned at Facebook on the Graph Search team, where I worked on a machine learning project that used logistic regression models to predict search keywords for user posts.