MIT EECS | Texas Instruments Undergraduate Research and Innovation Scholar
Machine Translation of x86 Assembly to Source Code Comments
- Computer Systems
This project sits at the intersection of machine learning and the reverse engineering of binary executables. The goal is a model that takes executable binary data as input and generates source code comments describing, at a high level, what the assembly instructions are doing, in order to aid reverse engineering efforts. The first phase involves developing scripts that extract data from open source x86 binaries and their source code files. By matching regions of assembly code to their corresponding locations in a source file, we can create training sets. The main goal is to identify the neural network architecture that produces the most accurate output and to optimize it as far as possible.
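As a rough illustration of the data extraction phase, the sketch below pairs regions of assembly with the comment attached to the source line they came from. It is only a minimal sketch under stated assumptions, not the project's actual scripts: the function names `build_pairs` and `find_comment`, the `line_table` mapping from instruction address to source line (which in practice would have to be recovered from debug information), and the restriction to C-style `//` comments are all hypothetical choices made for the example.

```python
from collections import defaultdict


def find_comment(source_lines, line_no):
    """Return the // comment on the given 1-based line, or on the line
    directly above it, or None if neither exists.

    Assumption: only C-style // comments are considered, and a "//"
    inside a string literal would be misread as a comment.
    """
    line = source_lines[line_no - 1]
    if "//" in line:
        return line.split("//", 1)[1].strip()
    if line_no >= 2:
        prev = source_lines[line_no - 2].strip()
        if prev.startswith("//"):
            return prev[2:].strip()
    return None


def build_pairs(instructions, line_table, source_lines):
    """Build (assembly_region, comment) training pairs.

    instructions: list of (address, asm_text) tuples
    line_table:   dict mapping address -> 1-based source line number
                  (hypothetical input, assumed to come from debug info)
    source_lines: list of lines from the source file
    """
    # Group instruction text by the source line it was compiled from.
    by_line = defaultdict(list)
    for addr, asm in instructions:
        line_no = line_table.get(addr)
        if line_no is not None:
            by_line[line_no].append(asm)

    # Keep only regions whose source line has a non-empty comment.
    pairs = []
    for line_no in sorted(by_line):
        comment = find_comment(source_lines, line_no)
        if comment:
            pairs.append(("\n".join(by_line[line_no]), comment))
    return pairs


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    source = [
        "int add(int a, int b) {",
        "    // sum the two arguments",
        "    return a + b;",
        "}",
    ]
    instructions = [
        (0x0, "push rbp"),
        (0x1, "mov rbp, rsp"),
        (0x4, "mov eax, edi"),
        (0x6, "add eax, esi"),
        (0x8, "pop rbp"),
        (0x9, "ret"),
    ]
    line_table = {0x0: 1, 0x1: 1, 0x4: 3, 0x6: 3, 0x8: 4, 0x9: 4}
    for asm, comment in build_pairs(instructions, line_table, source):
        print(comment)
        print(asm)
```

Grouping by source line is just one possible granularity; function-level or basic-block-level regions may turn out to be a better fit, depending on how comments are distributed in the source files.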
I have enjoyed working on a related UROP for the past three years and am ready to take my research more seriously by participating in SuperUROP and sharing meaningful results with a broader audience. I hope to continue gaining experience in computer systems security and to make a positive contribution to existing work. I am most excited about applying machine learning concepts that I have previously seen only in class.