Battushig Myanganbayar
MIT EECS Undergraduate Research and Innovation Scholar
Learning to understand language from captioned videos
2016–2017
Electrical Engineering and Computer Science
- Artificial Intelligence & Machine Learning
Boris Katz
Parsing natural language from annotated videos
Semantic parsing is an approach to natural language processing which facilitates natural language interaction between human and machine. But current capabilities of semantic parsing are limited to basic tasks such as “Who is Ted Cruz?” Significant amount of annotated data and hand tuning of features is required to capture all semantic complexities of human language. In this project we will build a parsing model based on videos and their short descriptions. By doing so this semantic parser is both constrained by language grammar and physical interaction between objects in the video. The additional information from physical interaction reduces the amount of annotated data and makes it easier to build a general model with no hand tuning as laws of physical interactions are generic.
I’ve always dreamed of building a machine that could learn all the complexities of natural language. I believe that the courses in I’ve taken natural language processing and previous internship focused heavily on machine learning techniques prepare me well for my SuperUROP and I’m very excited to be closer to fulfilling my dream.