MIT EECS — MITRE Undergraduate Research and Innovation Scholar

Theory of Mind from Vision and Language Analysis




Boris Katz


Computer systems still lag far behind humans in emotional intelligence and theory of mind. My project will analyze natural language and body language in tandem to gauge mood and read subtext. I will build a system that, given a video of two humans interacting, analyzes their words, body language, and actions to predict their intentions and the state of their relationship (cold/warm, relative power/respect, relaxed/anxious, etc.). At first, the system will search videos for a finite, predefined set of these social cues and try to correlate them into a coherent estimate of the situation. Once the model has been trained on a large data set, it will be able to discover new markers for the possible social situations.


I am excited to push the boundaries at the intersection of computer vision and NLP. I got my start in research as a computer vision intern at Draper Laboratory in summer 2013, where I built a hazard detection system. This summer, I worked on Google Search’s semantic-parsing team to build a grammar inducer and experienced the complexities of natural language. I can’t wait to sink my teeth into academic research!

