MIT EECS Lincoln Labs Undergraduate Research and Innovation Scholar
William T. Freeman
Electrical Engineering and Computer Science
Flexible Object Representations for Physical Scene Understanding
How can a vision system acquire common-sense knowledge from visual input? Our goal is to infer physical object properties from unlabeled videos by combining physics knowledge with machine learning. We aim to build a system that pairs a powerful low-level visual recognition component with a physics simulator. The system takes a two-step approach. It first recovers 3D structural properties of the objects using a combination of convolutional neural networks and intermediate 2D representations such as edge maps and surface normal maps. It then uses a number of unit tests, such as evaluating the scene's stability, discovering a stable configuration of objects, and predicting future object motions, to refine its estimates by comparing the output of the physics engine with the ground truth.
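The refinement step can be illustrated with a toy sketch. This is not the project's code: the sliding-block simulator, the friction coefficient being estimated, and the names simulate and refine are all hypothetical stand-ins chosen for illustration. The sketch only shows the general idea of scoring candidate physical properties by how closely a simulator's output matches the observed motion.

```python
# Toy illustration only: a hand-written 1D simulator stands in for a real
# physics engine, and a friction coefficient stands in for the physical
# properties being inferred. All names here are hypothetical.

def simulate(mu, v0=2.0, dt=0.1, steps=10, g=9.8):
    """Return the positions of a block sliding with friction coefficient mu."""
    x, v, positions = 0.0, v0, []
    for _ in range(steps):
        v = max(0.0, v - mu * g * dt)  # friction slows the block, never reverses it
        x += v * dt
        positions.append(x)
    return positions


def refine(observed, candidates):
    """Pick the candidate whose simulated motion best matches the observation."""
    def error(mu):
        simulated = simulate(mu)
        return sum((s - o) ** 2 for s, o in zip(simulated, observed))
    return min(candidates, key=error)


# Stand-in for ground-truth motion recovered from video.
observed = simulate(0.15)
candidates = [i / 100 for i in range(5, 31, 5)]  # 0.05, 0.10, ..., 0.30
best = refine(observed, candidates)
print(best)
```

A real system would replace the grid search with a more efficient search or learned proposal, and the simulator with a full 3D physics engine, but the compare-and-refine loop has the same shape.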
I started working on my SuperUROP project last year as a regular UROP. The project covers several major computer vision topics, and working on it has taught me a great deal about current state-of-the-art computer vision research. I am excited to continue this project and explore the many interesting directions it could take.