MIT EECS Analog Devices Undergraduate Research and Innovation Scholar
William T. Freeman
Electrical Engineering and Computer Science
Structured autoencoders for vision-as-inverse graphics
Reconstructing and retrieving 3-D shapes from 2-D images has long been a focus of computer vision. It is a challenging problem because projecting a 3-D shape onto a 2-D image is inherently ambiguous. In this project, we plan to tackle this problem by adopting recent advances in deep convolutional generative adversarial networks (DCGANs), with the aim of generating 3-D shapes from vectors. We also plan to train a convolutional neural network that maps an image to the corresponding vector. To further improve performance, we plan to explore different constraints during network training that encode object continuity, physical stability, and contour consistency.
I took a graduate-level machine learning class in the fall of my junior year and then started UROPs in computer vision and deep learning. One of my UROPs studied the stability of piles of blocks in images, and another studied the use of DCGANs on 3-D objects. The most exciting part of this project is the opportunity to combine these experiences and build a high-performance object retrieval model.