Research Project Title:
Learning Representations for 3D Scene View Synthesis
Abstract: Estimation of 3D structure from 2D image sequences has been attracting increasing attention in the research community. Conventional Structure from Motion (SfM) algorithms are widely used to establish the spatial relationship between images in order to reconstruct 3D point clouds, and some recent works attempt to solve SfM using deep learning. However, to generate a 3D representation, current methods all require a large sequence of scene-specific 2D input images along with their corresponding camera parameters, making the algorithms less generalizable. I therefore propose a two-step supervised deep learning approach to generating new views of 3D objects and scenes from 2D input images: first, matching multi-view input images using feature-metric bundle adjustment to jointly optimize scene structure and camera motion; then, generating unseen views of the same scene by encoding the input images into a latent 3D voxel representation and projecting it into 2D. I believe this will improve the quality, scope, and generalization capability of the 3D scene view synthesis process, and may have further applications in other fields, such as 3D scene representation.
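The second step above, decoding a latent 3D voxel grid into a 2D view, can be sketched in miniature. The following is an illustrative toy example only, not the proposed method: it orthographically projects a hypothetical (depth, height, width) occupancy grid into an image by alpha-compositing slices front to back, which is the basic idea behind differentiable voxel-to-image projection. All names and shapes here are assumptions for illustration.

```python
import numpy as np

def project_voxels(voxels: np.ndarray) -> np.ndarray:
    """Toy orthographic projection of a (D, H, W) occupancy grid.

    Composites depth slices front to back: each slice contributes its
    occupancy weighted by the transmittance (light surviving) up to it.
    """
    alpha = np.clip(voxels, 0.0, 1.0)
    # Transmittance before each slice: product of (1 - alpha) of all
    # slices in front of it; the first slice sees full transmittance.
    transmittance = np.cumprod(1.0 - alpha, axis=0)
    transmittance = np.concatenate(
        [np.ones_like(alpha[:1]), transmittance[:-1]], axis=0
    )
    return (alpha * transmittance).sum(axis=0)  # (H, W) image

# Hypothetical 4x2x2 grid with one fully opaque slab at depth 1.
voxels = np.zeros((4, 2, 2))
voxels[1] = 1.0
image = project_voxels(voxels)  # every pixel sees the opaque slab
```

In a learned model this projection would be one differentiable layer among many, so gradients from a 2D reconstruction loss can flow back into the latent voxel representation.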
"Through participating in SuperUROP, I hope to familiarize myself with current research trends and to develop better research skills, such as writing papers and giving presentations. Through this project, I am excited to learn more about generative models, image synthesis, 3D deep learning, and 3D scene representation. I look forward to applying computer vision algorithms to real-world modeling, as well as to the diverse applications that our research can lead to."