Gianni J. Tipan
MIT EECS | Mason Undergraduate Research and Innovation Scholar
Inferring and Manipulating Object-Level Models from Visual Inputs via Segment Anything (SAM) and NeRF
2023-2024
Electrical Engineering and Computer Science
- Graphics and Vision
Frederic P. Durand
This project aims to derive 3D object models from visual inputs such as single-view videos or multi-angle photos. Segmenting 3D scenes into individual objects can improve AR/VR asset acquisition, 3D editing, and robotics. Knowing each object’s segmentation and category helps in understanding its geometry and appearance. We will leverage current computer vision segmentation models and novel-view synthesis techniques such as NeRFs and triplane representations. Our primary tools will be the Segment Anything Model (SAM) for monocular segmentation and NeRF Shop prototypes, with the aim of highlighting and manipulating objects within scenes and making them movable. Additional objectives include determining objects’ physical attributes, including geometry, appearance, and lighting.
I’m very excited to take part in a long-term research project through this SuperUROP. I’ve enjoyed editing videos and images before, and it’s been thrilling to watch editing capabilities evolve over the years. I hope to deepen my knowledge of emerging computer vision tools and the field’s nuances while contributing to it.