Victoria Dean
MIT EECS ā Cisco Undergraduate Research and Innovation Scholar
Understanding Video with Deep Convolutional Neural Networks
2015ā2016
Antonio Torralba
Deep convolutional neural networks (CNNs) have drastically improved the state of the art for image recognition and have also shown success in finer grained vision tasks, such as localization and image captioning. However, video processing with CNNs is a less explored area, since deep learning proves most successful when large amounts of labeled data are available, and labeling multiple frames from a video is a time consuming task. I hope to utilize the enormous amounts of unlabeled
video to understand video and predict what will happen next. In addition, once our models are trained, I want to examine what our models have learned about videos and the world.
I first became interested in computer vision during high school, when I gave basket-tracking a shot for a basketball-playing robot. I really enjoyed the inference class I took last year (6.008), and Iām looking forward to applying deep neural nets to video.