Mohamed Hassan Kane
MIT EECS Undergraduate Research and Innovation Scholar
Optimization-based Image-to-Image translation
Electrical Engineering and Computer Science
From text to image: Learning to generate and retrieve images across modalities
Recent progress in deep learning has provided important building blocks with which more sophisticated computer vision tasks can be solved. The goal of this project is to strengthen computers' ability to reason about images and objects by developing a framework that generates images from text. The project will build on recent work on learning aligned representations across modalities, in which the same object is represented by the same embedding whether it appears in a clip-art drawing or a picture. With deconvolution networks, we can now generate an image from a general feature description obtained using a convolutional neural network. The present work will go a step further and generate embeddings from text, which will then be used to generate images.
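The two-stage pipeline described above (text to embedding, then embedding to image) can be sketched roughly as follows. This is only an illustrative toy under stated assumptions, not the project's implementation: `text_to_embedding` and `make_decoder` are hypothetical names, the encoder is a simple word-hashing scheme, and the decoder is an untrained random linear layer standing in for a trained deconvolution network.

```python
import hashlib
import math
import random

EMBED_DIM = 8    # size of the shared embedding space (assumed for illustration)
IMAGE_SIDE = 4   # the decoder emits a tiny IMAGE_SIDE x IMAGE_SIDE grayscale image


def text_to_embedding(text, dim=EMBED_DIM):
    """Hypothetical text encoder: hash each word into a bucket, then L2-normalize.

    A real system would use a trained network that maps text into the same
    embedding space as the image encoder.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def make_decoder(dim=EMBED_DIM, side=IMAGE_SIDE, seed=0):
    """Hypothetical stand-in for a deconvolution network.

    Uses one random (untrained) linear layer followed by a sigmoid, so the
    output has the right shape and range but no meaningful content.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
               for _ in range(side * side)]

    def decode(embedding):
        pixels = []
        for row in weights:
            activation = sum(w * e for w, e in zip(row, embedding))
            pixels.append(1.0 / (1.0 + math.exp(-activation)))
        # Reshape the flat pixel list into a side x side grid.
        return [pixels[r * side:(r + 1) * side] for r in range(side)]

    return decode


decode = make_decoder()
embedding = text_to_embedding("a red car on a road")
image = decode(embedding)
```

The point of the sketch is the interface: any input that lands at the same point in the shared embedding space, whether it came from text, clip art, or a photograph, would be decoded to the same image.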
Over the last two years, I have been fascinated by the field of Artificial Intelligence for its blend of mathematical intuition and practical applications. After taking classes in statistical inference and machine learning during my sophomore year and doing research on machine learning in healthcare in my junior year, I want to explore how to give computers visual intelligence.