Vanessa Xiao
Towards a Partially-Shared Latent Representation Space with Over-Parametrized Autoencoder-Based Architecture for Multi-Modal Data
2024–2025
Electrical Engineering and Computer Science
- AI and Machine Learning
Caroline Uhler
The advent of new technologies and computational tools for generating and analyzing biological data has ushered in many discoveries in representation learning for medical applications. Autoencoders can learn representations of multi-modal data that serve many downstream applications. However, it is important to understand which information is shared across modalities and which is modality-specific. Building on an over-parameterized autoencoder-based architecture that has already achieved a partially-shared latent space for two modalities, alongside modality-specific latent spaces, this project aims to extend the model to more than two modalities, using SHARE-seq data, cell painting images, and drug perturbations across different cell types and time points.
Having done undergraduate research at MIT for the past two years, I’m excited to embark on a more intensive research journey through SuperUROP this academic year in the field of computational biology. I’m interested in gaining more experience in representation learning for multi-modal data, and I hope to publish a paper on my work if it yields significant results.