Nilai Manish Sarda
MIT EECS | Undergraduate Research and Innovation Scholar
Topological Word Embeddings
2018–2019
Electrical Engineering and Computer Science
- Theory of Computation
Justin Solomon
In natural language processing, we express the semantic content of a word using a high-dimensional embedding in Euclidean space. Many improvements in the field have come from engineering embedding methods that preserve notions of semantic distance and additivity. However, recent research has shown that, for specific tasks, a flat manifold may not accurately represent the true distance relationships between words. In particular, for language tasks that involve clustering (such as word sense disambiguation and topic modelling), we want to induce regions of locally high curvature on the underlying manifold. In this project, we will explore techniques to generate such embeddings and apply them to domain-specific tasks.
I am participating in SuperUROP because I want to gain experience in numerical optimization, especially as it applies to simulations. I have taken several advanced computer science and math courses, and I am excited to apply that knowledge to my research!