MIT EECS Undergraduate Research and Innovation Scholar
Comparison of Representations for Clinical Texts
Electrical Engineering and Computer Science
- Machine Learning
Improved Word Embeddings for Analysis of Clinical Texts
Clinical texts are produced in large volumes during the day-to-day work of medical professionals. The ability of intelligent programs to automatically process and understand these texts could open opportunities for deeper research and for applications that aid patients and doctors alike. Within Natural Language Processing, word embeddings are used to map plaintext words to low-dimensional vectors, allowing for more complex processing. However, specific attributes of clinical text reduce the effectiveness of standard text embeddings. This project seeks to develop and evaluate improved embeddings intended specifically for clinical text analysis tasks. We will initially focus on “spectral embedding methods,” which have been successful in learning embeddings on similar datasets.
I’m excited by the potential of machine learning to directly and positively affect the lives of everyday people, and this project (and the work of the Clinical Decision Making Group) moves in that direction. Through the SuperUROP program, I hope to become more confident in my research abilities and potentially find a candidate topic for my graduate thesis work.