Bryan Jangeesingh
MIT EECS | Takeda Undergraduate Research and Innovation Scholar
Generalized Clinical Data Pipeline for Enhanced Health Records Analysis
2024–2025
Electrical Engineering and Computer Science
- Natural Language and Speech Processing
Collin M. Stultz
This project aims to extend an existing clinical data modeling tool to support unstructured text data, focusing on large-scale datasets such as the MIMIC database. The extension will enable multimodal modeling that combines time-series clinical data with text-based data such as clinical notes. We will also explore injecting domain knowledge from clinical textbooks and knowledge graphs to enrich the data representations, with the goal of improving the model’s predictive accuracy. The tool’s effectiveness will be evaluated in a transfer learning setting to assess how well it applies across diverse clinical tasks.
Through this SuperUROP, I aim to deepen my understanding of representation learning and machine learning, building on my experience from previous machine learning engineering internships. I also want to grow as a researcher and develop the intuition needed to make groundbreaking discoveries in applied AI. This project excites me because it offers the opportunity to contribute to real-world advances in healthcare.