Dong Jonathan Hyug Lim

Scholar Title

MIT EECS Fano Undergraduate Research and Innovation Scholar

Research Title

Collecting causative evidences of diseases through NLP technology




Stephanie Seneff


A critical examination of the research literature and of medical databases such as hospital discharge data will guide our task of discovering plausible biological mechanisms to link causative effects to disease. Causative effects can be characterized as the modification of a protein's molecular function or the dysfunction of an organ system, as reported in the literature and in scientific databases. Effects will be identified through multiple methodologies (e.g., NLP, statistics, and data mining). A solution will be developed to distinguish factual from hypothetical statements. The scientific literature will be mined for statements reporting on the positive or negative regulation of molecular, cellular, biochemical, hormonal, and physiological processes.


I am interested in working on this project because I want to learn about mining information from unstructured data. Conducting this research in the medical domain would be a meaningful experience for me. I hope to learn about the NLP algorithms used to solve these kinds of problems, as well as how to use the packages and external libraries that support NLP work. I know Java, which I will use for this project.

