Ka Wai (Joanne) Lee
MIT EECS | Keel Foundation Undergraduate Research and Innovation
Generating Annotations from Wikipedia to Answer Questions
2017–2018
Electrical Engineering and Computer Science
- Natural Language and Speech Processing
Boris Katz
Much information on the Internet exists in the form of text. Wikipedia, specifically, has boxes called infoboxes that map attributes to values and contain highly condensed information. We are trying to organize that data so that it can be accessed in a more natural way. This project focuses on generating annotations, which are small, understandable facts, from Wikipedia infoboxes. I will use morphological analysis of relations and other natural language processing (NLP) techniques to generate annotations automatically. The algorithm will then be used to scour Wikipedia pages and generate annotations that will contribute to the START Natural-Language Question Answering System, which answers natural-language questions by searching its knowledge base.
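As a rough illustration of the idea, the sketch below turns infobox attribute-value pairs into short sentence-like annotations. It is a minimal template-based example, not the project's algorithm: it does no morphological analysis, and the function names, the sentence template, and the sample infobox are all assumptions made for illustration. It also does not reflect START's actual annotation format.

```python
import re


def attribute_to_phrase(attribute: str) -> str:
    """Turn an infobox key such as 'birth_place' or 'almaMater' into plain words.

    Hypothetical helper: real infobox keys are messier than this handles.
    """
    # Replace underscores and hyphens with spaces.
    words = re.sub(r"[_\-]+", " ", attribute).strip()
    # Insert a space at each lowercase-to-uppercase boundary (camelCase).
    words = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", words)
    return words.lower()


def generate_annotations(subject: str, infobox: dict) -> list:
    """Produce one simple sentence per non-empty attribute-value pair.

    Assumed template: "The <attribute> of <subject> is <value>."
    """
    annotations = []
    for attribute, value in infobox.items():
        value = value.strip()
        if not value:
            continue  # skip attributes with no value
        phrase = attribute_to_phrase(attribute)
        annotations.append(f"The {phrase} of {subject} is {value}.")
    return annotations


# Sample infobox written by hand for this example.
sample_infobox = {
    "birth_place": "Warsaw",
    "almaMater": "University of Paris",
    "spouse": "",
}

for sentence in generate_annotations("Marie Curie", sample_infobox):
    print(sentence)
```

A fixed template like this yields stilted phrasing ("The birth place of ... is ..."). Choosing a natural verb form for each relation, for example "was born in", is the kind of problem that morphological analysis of the relation names is meant to address.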
Through SuperUROP, I want to work with the START system to see how a system can understand words as knowledge. I will be working with natural-language annotations and learning how the START system works. I am excited about this project because I have always been curious about how a machine can understand the words we speak.