Research Project Title:
Using Wikipedia Infoboxes for Natural Language Question Answering
Abstract: START is a natural-language question-answering system developed at the Computer Science and Artificial Intelligence Laboratory (CSAIL). The system parses incoming questions into ternary expressions and matches the parsed queries against its knowledge base of natural-language annotations, which are also stored as ternary expressions. This project aims to provide an automatic way to generate high-precision annotations for Wikipedia articles so that START can answer a broader range of questions. The work has two parts. The first is extracting the necessary information from the many sources within Wikipedia and generating annotations that will actually match potential queries. The second is designing a robust system that can search efficiently across the whole of Wikipedia for real-time question answering and that can compile data and generate annotations at adequate speed.
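As a rough illustration of the idea in the abstract, the sketch below turns the fields of a small infobox into subject-relation-object triples and matches a query triple against them. It is a simplified stand-in, not START's actual representation or matcher: the names `Ternary`, `infobox_to_ternaries`, `match`, and `SAMPLE_INFOBOX` are hypothetical, and the sample infobox is made up for the example.

```python
from typing import List, NamedTuple, Optional


class Ternary(NamedTuple):
    """Hypothetical subject-relation-object triple; None acts as a wildcard in queries."""
    subject: Optional[str]
    relation: Optional[str]
    obj: Optional[str]


# Made-up infobox wikitext, used only for illustration.
SAMPLE_INFOBOX = """{{Infobox country
| name = France
| capital = Paris
| population_estimate = 68000000
}}"""


def infobox_to_ternaries(title: str, wikitext: str) -> List[Ternary]:
    """Turn each '| key = value' infobox line into a (title, key, value) triple."""
    facts = []
    for line in wikitext.splitlines():
        line = line.strip()
        # Skip the template header, the closing braces, and malformed lines.
        if not line.startswith("|") or "=" not in line:
            continue
        key, _, value = line[1:].partition("=")
        key, value = key.strip(), value.strip()
        if key and value:
            facts.append(Ternary(title, key.replace("_", " "), value))
    return facts


def match(query: Ternary, knowledge_base: List[Ternary]) -> List[Ternary]:
    """Return every fact that agrees with the query on all non-wildcard slots."""
    def fits(pattern: Optional[str], value: Optional[str]) -> bool:
        return pattern is None or (value is not None and pattern.lower() == value.lower())

    return [
        fact for fact in knowledge_base
        if fits(query.subject, fact.subject)
        and fits(query.relation, fact.relation)
        and fits(query.obj, fact.obj)
    ]


if __name__ == "__main__":
    kb = infobox_to_ternaries("France", SAMPLE_INFOBOX)
    # A question such as "What is the capital of France?" is assumed to parse
    # into a triple with an unknown object.
    print(match(Ternary("France", "capital", None), kb))
```

A real system would also need to map the many ways a question can be phrased onto infobox field names, which this exact-match sketch leaves out.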
“I am participating in SuperUROP because I want to gain more research experience and to contribute to a project that is valuable and fascinating for both InfoLab and myself. The project involves many aspects of computer science within my interests, such as natural language processing, potentially machine learning, and system design. I am excited to apply my knowledge and interests to my project.”