Research Project Title:
Application of Performance Engineering for START
Abstract: This project researches and applies performance engineering to START, the natural language processing model of the MIT InfoLab. Wikipedia grows constantly, as articles are added and edited every day. To keep its language model accurate, START must periodically process all of Wikipedia, a run that can take up to 5 days. By improving existing aspects of START, such as data collection and storage, and by researching applications of threading and concurrency, I hope to reduce the time START needs to sort through data from Wikipedia.
I'm participating in this SuperUROP to learn how theoretical concepts apply to the reality of larger projects. It will give me the chance to apply the material I've been learning to a much larger body of work. In the process, I hope to learn more about the distinction between theoretical and practical programming.