Alex Kimn
MIT EECS | CS+HASS Undergraduate Research and Innovation Scholar
Application of Artificial Data Generation to Deep-Learning Based Sentence Error Correction
2018–2019
EECS - CS+HASS
Dorothy W. Curtis
Takako Aikawa
One of the main problems in designing a good machine learning model is selecting an appropriate set of training data. When the pool of applicable data is small, creating a satisfactory model can be challenging. The goal of our project is thus to determine whether adding systematically generated artificial data can improve the quality of the final model. In particular, we will determine whether machine-generated sentence pairs can improve the accuracy of a Seq2Seq neural network that translates erroneous Japanese sentences into their corrected forms. The final application of this project is an online training tool that gives beginning language learners prompt and useful feedback on their mistakes.
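To illustrate what machine-generated sentence pairs could look like, here is a minimal sketch that assumes a simple particle-substitution rule. This rule is an assumption for illustration only and is not necessarily the generation method used in the project. The names `PARTICLES`, `make_error_pair`, and `generate_pairs` are hypothetical. The approach is deliberately naive: it treats any matching character as a particle, even when that character is part of a word.

```python
import random

# Hypothetical confusion set: particles that learners may mix up (an assumption).
PARTICLES = ["は", "が", "を", "に", "で"]


def make_error_pair(correct, rng):
    """Return (errored, correct) with one particle character replaced.

    Returns None when the sentence contains no character from PARTICLES.
    """
    positions = [i for i, ch in enumerate(correct) if ch in PARTICLES]
    if not positions:
        return None
    i = rng.choice(positions)
    # Pick a replacement that differs from the original character.
    wrong = rng.choice([p for p in PARTICLES if p != correct[i]])
    errored = correct[:i] + wrong + correct[i + 1:]
    return errored, correct


def generate_pairs(sentences, n_per_sentence=2, seed=0):
    """Build a list of (errored, correct) training pairs from correct sentences."""
    rng = random.Random(seed)  # seeded so the generated data is reproducible
    pairs = []
    for sentence in sentences:
        for _ in range(n_per_sentence):
            pair = make_error_pair(sentence, rng)
            if pair is not None:
                pairs.append(pair)
    return pairs


if __name__ == "__main__":
    for errored, correct in generate_pairs(["私は学校に行きます。"]):
        print(errored, "->", correct)
```

Each generated pair could then serve as one source/target example for a sequence-to-sequence model, with the errored sentence as input and the correct sentence as the expected output.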
By participating in this SuperUROP project, I aim to gain experience in solving a problem that combines my fields of interest. I am excited to apply my background in machine learning to a long-term research project, which I hope will deepen my understanding of natural language processing.