Research Project Title:
Application of Artificial Data Generation to Deep-Learning Based Sentence Error Correction
Abstract: One of the main problems in designing a good machine learning model is selecting an appropriate set of training data. When the pool of applicable data is small, creating a satisfactory model can be challenging. The goal of our project is thus to determine whether adding systematically produced sample data can improve the quality of the final model. In particular, we will determine whether machine-generated sentence pairs can improve the accuracy of a Seq2Seq neural network in translating erroneous Japanese sentences into correct ones. The intended application of this project is an online training tool that gives beginning language learners prompt and useful feedback on their mistakes.
By participating in this SuperUROP, I aim to gain experience in solving a problem that combines my fields of interest. I am excited to apply my background in machine learning to a long-term research project, and I hope that it will deepen my understanding of natural language processing.
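
As a rough illustration of the machine-generated sentence pairs mentioned in the abstract, the sketch below corrupts a correct Japanese sentence by swapping one particle, yielding an (erroneous, correct) pair that could serve as Seq2Seq training data. It is not the project's actual generator: the confusion table, the function names `make_error_pair` and `generate_pairs`, and the naive character-level matching (which can also hit characters that are not acting as particles) are all assumptions made for this example.

```python
import random

# Hypothetical confusion sets: particles a beginner might mix up.
# Illustrative only; a real system would derive these from learner data.
PARTICLE_CONFUSIONS = {
    "は": ["が"],
    "が": ["は"],
    "に": ["で"],
    "で": ["に"],
    "を": ["が"],
}


def make_error_pair(correct, rng):
    """Return (erroneous, correct) with one particle-like character swapped.

    Returns None when the sentence contains no character from the table.
    Matching is per character, so this is only a naive approximation.
    """
    positions = [i for i, ch in enumerate(correct) if ch in PARTICLE_CONFUSIONS]
    if not positions:
        return None
    i = rng.choice(positions)
    replacement = rng.choice(PARTICLE_CONFUSIONS[correct[i]])
    erroneous = correct[:i] + replacement + correct[i + 1:]
    return erroneous, correct


def generate_pairs(sentences, seed=0):
    """Build artificial (erroneous, correct) pairs from correct sentences."""
    rng = random.Random(seed)  # seeded so the generated data is reproducible
    pairs = []
    for sentence in sentences:
        pair = make_error_pair(sentence, rng)
        if pair is not None:
            pairs.append(pair)
    return pairs


if __name__ == "__main__":
    for erroneous, correct in generate_pairs(["私は学生です。", "学校に行きます。"]):
        print(erroneous, "->", correct)
```

Each generated pair keeps the correct sentence as the target and the corrupted sentence as the input, so the pairs can be mixed with real learner data to enlarge a small training set.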