Research Project Title:
Artificial Agents that Interact with Humans
Abstract: While current state-of-the-art reinforcement learning algorithms can learn near-optimal policies for specific tasks, they remain constrained to tasks with predetermined objectives and reward functions. In the real world, human preferences not only differ between individuals but also change over time. We posit that humans condition their behavior on contextual cues from their environment using only limited layers of logical reasoning, and we propose learning a low-dimensional prior over the preferences of human evaluators. We aim to use our results to train robotic reinforcement learning agents that can infer and respond to shifting human preferences in a few-shot setting.
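To make the idea concrete, here is a minimal sketch of few-shot preference inference under a low-dimensional prior. Everything in it is an illustrative assumption, not the project's actual method: trajectories are summarized by hypothetical feature vectors, an evaluator's preference is a latent vector z with a Gaussian prior, pairwise choices follow a Bradley-Terry likelihood, and z is recovered by MAP estimation from a handful of comparisons.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each trajectory is summarized by a feature vector phi,
# and an evaluator's preference is a low-dimensional latent vector z.
# Reward of a trajectory under latent z is the inner product z @ phi.
D = 3  # assumed latent preference dimension

def neg_log_posterior(z, comparisons, sigma=1.0):
    """Negative log-posterior of latent z given pairwise comparisons.

    comparisons: list of (phi_a, phi_b) where the evaluator preferred a over b.
    Likelihood: Bradley-Terry, P(a > b) = sigmoid(z @ phi_a - z @ phi_b).
    Prior: isotropic Gaussian N(0, sigma^2 I) -- the "low-dimensional prior".
    """
    nll = 0.0
    for phi_a, phi_b in comparisons:
        diff = z @ (phi_a - phi_b)
        nll += np.log1p(np.exp(-diff))  # equals -log sigmoid(diff)
    return nll + 0.5 * (z @ z) / sigma**2

def map_estimate(comparisons, steps=500, lr=0.1):
    """Few-shot MAP inference of the latent preference by gradient descent."""
    z = np.zeros(D)
    eps = 1e-5
    for _ in range(steps):
        # central-difference gradient, kept simple for illustration
        g = np.zeros(D)
        for i in range(D):
            e = np.zeros(D)
            e[i] = eps
            g[i] = (neg_log_posterior(z + e, comparisons)
                    - neg_log_posterior(z - e, comparisons)) / (2 * eps)
        z -= lr * g
    return z

# Simulate an evaluator with a hidden preference and a few noiseless comparisons.
z_true = np.array([1.0, -0.5, 0.0])
comparisons = []
for _ in range(20):
    phi_a, phi_b = rng.normal(size=D), rng.normal(size=D)
    if z_true @ phi_a < z_true @ phi_b:  # relabel so phi_a is the preferred one
        phi_a, phi_b = phi_b, phi_a
    comparisons.append((phi_a, phi_b))

z_hat = map_estimate(comparisons)
# The MAP estimate should point in roughly the same direction as z_true.
cos = z_hat @ z_true / (np.linalg.norm(z_hat) * np.linalg.norm(z_true))
print(round(cos, 2))
```

In a full system, the Gaussian prior would be replaced by a prior learned across many evaluators, so that a new evaluator's preference can be pinned down from only a few comparisons.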
I thrive on challenging problems. Drawing on my experience as a UROP student in the Improbable AI Lab and the theoretical grounding from classes such as 6.867 and 6.884, I am excited to probe the boundaries of what RL can achieve.