Ayush Jain, Andrew Szot, Joseph Lim
A fundamental trait of intelligence is the ability to achieve goals in the face of novel circumstances. However, standard reinforcement learning typically assumes a fixed set of actions to choose from, so completing tasks with a new action space requires time-consuming retraining. The ability to seamlessly utilize novel actions is crucial for adaptable agents. We take a step in this direction by introducing the problem of learning to generalize decision-making to unseen actions, based on action information acquired separately from the task. To approach this problem, we propose a two-stage framework where the agent first infers action representations from acquired action observations and then learns to use these representations in reinforcement learning with added generalization objectives. We demonstrate that our framework enables zero-shot generalization to new actions in sequential decision-making tasks, such as selecting unseen tools to solve physical reasoning puzzles and stacking towers with novel 3D shapes.
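The two-stage idea can be sketched in miniature. Everything below is an illustrative assumption, not the paper's implementation: the "representation" is just a mean of observed feature vectors standing in for a learned encoder, and the bilinear scoring function stands in for a learned policy. The key point the sketch shows is that the same policy machinery applies unchanged to representations of actions never seen during training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 (hypothetical stand-in for a learned encoder): infer a
# representation of an action from a set of observations of it.
def infer_action_representation(action_observations):
    # action_observations: (num_obs, feature_dim) array
    return np.mean(action_observations, axis=0)

# Stage 2 (hypothetical stand-in for a learned policy): score each
# available action by combining the state with its representation.
def policy_scores(state, action_reprs, weights):
    return np.array([state @ weights @ r for r in action_reprs])

def select_action(state, action_reprs, weights):
    return int(np.argmax(policy_scores(state, action_reprs, weights)))

# Toy usage: the train-time and test-time action sets are disjoint.
state = rng.normal(size=4)
weights = rng.normal(size=(4, 4))  # stand-in for trained parameters
train_actions = [rng.normal(size=(5, 4)) for _ in range(3)]
unseen_actions = [rng.normal(size=(5, 4)) for _ in range(4)]

train_reprs = [infer_action_representation(a) for a in train_actions]
unseen_reprs = [infer_action_representation(a) for a in unseen_actions]

# Because the policy conditions on representations rather than on a
# fixed action index, it can select from the new action set zero-shot.
choice = select_action(state, unseen_reprs, weights)
```

Note the design choice this illustrates: by decoupling "what an action is" (its inferred representation) from "which slot it occupies," the policy's input space stays fixed even as the action set changes.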