The Action Verb Corpus (AVC) is a multimodal dataset of simple actions for robot learning. The extension introduced here is geared especially toward supervised learning of actions from human motion data. It comprises RGB-D videos of the test scene, grayscale videos from the user's perspective, human hand trajectories, object poses, and speech utterances. The three actions TAKE, PUT, and PUSH are annotated with action labels at different levels of granularity.