Researchers can use the “Moments in Time” project to train AI systems to recognize and understand actions and events in videos
Imagine if we had to explain all of the actions that take place on Earth to aliens. We could provide them with non-fiction books or BBC documentaries. We could try to explain verbally what twerking is. But, really, nothing conveys an action better than a three-second video clip.
[GIF: Falling asleep, via GIPHY]
Thanks to researchers at MIT and IBM, we now have a clearly labeled dataset of more than one million such clips. The dataset, called Moments in Time, captures hundreds of common actions that occur on Earth, from the beautiful moment of a flower opening to the embarrassing instance of a person tripping and eating dirt.
[GIF: Tripping, via GIPHY]
(We’ve all been there.)
Moments in Time, however, wasn’t created to provide a bank of GIFs, but to lay the foundation for AI systems to recognize and understand actions and events in videos. To date, massive labeled image datasets such as ImageNet for object recognition and Places for scene recognition have played a major role in the development of more accurate models for image classification and understanding.
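For readers curious about what "laying the foundation" looks like in practice, here is a minimal Python sketch of how one might index a local copy of a dataset like this for training. The folder layout (one directory per action label, each holding short video clips) and the `DATASET_ROOT` path are illustrative assumptions, not the project's documented structure.

```python
from pathlib import Path
import random

# Assumed local copy of the dataset. The layout sketched here (one folder
# per action label, each holding short .mp4 clips) is an illustrative
# assumption, not the project's documented structure.
DATASET_ROOT = Path("Moments_in_Time/training")

def index_clips(root: Path) -> list[tuple[Path, str]]:
    """Pair each clip with the action label implied by its parent folder."""
    samples = []
    for label_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for clip in sorted(label_dir.glob("*.mp4")):
            samples.append((clip, label_dir.name))
    return samples

if __name__ == "__main__":
    if not DATASET_ROOT.exists():
        raise SystemExit(f"dataset not found at {DATASET_ROOT}")
    samples = index_clips(DATASET_ROOT)
    labels = {label for _, label in samples}
    print(f"{len(samples)} clips across {len(labels)} action labels")
    if samples:
        clip, label = random.choice(samples)
        print(f"example: {clip.name} -> {label}")
```

An index of (clip, label) pairs like this is the typical starting point for training a video classifier: a model sees the three-second clip and learns to predict the action label, just as models trained on ImageNet learn to name the object in a still image.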