
Will Kay

The Kinetics Human Action Video Dataset

May 19, 2017
Will Kay, Joao Carreira, Karen Simonyan, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, Tim Green, Trevor Back, Paul Natsev, Mustafa Suleyman, Andrew Zisserman

[Figures 1–4 of the paper]

We describe the DeepMind Kinetics human action video dataset. The dataset contains 400 human action classes, with at least 400 video clips per action. Each clip lasts around 10 seconds and is taken from a different YouTube video. The actions are human-focused and cover a broad range of classes, including human-object interactions such as playing instruments, as well as human-human interactions such as shaking hands. We describe the statistics of the dataset and how it was collected, and give baseline performance figures for neural network architectures trained and tested for human action classification on this dataset. We also carry out a preliminary analysis of whether imbalance in the dataset leads to bias in the classifiers.
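The per-class clip counts mentioned above (at least 400 clips for each of the 400 classes) and the imbalance analysis can be illustrated with a minimal sketch. The record layout and function names below are assumptions for illustration, not the released Kinetics annotation format.

```python
from collections import Counter

# Hypothetical clip records: (action_label, youtube_id, start_s, end_s).
# The field layout is illustrative, not the released Kinetics schema.
clips = [
    ("playing guitar", "vid001", 10.0, 20.0),
    ("playing guitar", "vid002", 5.0, 15.0),
    ("shaking hands",  "vid003", 0.0, 10.0),
]

def class_counts(clips):
    """Count clips per action class; a large spread between the most and
    least populated classes is the kind of imbalance the paper analyses."""
    return Counter(label for label, _, _, _ in clips)

counts = class_counts(clips)
min_count = min(counts.values())  # in Kinetics this would be >= 400
```

A quick check like `min(counts.values())` is enough to verify the dataset's minimum-clips-per-class guarantee on real annotation files.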


Teaching Machines to Read and Comprehend

Nov 19, 2015
Karl Moritz Hermann, Tomáš Kočiský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, Phil Blunsom

[Figures 1–4 of the paper]

Teaching machines to read natural language documents remains an elusive challenge. Machine reading systems can be tested on their ability to answer questions posed about the contents of documents they have seen, but until now large-scale training and test datasets have been missing for this type of evaluation. In this work we define a new methodology that resolves this bottleneck and provides large-scale supervised reading comprehension data. This allows us to develop a class of attention-based deep neural networks that learn to read real documents and answer complex questions with minimal prior knowledge of language structure.
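The data-construction methodology the abstract refers to turns document-summary pairs into cloze-style questions: one entity in a summary sentence is masked to form the query, and entity names are anonymised so answers cannot be inferred from world knowledge alone. A minimal sketch, with function names and markers that are assumptions rather than the paper's released tooling:

```python
def make_cloze(summary_sentence, entity, placeholder="@placeholder"):
    """Mask one entity in a summary sentence to form a cloze query;
    the masked entity becomes the answer."""
    return summary_sentence.replace(entity, placeholder), entity

def anonymise(text, entities):
    """Replace entity names with abstract markers (@entity0, @entity1, ...),
    so a model must read the document rather than recall the entity."""
    for i, ent in enumerate(entities):
        text = text.replace(ent, f"@entity{i}")
    return text

query, answer = make_cloze("Alice was born in Paris", "Paris")
document = anonymise("Alice was born in Paris in 1990", ["Alice", "Paris"])
```

Pairing each anonymised document with such a query and answer yields the kind of large-scale supervised reading comprehension data the work describes.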

* Appears in: Advances in Neural Information Processing Systems 28 (NIPS 2015). 14 pages, 13 figures 