Large-scale Video Classification with Convolutional Neural Networks

Andrej Karpathy

George Toderici

Sanketh Shetty

Thomas Leung

Rahul Sukthankar

Li Fei-Fei

Proceedings of International Computer Vision and Pattern Recognition (CVPR 2014), IEEE

Abstract

Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image recognition problems. Encouraged by these results, we provide an extensive empirical evaluation of CNNs on large-scale video classification using a dataset of 1 million YouTube videos belonging to 487 classes. We study multiple approaches for extending the connectivity of a CNN in the time domain to take advantage of local spatio-temporal information, and we suggest a multi-resolution, foveated architecture as a promising way of regularizing the learning problem and speeding up training. Our best spatio-temporal networks display significant performance improvements compared to strong feature-based baselines (55.3% to 63.9%), but only a surprisingly modest improvement compared to single-frame models (59.3% to 60.9%). We further study the generalization performance of our best model by retraining the top layers on the UCF-101 Action Recognition dataset and observe significant performance improvements compared to the UCF-101 baseline model (63.3% up from 43.9%).
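
The abstract does not spell out how the multi-resolution, foveated input is built, so the following is only a minimal sketch of one plausible reading: a low-resolution "context" view of the whole frame paired with a full-resolution "fovea" crop of the center. The function names, the factor-of-2 downsampling, the half-size center crop, and the grayscale list-of-lists frame format are all assumptions made for illustration, not details taken from this page.

```python
from typing import List, Tuple

# Assumption: a frame is a grayscale image stored as rows of pixel values.
Frame = List[List[float]]


def context_stream(frame: Frame) -> Frame:
    """Downsample the whole frame by 2 by averaging each 2x2 block."""
    h = len(frame) // 2 * 2
    w = len(frame[0]) // 2 * 2
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def fovea_stream(frame: Frame) -> Frame:
    """Crop the central region (half the height and width) at full resolution."""
    h, w = len(frame), len(frame[0])
    crop_h, crop_w = h // 2, w // 2
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    return [row[left:left + crop_w] for row in frame[top:top + crop_h]]


def foveate(frame: Frame) -> Tuple[Frame, Frame]:
    """Return the (context, fovea) pair that two network streams would consume."""
    return context_stream(frame), fovea_stream(frame)


# Hypothetical 8x8 frame whose pixel value encodes its position.
frame = [[float(y * 8 + x) for x in range(8)] for y in range(8)]
context, fovea = foveate(frame)
```

Under these assumptions each stream holds a quarter of the original pixels, so the two streams together carry half the input of the full frame, which is one way such a design could reduce training cost.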
