
Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks

Abstract

Gated recurrent neural networks have become the default for modeling sequence data in various domains. The underlying mechanism that enables such remarkable performance is, however, not well understood. We aim to demystify the difference in trainability between vanilla and gated RNNs. We introduce a new gated variant of RNNs, the minimal recurrent neural network (minimalRNN). Its simple update equation enables us to analyze signal propagation using mean field theory and random matrix theory. We develop a closed-form critical initialization scheme that achieves dynamical isometry in both the vanilla and minimal RNN, resulting in a significant improvement in training RNNs. In contrast to the narrow region of good random initializations for the vanillaRNN, the minimalRNN enjoys a much broader range of good initializations (some easily achievable by adapting the bias term only), which explains the better trainability of gated RNNs. We demonstrate that the minimalRNN achieves performance comparable to that of its more complex counterparts, such as LSTMs and GRUs, on a language modeling task.
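To make the "simple update equation" concrete, the following is a minimal sketch of a single-gate recurrent update of the kind the minimalRNN uses. The exact form, the choice of tanh as the input embedding, the variable names, and the 1/sqrt(n) Gaussian initialization are illustrative assumptions, not a verbatim reproduction of the paper's equations:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def minimal_rnn_step(h_prev, x, U, W, b):
    """One step of a hypothetical single-gate recurrent update:
    z = Phi(x)                         (input embedding; tanh assumed here)
    u = sigmoid(U @ h_prev + W @ z + b) (single update gate)
    h = u * h_prev + (1 - u) * z        (convex combination of state and input)
    """
    z = np.tanh(x)
    u = sigmoid(U @ h_prev + W @ z + b)
    return u * h_prev + (1.0 - u) * z

rng = np.random.default_rng(0)
n = 8
# Mean-field-style random initialization: i.i.d. Gaussians scaled by 1/sqrt(n),
# zero bias; the paper's critical scheme would tune these parameters.
U = rng.normal(0.0, 1.0 / np.sqrt(n), (n, n))
W = rng.normal(0.0, 1.0 / np.sqrt(n), (n, n))
b = np.zeros(n)

h = np.zeros(n)
for t in range(5):
    h = minimal_rnn_step(h, rng.normal(size=n), U, W, b)
```

Because the state is a convex combination of the previous state and a tanh-squashed input, every coordinate of `h` stays in [-1, 1], which is part of what makes the update amenable to a mean-field analysis.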