Image Transformer

Niki J. Parmar

Ashish Vaswani

Jakob Uszkoreit

Lukasz Kaiser

Noam Shazeer

Alexander Ku

Dustin Tran

International Conference on Machine Learning (ICML) (2018)

Download Google Scholar

Abstract

Recent work demonstrated significant progress towards modeling the distribution of natural images with tractable likelihood using deep neural networks. This was achieved by modeling the joint distribution of pixels in the image as the product of conditional distributions, thereby turning it into a sequence modeling problem, and applying recurrent or convolutional neural networks to it. In this work we instead build on the Transformer, a recently proposed network architecture based on self-attention, to model the conditional distributions in similar factorizations. We present two extensions of the network architecture, allowing it to scale to images and to take advantage of their two-dimensional structure. While conceptually simple, our generative models trained on two image data sets are competitive with or outperform the current state of the art on two different data sets, CIFAR-10 and ImageNet, as measured by log-likelihood. We also present results on image super-resolution with large magnification ratio with an encoder-decoder configuration of our architecture. In a human evaluation study, we show that our super-resolution models improve over previously published autoregressive super-resolution models in how often they fool a naive human observer by a factor of three. Lastly, we provide examples of images generated or completed by our various models which, following previous work, we also believe to look pretty cool.

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations  & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Image Transformer

Abstract

Research Areas

Learn more about how we conduct our research

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Image Transformer

Abstract

Research Areas

Learn more about how we conduct our research

AI/ML Foundations  & Capabilities