Exploring the limits of language modeling


This paper presents recent advances in large-scale neural language modeling, a task central to language understanding. Our goal is to show how well large neural language models can perform on a large language modeling benchmark corpus; we use the One Billion Word Benchmark. Using various techniques, our best single model significantly improves the state-of-the-art perplexity from 51.3 to 30.0, while an ensemble of models sets a new record, improving perplexity from 41.0 to 23.7.
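
For readers less familiar with the metric, the following is a minimal sketch of how perplexity is conventionally computed (the exponential of the average negative log-probability per token) and of what the reported numbers mean in relative terms. The function names and the toy log-probabilities are illustrative assumptions, not taken from the paper.

```python
import math


def perplexity(token_log_probs):
    """Standard definition: exp of the mean negative natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(old, new):
    """Fraction by which perplexity dropped going from `old` to `new`."""
    return (old - new) / old


# Toy data (an assumption for illustration): a model that assigns every
# token probability 1/30 has perplexity 30 by definition.
toy_log_probs = [math.log(1 / 30.0)] * 5
toy_ppl = perplexity(toy_log_probs)

# Perplexities reported in the abstract.
single_model_gain = relative_reduction(51.3, 30.0)
ensemble_gain = relative_reduction(41.0, 23.7)

print(f"toy perplexity: {toy_ppl:.1f}")
print(f"single model reduction: {single_model_gain:.1%}")
print(f"ensemble reduction: {ensemble_gain:.1%}")
```

Under this definition, both reported improvements correspond to a relative perplexity reduction of a little over 40 percent.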