The Semantic Vectors Package: New Algorithms and Public Tools for Distributional Semantics


Distributional semantics is the branch of natural language processing that attempts to model the meanings of words, phrases and documents from the distribution and usage of words in a corpus of text. In the past three years, research in this area has been accelerated by the availability of the Semantic Vectors package, a stable, fast, scalable, and free software package for creating and exploring concepts in distributional models.

This paper introduces the broad field of distributional semantics, the role of vector models within this field, and describes some of the results that have been made possible by the Semantic Vectors package. These applications of Semantic Vectors have so far included contributions to medical informatics and knowledge discovery, analysis of scientific articles, and even Biblical scholarship. Of particular interest is the recent emergence of models that take word order and other ordered structures into account, using permutation of coordinates to model directional relationships and semantic predicates.