Modeling similarity in the age of data


The process of applying mathematics to the real world is undergoing a radical change, driven by our ability to gather data at a massive scale. This is particularly true at Google, where we routinely process petabytes of human language and interact with many millions of users. In this talk, I describe some surprising realizations that arose from this data while I was trying to improve part of our search quality. It turns out that everything I thought I knew about similarity was wrong, and I should have been talking to psychologists.