Wsabie: Scaling Up To Large Vocabulary Image Annotation


Image annotation datasets are becoming larger and larger, with tens of millions of images and tens of thousands of possible annotations. We propose a strongly performing method that scales to such datasets by simultaneously learning to optimize precision at the top of the ranked list of annotations for a given image and learning a low-dimensional joint embedding space for both images and annotations. Our method, called Wsabie, both outperforms several baseline methods and is faster and consumes less memory.