
Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction

Abstract

(Note: the first author's email should be xuwei@cs.nyu.edu.)

Distant supervision has attracted recent interest for training information extraction systems because it does not require any human annotation but rather employs existing knowledge bases to heuristically label a training corpus. However, previous work has failed to address the problem of false negative training examples mislabeled due to the incompleteness of knowledge bases. To tackle this problem, we propose a simple yet novel framework that integrates a passage retrieval model using coarse features into a state-of-the-art relation extractor using multi-instance learning with fine features. We adapt the information retrieval technique of pseudo-relevance feedback to expand knowledge bases, assuming entity pairs in top-ranked passages are more likely to express a relation. Our proposed technique significantly improves the quality of distantly supervised relation extraction, boosting recall from 47.7% to 61.2% with a consistently high level of precision of around 93% in the experiments.
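The knowledge base expansion step can be illustrated with a minimal sketch. It assumes passages have already been scored by some retrieval model, and it simply treats the entity pairs in the top-ranked passages for each relation as additional facts. The function name `expand_knowledge_base`, the `top_k` cutoff, the dictionary layout, and the scores are all hypothetical choices made for illustration; the abstract does not specify how the ranking or the cutoff are actually implemented.

```python
from collections import defaultdict


def expand_knowledge_base(kb, passages, top_k=2):
    """Return a copy of `kb` extended with pseudo-relevant entity pairs.

    kb:       set of (entity1, relation, entity2) triples.
    passages: list of dicts with keys 'pair' (entity1, entity2),
              'relation', and 'score' (a retrieval score; hypothetical).
    top_k:    number of top-ranked passages per relation to trust
              (an assumed, illustrative cutoff).
    """
    # Group candidate passages by the relation they were retrieved for.
    by_relation = defaultdict(list)
    for passage in passages:
        by_relation[passage["relation"]].append(passage)

    expanded = set(kb)
    for relation, candidates in by_relation.items():
        # Rank passages by retrieval score, highest first.
        ranked = sorted(candidates, key=lambda p: p["score"], reverse=True)
        # Pseudo-relevance feedback: assume the top-ranked passages
        # express the relation and add their entity pairs to the KB.
        for passage in ranked[:top_k]:
            entity1, entity2 = passage["pair"]
            expanded.add((entity1, relation, entity2))
    return expanded


# Toy data: the KB is missing the fact about Ada Lovelace.
kb = {("Barack Obama", "born_in", "Honolulu")}
passages = [
    {"pair": ("Barack Obama", "Honolulu"), "relation": "born_in", "score": 0.95},
    {"pair": ("Ada Lovelace", "London"), "relation": "born_in", "score": 0.90},
    {"pair": ("Ada Lovelace", "Cambridge"), "relation": "born_in", "score": 0.20},
]

expanded = expand_knowledge_base(kb, passages, top_k=2)
new_facts = expanded - kb
print(new_facts)
```

With the expanded knowledge base, sentences mentioning the newly added pair would be labeled as positive examples during distant supervision rather than as false negatives, while the low-scoring pair is left out.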