Modelling Score Distributions Without Actual Scores

Abstract

Score-distribution models serve various practical purposes in search, such as results merging and threshold setting. This paper re-examines the basic ideas of the score-distributional approach to viewing and analysing the effectiveness of search systems. All recent score-distribution modelling work depends on the availability of actual scores generated by systems, and makes assumptions about these scores. Such work is therefore not applicable to systems that do not generate or reveal such scores, or whose scoring or ranking approach violates the assumptions. We demonstrate that at least some score-distributional ideas can be applied without access to real scores, knowing only the rankings produced (together with a single effectiveness metric based on relevance judgments). This new basic insight is illustrated by simulation experiments on a range of TREC runs, some of whose reported scores are clearly unsuitable for existing methods.