The Solitude of Relevant Documents in the Pool

Authors:

Aldo Lipani

Mihai Lupu

Evangelos Kanoulas

Allan Hanbury

Type:

Speech with proceedings

Proceedings:

Proceedings of the 2016 ACM on International Conference on the Theory of Information Retrieval

Publisher:

ACM

Pages:

1989 - 1992

ISBN:

Year:

2016

Abstract:

Pool bias is a well-understood problem of test-collection-based benchmarking in information retrieval. The pooling method itself is designed to identify all relevant documents. In practice, 'all' translates to 'as many as possible given some budgetary constraints' and the problem persists, albeit mitigated. Recently, methods have been proposed to address this pool bias in previously created test collections for the evaluation measure precision at cut-off (P@n). Analyzing previous methods, we make the empirical observation that the distribution of the probability of providing new relevant documents to the pool, over the runs, is log-normal (when the pooling strategy is fixed depth at cut-off). We use this observation to calculate a prior probability of providing new relevant documents, which we then use in a pool bias estimator that improves upon previous estimates of precision at cut-off. Through extensive experimental results, covering 15 test collections, we show that the proposed bias correction method is the new state of the art, providing the closest estimates yet when compared to the original pool.
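
To make the notion of pool bias concrete, the following is a minimal sketch of fixed-depth pooling and P@n on toy data. It is not the estimator proposed in the paper; it only illustrates how a run that did not contribute to the pool can be underestimated. All names (`depth_k_pool`, `precision_at_n`, the document and run identifiers) are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Set


def depth_k_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Union of the top-`depth` documents of every pooled run."""
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


def precision_at_n(ranking: List[str], relevant: Set[str], n: int) -> float:
    """P@n; any document not in `relevant` counts as non-relevant."""
    return sum(1 for doc in ranking[:n] if doc in relevant) / n


# Toy data (assumed): the full set of relevant documents is normally unknown.
true_relevant = {"d1", "d2", "d3", "d7"}
pooled_runs = {
    "runA": ["d1", "d4", "d2", "d5"],
    "runB": ["d2", "d3", "d6", "d1"],
}
new_run = ["d7", "d1", "d8", "d3"]  # did not contribute to the pool

pool = depth_k_pool(pooled_runs, depth=2)
judged_relevant = pool & true_relevant  # only pooled documents get judged

pooled_p = precision_at_n(new_run, judged_relevant, n=2)
true_p = precision_at_n(new_run, true_relevant, n=2)
print(f"P@2 with pool judgments: {pooled_p}, with full judgments: {true_p}")
```

Here the new run retrieves the relevant document `d7`, which no pooled run placed in its top 2, so it was never judged and is treated as non-relevant. The gap between the two P@2 values is the kind of bias that a correction method aims to estimate.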

TU Focus:

Information and Communication Technology

Reference:

A. Lipani, M. Lupu, E. Kanoulas, A. Hanbury:
"The Solitude of Relevant Documents in the Pool";
Talk: International Conference on the Theory of Information Retrieval, Indianapolis, Indiana, USA; 24.10.2016; in: "Proceedings of the 2016 ACM on International Conference on the Theory of Information Retrieval", ACM, (2016), pp. 1989 - 1992.

Additional information

PDF Link:

http://publik.tuwien.ac.at/files/publik_253801.pdf

Last changed:

10.12.2016 16:42:58

TU Id:

253801

Accepted:

Accepted

Invited:

Department Focus:

Business Informatics

Info Link:

https://publik.tuwien.ac.at/showentry.php?ID=253801&lang=1

Abstract German:

Author List:

A. Lipani, M. Lupu, E. Kanoulas, A. Hanbury
