Design of information retrieval experiments: the sufficient topic set size for providing an adequate level of confidence

Turkish Journal of Electrical Engineering and Computer Science
Volume:21 Issue:Sup.2

Authors : Bekir Taner DİNÇER

Pages : 2218-2232

Doi:10.3906/elk-1203-20

Article Type : Research Paper

Abstract : In the current design of information retrieval (IR) experiments, a sample of 50 topics is generally agreed to be sufficient in size to perform dependable system evaluations. This article presents a detailed and formal explanation of how the second fundamental theorem of probability, the central limit theorem, can be used to estimate the sufficient size of a topic sample. The research performed in this article, using past Text Retrieval Conference data, reveals that, on average, 50 topics will be sufficient to provide a confidence level at or above 95% if the null hypothesis of an equal population mean average precision (MAP) (H0) is rejected for 2 IR systems having an observed difference in the MAP of 0.035 or more, whereas previous empirical research suggests a difference in the MAP of 0.05 or more. This study also shows that, for individual system pairs, the sample size required to provide 95% confidence in a declared significance may range from as small as 10 to as large as 722. Thus, for the design of IR experiments, it agrees with the common view that relying on average figures as a rule of thumb may well be misleading.
Keywords : Information retrieval system evaluation, topic set size, central limit theorem, generalizability theory
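
As a rough illustration of the kind of central-limit-theorem reasoning the abstract describes, the sketch below computes the smallest number of topics for which a two-sided normal confidence interval on the mean per-topic difference in average precision has a half-width no larger than a target difference. This is a generic textbook calculation, not necessarily the article's procedure; the function name `required_topics` and the standard deviation value of 0.1 are hypothetical and not taken from the article.

```python
import math
from statistics import NormalDist


def required_topics(delta: float, sd_diff: float, confidence: float = 0.95) -> int:
    """Smallest n with z * sd_diff / sqrt(n) <= delta under a normal approximation.

    delta      -- target difference in MAP between two systems (e.g. 0.035)
    sd_diff    -- assumed standard deviation of per-topic AP differences
                  (hypothetical input; it must be estimated from real data)
    confidence -- two-sided confidence level
    """
    if delta <= 0 or sd_diff <= 0:
        raise ValueError("delta and sd_diff must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sd_diff / delta) ** 2)


if __name__ == "__main__":
    # Illustrative values only: sd_diff = 0.1 is an assumption, not article data.
    for delta in (0.035, 0.05):
        print(delta, required_topics(delta, sd_diff=0.1))
```

Because the required size grows with the square of `sd_diff / delta`, system pairs with noisier per-topic differences need far more topics than an average figure suggests, which is in line with the wide range of sample sizes the abstract reports for individual system pairs.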
