- Turkish Journal of Electrical Engineering and Computer Science
- Volume:19 Issue:6
Investigation of Luhn's claim on information retrieval
Authors : İlker KOCABAŞ, Bekir Taner DİNÇER, Bahar KARAOĞLAN
Pages : 993-1004
Article Type : Research Paper
Abstract : In this study, we show how Luhn's claim about the degree of importance of a word in a document can be related to information retrieval. His basic idea is transformed into z-scores as the weights of terms for the purpose of modeling term frequency (tf) within documents. The Luhn-based models presented in this paper are considered as the TF component of the proposed TF × IDF weighting schemes. Moreover, the final term weighting functions appropriate for the TF × IDF weighting scheme are applied to the TREC-6, -7, and -8 databases. The experimental results support Luhn's claim: mean average precision (MAP) is high for terms with frequencies around the mean frequency of terms within a document. On the other hand, a weighting that sharply discriminates between the importance of low/high frequencies and that of medium frequencies degrades retrieval performance. Therefore, any weighting scheme (TF) that is directly proportional to tf has the potential for high retrieval performance, provided that it optimally reflects the difference in importance across tf values and also optimally eliminates the terms that have high frequencies.
Keywords : Luhn, information retrieval, term weighting, indexing
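The z-score idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual weighting functions, so everything here is an assumption made for illustration: the function name `tf_z_scores`, the use of whitespace-split tokens, and the choice of the population standard deviation are all hypothetical. The sketch only standardizes each term's within-document frequency against the mean frequency of terms in that document, so that terms near the mean receive a z-score near zero.

```python
from collections import Counter
from statistics import mean, pstdev


def tf_z_scores(tokens):
    """Return a z-score for each term's frequency within one document.

    Illustrative only: this is not the weighting function from the paper,
    just a plain standardization of raw term frequencies.
    """
    counts = Counter(tokens)              # raw term frequency (tf) per term
    freqs = list(counts.values())
    mu = mean(freqs)                      # mean tf over distinct terms
    sigma = pstdev(freqs)                 # population standard deviation (assumed)
    if sigma == 0:
        # All terms occur equally often, so none deviates from the mean.
        return {term: 0.0 for term in counts}
    return {term: (tf - mu) / sigma for term, tf in counts.items()}


if __name__ == "__main__":
    doc = "a a a b b c".split()
    for term, z in sorted(tf_z_scores(doc).items()):
        print(term, round(z, 3))
```

In this toy document, the term with the middle frequency gets a z-score of zero, while the most and least frequent terms get scores of equal magnitude and opposite sign. A Luhn-style TF component would then map these scores to weights; how it does so is specified in the paper itself, not in this sketch.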