Mining Top-K High Occupancy Itemsets
Authors : İrfan Yıldırım
Pages : 1723-1730
Doi : 10.34248/bsengineering.1744061
Publication Date : 2025-11-15
Article Type : Research Paper
Abstract : High-occupancy itemset mining aims to identify itemsets within databases whose occupancy values satisfy a specified minimum threshold set by the user. However, selecting a suitable threshold can be difficult for users. If the threshold is set too low, it can result in too many itemsets, causing inefficiencies in terms of time and memory usage during the mining process and making it harder for decision-makers to interpret the results. On the other hand, setting the threshold too high may lead to the omission of valuable itemsets. To overcome this limitation, this paper extends the classical high-occupancy itemset mining problem into the top-k high-occupancy itemset mining problem and proposes an algorithm called TKHOIM (top-k high-occupancy itemset miner) that applies three strategies to address the problem efficiently. In this approach, users can directly specify the number of itemsets to be discovered, denoted as k, without the need to define a minimum occupancy threshold. Experimental results demonstrate that TKHOIM is effective in discovering the top-k high-occupancy itemsets.
Keywords : Data mining, Itemset, Occupancy, Top-k, Pruning strategy
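To make the top-k formulation concrete, the following is a minimal Python sketch of the problem statement only. It is not the TKHOIM algorithm: the abstract does not describe its three strategies, so this sketch enumerates candidates exhaustively. The occupancy definition used here (the sum of |X| / |T| over every transaction T that contains itemset X) is an assumption, since the abstract does not define it, and the function names `occupancy` and `top_k_occupancy` are hypothetical. A min-heap of size k holds the best itemsets seen so far, so its smallest entry plays the role of a minimum occupancy threshold that rises automatically instead of being supplied by the user.

```python
import heapq
from itertools import combinations


def occupancy(itemset, transactions):
    """Assumed definition: sum of |X| / |T| over transactions T containing X."""
    x = frozenset(itemset)
    return sum(len(x) / len(t) for t in transactions if x <= t)


def top_k_occupancy(transactions, k):
    """Return the k itemsets with the highest occupancy as (occupancy, itemset) pairs.

    Exhaustive enumeration for illustration only; it applies no pruning,
    so it is practical only for very small databases.
    """
    if k <= 0:
        return []
    # Drop empty transactions so that len(t) is never zero.
    transactions = [frozenset(t) for t in transactions if t]
    items = sorted(set().union(*transactions))

    heap = []  # min-heap; heap[0] is the current k-th best itemset
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            occ = occupancy(candidate, transactions)
            if occ == 0:
                continue  # the candidate appears in no transaction
            if len(heap) < k:
                heapq.heappush(heap, (occ, candidate))
            elif occ > heap[0][0]:
                # Beats the current k-th best, so the internal threshold rises.
                heapq.heapreplace(heap, (occ, candidate))

    # Highest occupancy first; ties are ordered by itemset for determinism.
    return sorted(heap, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    db = [{"a", "b"}, {"a", "b", "c"}, {"a"}]
    for occ, itemset in top_k_occupancy(db, k=2):
        print(itemset, round(occ, 4))
```

Under the assumed definition, the user passes only `k`, and the threshold is implied by the k-th best itemset found. An efficient miner would additionally need upper bounds on occupancy to prune the search space, which this sketch deliberately omits.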
