IAD Index of Academic Documents
  • Home Page
  • About
    • About Izmir Academy Association
    • About IAD Index
    • IAD Team
    • IAD Logos and Links
    • Policies
    • Contact
  • Submit A Journal
  • Submit A Conference
  • Submit Paper/Book
    • Submit a Preprint
    • Submit a Book
  • Contact
  • Researcher
  • Volume:04 Issue:02
  • An Investigation On The Execution Of The Document Clustering Process On Internet News

An Investigation On The Execution Of The Document Clustering Process On Internet News

Authors : Metin Oktay Boz, Jale Bektaş
Pages : 113-119
View : 72 | Download : 61
Publication Date : 2024-12-31
Article Type : Other Papers
Abstract :Numerous investigations have focused on recognizing Internet news as valid documents. This study encompasses the application of text mining techniques to generate a TF-IDF matrix and the subsequent automatic identification and categorization of an optimal number of clusters. The research examines the impact of K-Means document clustering on internet news articles, integrating the User Engagement dataset which includes articles from various esteemed publishers. Prior to implementing the K-Means algorithm, several preprocessing steps were undertaken to prepare the TF-IDF matrix. Due to the absence of the content attribute data, the description attribute was selected for document clustering. During preprocessing, extraneous ASCII symbols, punctuation marks, line breaks, emails, mentions, internet extensions, stopwords, and words outside the 2 to 21 character range were removed. Words were stemmed to consolidate different forms of the same root. The Elbow method was employed on the TF-IDF matrix to determine the optimal number of clusters, followed by an analysis of results using prominent words and word clouds. Ultimately, five clusters of document counts 797, 408, 89, 364, and 8755 were identified.
Keywords : K-Means, TF-IDF, Kümeleme, Döküman Kümeleme

ORIGINAL ARTICLE URL

* There may have been changes in the journal, article,conference, book, preprint etc. informations. Therefore, it would be appropriate to follow the information on the official page of the source. The information here is shared for informational purposes. IAD is not responsible for incorrect or missing information.


Index of Academic Documents
İzmir Academy Association
CopyRight © 2023-2026