IAD Index of Academic Documents
Journal : Turkish Journal of Electrical Engineering and Computer Science
Volume : 25 | Issue : 3

New use of the HITS algorithm for fast web page classification

Authors : MOHAMED NADJIB MEADI, MOHAMED CHAOUKI BABAHENINI, ABDELMALIK TALEB AHMED
Pages : 2015-2032
Publication Date : 0000-00-00
Article Type : Research Paper
Abstract : The immense number of documents published on the web requires automatic classifiers that organize and retrieve information from these large resources. Typically, automatic web page classifiers handle millions of web pages, tens of thousands of features, and hundreds of categories. Most classifiers use the vector space model to represent the dataset of web pages, and the components of each vector are computed using the term frequency-inverse document frequency (TFIDF) scheme. Unfortunately, TFIDF-based classifiers must cope with very large input data, which leads to long processing times and increased resource requirements. There is therefore a growing demand to alleviate these problems by reducing the size of the input data without affecting the classification results. In this paper, we propose a novel approach that improves web page classifiers by reducing the size of the input data (i.e. web page and feature reduction) using the hypertext induced topic search (HITS) algorithm. We employ the HITS results to weight the remaining features. We evaluate the performance of the proposed approach by comparing it with the TFIDF-based classifier and demonstrate that our approach significantly reduces the time needed for classification.
Keywords : Hypertext induced topic search, link analysis, support vector machine, web mining
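
The abstract names the HITS algorithm but does not say how its scores are computed or applied. As a rough illustration only, the sketch below shows the textbook hub/authority iteration on a toy link graph. It is not the authors' pipeline: the function name hits, the dictionary-based graph format, the fixed iteration count, and the example pages are all assumptions made for this sketch.

```python
from math import sqrt


def hits(links, iterations=50):
    """Textbook HITS iteration (illustrative sketch, not the paper's code).

    links: dict mapping a page to the list of pages it links to.
    Returns two dicts (hub, authority), each L2-normalized.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority of a page: sum of hub scores of the pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {p: v / norm for p, v in new_auth.items()}

        # Hub score of a page: sum of authority scores of the pages it links to.
        new_hub = {p: sum(auth[d] for d in links.get(p, ())) for p in pages}
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {p: v / norm for p, v in new_hub.items()}

    return hub, auth


# Hypothetical toy graph: pages "a" and "b" both link to "c", and "c" links to "d".
hub, auth = hits({"a": ["c"], "b": ["c"], "c": ["d"]})
```

In this toy graph, page "c" ends up with the highest authority score because two pages point to it. One plausible reading of the abstract is that scores like these help decide which pages and features to keep and how to weight them, but the exact rule is not given here.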



