IAD Index of Academic Documents
  • PressAcademia Procedia
  • Volume:5 Issue:1

INFRASTRUCTURE WITH R PACKAGE FOR ANOMALY DETECTION IN REAL TIME BIG LOG DATA

Authors : Zirje Hasani
Pages : 181-189
DOI : 10.17261/Pressacademia.2017.588
Publication Date : 2017-06-30
Article Type : Research Paper
Abstract : Analyzing and detecting anomalies in huge amounts of data is a big challenge. On the one hand, we face the problem of storing a large amount of data; on the other, we must process it and detect anomalies in reasonable or even real time. Real-time analytics can be defined as the capacity to use all available enterprise data and sources at the moment they arrive or occur in the system. In this paper, we present an infrastructure that we have implemented to analyze data from big log files in real time. We also present algorithms used for anomaly detection in big data; the algorithms are implemented in the R language. The main components of the infrastructure are Redis, Logstash, Elasticsearch, the elastic-R client, and Kibana. We explore the implementation of several filters to post-process the log information and produce various statistics that suit our needs in analyzing log files containing SQL queries from a big national education system. The post-processing of the SQL queries focuses mainly on preparing the log information in an adequate format and on information extraction. The paper also compares the anomaly detection algorithms to conclude which of them best suits our needs. We add the elastic-R client to the infrastructure we developed for big data analytics in order to detect anomalies. The purpose of the analysis is to monitor performance and detect anomalies in order to prevent possible problems in real time.
Keywords : Big data, anomaly detection algorithm, log data, Logstash, Elasticsearch, elastic-R client, Kibana
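
The abstract does not name the anomaly detection algorithms it compares, and the paper's own implementations are in R. Purely as an illustration of the kind of check that could be run on a stream of SQL query durations extracted from the logs, the following is a minimal Python sketch of a rolling z-score detector. The function name, the window size, the threshold, and the sample durations are all hypothetical and are not taken from the paper.

```python
from collections import deque
from math import sqrt


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices of values that lie more than `threshold` standard
    deviations from the mean of the preceding `window` values.

    Illustrative only: this is not one of the algorithms evaluated in the paper.
    """
    history = deque(maxlen=window)  # sliding window of the most recent values
    anomalies = []
    for i, x in enumerate(values):
        if len(history) == window:
            mean = sum(history) / window
            variance = sum((h - mean) ** 2 for h in history) / window
            std = sqrt(variance)
            # Skip the check when the window is constant (std == 0).
            if std > 0 and abs(x - mean) / std > threshold:
                anomalies.append(i)
        history.append(x)
    return anomalies


# Hypothetical query durations in milliseconds, with one obvious spike.
durations_ms = [12, 11, 13, 12, 14, 12, 250, 13, 12, 11]
print(rolling_zscore_anomalies(durations_ms))
```

A detector of this shape processes one value at a time and keeps only a fixed-size window in memory, which is why it is a common baseline for streaming log data; note that a detected spike still enters the window and temporarily inflates the standard deviation for the values that follow it.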



