The R Journal: accepted article

This article will be copy edited and may be changed before publication.

anomalyDetection: Implementation of Augmented Network Log Anomaly Detection Procedures
Robert J. Gutierrez, Bradley C. Boehmke, Kenneth W. Bauer, Cade M. Saie and Trevor J. Bihl

Abstract As the number of cyber-attacks continues to grow on a daily basis, so does the delay in threat detection. For instance, in 2015, the Office of Personnel Management (OPM) discovered that approximately 21.5 million individual records of Federal employees and contractors had been stolen. On average, the time between an attack and its discovery is more than 200 days. In the case of the OPM breach, the attack had been going on for almost a year. Currently, cyber analysts inspect numerous potential incidents on a daily basis, but have neither the time nor the resources available to perform such a task. anomalyDetection aims to curtail the time frame in which anomalous cyber activities go unnoticed and to aid in the efficient discovery of these anomalous transactions among the millions of daily logged events by i) providing an efficient means for pre-processing and aggregating cyber data for analysis by employing a tabular vector transformation and handling multicollinearity concerns; ii) offering numerous built-in multivariate statistical functions, such as Mahalanobis Distance, Factor Analysis, and Principal Components Analysis, to identify anomalous activity; and iii) incorporating the pipe operator (%>%) to allow it to work well in the tidyverse workflow. Combined, anomalyDetection offers cyber analysts an efficient and simplified approach to break up network events into time-segment blocks and identify periods associated with suspected anomalies for further evaluation.
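
The two core ideas in the abstract, tabulating logged events into fixed-size blocks and scoring each block with a multivariate distance, can be illustrated with a minimal, language-neutral sketch. It is not the anomalyDetection API: the function names `tabulate_blocks` and `mahalanobis_2d`, the toy protocol log, and the block length of 4 are all assumptions made for illustration. Because the per-block counts always sum to the block length, one column is redundant; dropping it here stands in for the multicollinearity handling the abstract mentions.

```python
from collections import Counter

def tabulate_blocks(events, block_length):
    """Count each category within consecutive blocks of `block_length` events."""
    levels = sorted(set(events))
    rows = []
    for start in range(0, len(events), block_length):
        counts = Counter(events[start:start + block_length])
        rows.append([counts.get(level, 0) for level in levels])
    return levels, rows

def mahalanobis_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^T S^{-1} d using the closed-form inverse of a 2x2 matrix
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

# Hypothetical log: one protocol label per event, six blocks of four events.
events = [
    "tcp", "tcp", "udp", "icmp",
    "tcp", "udp", "udp", "tcp",
    "tcp", "tcp", "tcp", "udp",
    "icmp", "icmp", "icmp", "icmp",   # unusual block: only icmp traffic
    "udp", "tcp", "tcp", "udp",
    "tcp", "udp", "icmp", "tcp",
]

levels, rows = tabulate_blocks(events, block_length=4)
# Drop the last column: counts per block sum to 4, so it is linearly dependent.
reduced = [row[:2] for row in rows]
distances = mahalanobis_2d(reduced)
most_anomalous = max(range(len(distances)), key=distances.__getitem__)
print(levels, rows, most_anomalous)
```

In this toy input the all-icmp block (index 3) receives the largest distance, which is the kind of time segment an analyst would then inspect further.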

Received: 2017-05-05; online 2017-08-04
CRAN packages: anomalyDetection, magrittr, tidyverse
CRAN Task Views implied by cited CRAN packages: WebTechnologies


CC BY 4.0
This article is licensed under a Creative Commons Attribution 4.0 International license.

@article{RJ-2017-039,
  author = {Robert J. Gutierrez and Bradley C. Boehmke and Kenneth W.
          Bauer and Cade M. Saie and Trevor J. Bihl},
  title = {{anomalyDetection: Implementation of Augmented Network Log
          Anomaly Detection Procedures}},
  year = {2017},
  journal = {{The R Journal}},
  url = {https://journal.r-project.org/archive/2017/RJ-2017-039/index.html}
}