Metadata Causal Inference of Concept Drifts in Probabilistic-Relational Machine Learning: Predictive Analytics Using Graph Theory Methods in Cyber-Defense

Originally published on 2023-03-26 under a CC BY 4.0 license

Authors

Karriem Perry

Summary

This research describes a novel approach to implementing cyber-defense protocols that uses supervised statistical-relational machine learning and lifted inference in neural network architectures to detect concept drift anomalies. Data concept drift, also called concept shift, is a common yet difficult abnormality to detect in most data, particularly in metadata from adversarial cyber-attack data originating in non-stationary environments, as detailed in the introduction. The Material and Methods and the Related Works sections describe the manuscript’s foundation in statistical-relational machine learning, otherwise known as relational machine learning, and its intrinsic suitability for identifying variables that may contain attributes, objects, entities, and the like that contribute to concept drift irregularities during Exploratory Data Analysis (EDA) or Graphical Data Analysis (GDA) when detecting the probability of Structured Query Language Injection Attacks (SQLIA). Moreover, balancing the tradeoffs involved in developing Directed Acyclic Graphs (DAGs), Bayesian Networks (Bnets), and other network illustrations demands increased computational time complexity. Lifted inference and prior probabilities are then introduced through the reproducible Analytics Solutions Unified Method for Data Mining/Predictive Analytics (ASUM-DM) framework, where handcrafting is efficacious and useful. The use of ASUM-DM also improves probable concept drift anomaly detection, predictive cyber-defense applications in chaotic systems, and projected prior dependencies from lifted inference for data types originating from non-stationary environments.
In later portions of this research, we describe how this approach allows for more efficient neural network design while enabling more consistent predictive analytics methods. It enhances proactive cyber-defense capabilities by using modified activation functions, amenable to a range of neural network architectures, that reflect the output of the Bayesian Information Criterion (BIC). Preemptively observing the BIC value during neural network epochs identifies potential computational intractability and significantly reduces energy expenditure. We finalize hypothesis testing, neural network parameterization, and fine-tuning, and develop a comprehensive evaluation and description of the findings, conclusion, and recommended areas of future research.

Main file

Metadata Concept Shifts in Relational Machine Learning_03.26.23.docx