Developing and Validating Statistical Cyber Defenses
The development and validation of advanced cyber security technology frequently relies on data that captures normal and suspicious activities at various system layers.
Enterprise business processes are more connected than ever before, driven by the ability to share the right information with the right partners at the right time. While this interconnectedness and situational awareness are crucial to success, they also allow sophisticated adversaries to misuse the same capabilities to spread attacks and corrupt critical, sensitive information. This is particularly true in insider threat scenarios, in which adversaries have legitimate access to some resources and unauthorized access to others that is not directly controlled by a fine-grained policy.
The figure shows a high-level diagram of the processing flow in Behavior-Based Access Control (BBAC), together with the various data sets involved. As shown at the bottom, BBAC needs to ingest a large variety of data from real-time feeds through a feature extraction process. During online use, this data is used for classification. After parsing the raw observables, BBAC enters a feature enrichment phase, in which aggregate statistics are computed and information from multiple feeds is merged into a consistent representation. At this stage, BBAC needs to manage the intermediate state required for more complex enrichment functions, e.g., calculating the periodicity of events.
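The enrichment step described above can be illustrated with a minimal sketch. It is not BBAC's implementation: the class name, the per-host grouping, and the choice of the coefficient of variation of inter-arrival gaps as a periodicity measure are all assumptions made for illustration. It shows how intermediate state (timestamps seen so far) is kept between events so that an aggregate such as periodicity can be computed later.

```python
from collections import defaultdict
from statistics import mean, pstdev


class PeriodicityEnricher:
    """Hypothetical enrichment function: keeps per-host event timestamps
    as intermediate state and derives simple aggregate features."""

    def __init__(self):
        # Intermediate state: every timestamp observed so far, per host.
        self._timestamps = defaultdict(list)

    def observe(self, host, timestamp):
        """Record one parsed event (timestamp in seconds) for a host."""
        self._timestamps[host].append(timestamp)

    def features(self, host):
        """Return event count, mean gap, and gap variability for a host.

        gap_cv is the coefficient of variation of inter-arrival gaps
        (assumed measure): values near 0 suggest regular, periodic events.
        """
        ts = sorted(self._timestamps[host])
        gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
        if len(gaps) < 2:
            # Too few events to say anything about periodicity.
            return {"count": len(ts), "mean_gap": None, "gap_cv": None}
        mean_gap = mean(gaps)
        gap_cv = pstdev(gaps) / mean_gap if mean_gap > 0 else None
        return {"count": len(ts), "mean_gap": mean_gap, "gap_cv": gap_cv}


enricher = PeriodicityEnricher()
for t in (0, 10, 20, 30):      # evenly spaced events
    enricher.observe("host-a", t)
for t in (0, 1, 50, 52):       # bursty, irregular events
    enricher.observe("host-b", t)

regular = enricher.features("host-a")
irregular = enricher.features("host-b")
```

In this sketch the evenly spaced host gets a variability score of zero, while the bursty host scores well above it. A production system would bound the stored state, for example with a sliding window, rather than keep every timestamp.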
BBAC is a data-intensive system whose successful execution hinges on (a) access to a large amount of external data, and (b) efficient management of internal data. Specifically, meaningful data sets are needed to develop and validate the accuracy, precision, and latency overhead of the BBAC algorithms and prototypes. BBAC’s analysis techniques work best with data that has a rich context and feature space, and statistical inference requires a large amount of granular data. Getting access to more granular information generally means installing software on end systems or even recompiling applications (to map memory regions, etc.), both of which raise practical concerns. To address these granularity issues, BBAC focuses its analysis on data that is easily observable without installing new software or modifying end systems.
Since BBAC performs analysis at multiple system layers, it not only needs access to data from sensors at these layers, but the data in each layer must also be linked to the other layers to present a consistent picture of observables. To address the problem of independence between data sets, BBAC uses an approach for injecting malicious URLs into the request streams of benign hosts. Known bad HTTP requests are retrieved from blacklists and intelligently inserted into existing connection patterns. It is important to keep the ratio of normal to abnormal traffic roughly equal, so that the resulting classifier can make decisions based on both known proper behavior and known improper behavior.
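The injection approach can be sketched as follows, under stated assumptions. The function name, the label encoding (0 for benign, 1 for malicious), and the placeholder URLs are hypothetical, and uniform random placement stands in for the more intelligent insertion into connection patterns that the text describes. The sketch only shows how a benign request stream and a blacklist can be combined into a labeled data set with a roughly equal class ratio.

```python
import random


def inject_malicious(benign_urls, blacklist, rng, ratio=1.0):
    """Return (url, label) pairs with blacklisted URLs inserted into a
    benign host's request stream. Label 0 = benign, 1 = malicious.

    ratio is the number of injected requests per benign request;
    1.0 keeps the two classes equal in size.
    """
    n_malicious = round(len(benign_urls) * ratio)
    if n_malicious > 0 and not blacklist:
        raise ValueError("blacklist must not be empty")

    labeled = [(url, 0) for url in benign_urls]
    for _ in range(n_malicious):
        # Simplification: uniform random position, which preserves the
        # relative order of the benign requests.
        position = rng.randrange(len(labeled) + 1)
        labeled.insert(position, (rng.choice(blacklist), 1))
    return labeled


# Placeholder data for illustration only.
benign = [
    "http://intranet.example.com/home",
    "http://intranet.example.com/mail",
    "http://intranet.example.com/wiki",
    "http://intranet.example.com/docs",
]
blacklist = ["http://bad.example.net/payload", "http://bad.example.org/c2"]

stream = inject_malicious(benign, blacklist, random.Random(0))
```

Passing a seeded `random.Random` instance makes the generated data set reproducible, which helps when comparing classifiers across runs.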
The development and validation of statistical cyber defenses require well-labeled, appropriately sized, and readily available sets of relevant data to make innovative progress, yet too few such data sets are available today. Agile project management techniques help deliver innovative technology in a difficult-to-work-in, data-intensive environment.
This work was done by Michael Jay Mayhew of the Air Force Research Laboratory, Michael Atighetchi and Aaron Adler of Raytheon BBN Technologies, and Rachel Greenstadt of Drexel University. AFRL-0231