Description
An anomaly is a deviation from normal behavior. Anomaly detection techniques are used to detect unusual patterns in data.
These patterns deviate from the spectrum of normal behaviors in the data, and they typically represent critical events
that occurred in the monitored system. For example, in cyber security, anomaly detection can be used to identify
sophisticated and targeted attacks such as Advanced Persistent Threats (APTs), which standard security systems often fail to
detect.
Our talk at CyberSec 2012: Anomaly Detection: the hunt for the holy grail
The anomaly detection problem, in its most general form, is not easy to solve. In fact, most of the existing anomaly
detection techniques solve a specific formulation (instance) of the problem.
The formulation is induced by various factors such as the nature of the data, availability of labeled data, type of
anomalies to be detected, etc. Often, the application domain in which the anomalies have to be detected determines
these factors.
Beyond the challenge of detecting anomalies in a dataset, the analyzed data is often high-dimensional,
which makes it more difficult to analyze and interpret. For example, email traffic can be represented by thousands of
textual and numeric features.
Organizations in all sectors (military, intelligence, governmental, industrial, etc.) can benefit from anomaly detection.
Data collected and stored in databases and warehouses represents real-world processes.
Anomalies and outliers that exist in those real-world processes are captured in the collected data.
Applying an appropriate technique to detect these anomalies can lead to new knowledge about the
data and hence about the real-world process.
Anomaly detection is applicable in a variety of domains, such as intrusion detection, fraud detection, fault detection, system health monitoring, event detection in sensor networks, and detecting ecosystem disturbances.
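As a rough illustration of the idea of "deviation from normal behavior" described above, the following minimal sketch flags values that lie far from the bulk of a small one-dimensional dataset. It uses a modified z-score based on the median and the median absolute deviation (MAD), which is only one simple statistical approach among many. The function name `mad_anomalies`, the threshold of 3.5, and the sample login counts are illustrative assumptions, not material from the course.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds `threshold`.

    The modified z-score uses the median and the median absolute deviation
    (MAD) instead of the mean and standard deviation, so a single extreme
    value does not distort the notion of "normal".
    """
    med = median(values)
    deviations = [abs(v - med) for v in values]
    mad = median(deviations)
    if mad == 0:
        # No measurable spread: this sketch reports nothing in that case.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    # for normally distributed data.
    return [i for i, d in enumerate(deviations) if 0.6745 * d / mad > threshold]


# Hypothetical hourly login counts; the spike at index 5 is a planted anomaly.
logins = [12, 15, 11, 14, 13, 97, 12, 16]
print(mad_anomalies(logins))  # prints [5]
```

Real datasets are usually high-dimensional and rarely this clean, which is why the choice of technique depends on the problem formulation.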
Course Structure
In the first part of this course, we will introduce the concept of anomalies, provide motivation for anomaly detection, and explore several real-world use cases. We will give an overview of different data types, including high-dimensional data, and present methods for preprocessing data. We will explore the different categories of anomaly detection and the different types of anomalies. We will conclude this part with methods for evaluating anomaly detection techniques.
In the second part of this course, we will survey different techniques for anomaly detection, including classification-based, nearest-neighbor-based, clustering-based, statistical, spectral, and visualization-based techniques, among others. For each technique, we will cover the theory behind it, its different categories, and its pros and cons, followed by demos and practice.
In the third part of the course, we will introduce and explore applications of anomaly detection, such as cyber security, fraud detection, performance monitoring, medical diagnostics, and more.