On the Development of Robust Anomaly Detection Algorithms with Limited Labeled Data
Anomaly detectors are used in many multivariate applications to identify data that do not belong within a dataset. A significant problem in the development of anomaly detection algorithms is the limited availability of truth-labeled data, which can leave algorithms unable to generalize to real-world operating scenarios. This research focuses on the sensitivity of anomaly detection algorithms to the anomalies present. First, a method is developed that gradually modifies the multivariate characteristics of the anomalies so that they approach the structure of the background. Algorithms are then tested with the expectation that performance will degrade gradually as the anomalies become more difficult to differentiate from the background, which allows algorithms to be compared using both graphical and numerical techniques. The anomaly adaptation approach is extended into a robust parameter design framework to identify algorithm parameter settings that account for the many possible anomalies present in extended operating scenarios. Additionally, a novel method for selecting anomaly thresholds from possibly anomaly-contaminated datasets is presented. This method is shown to have superior performance on both simulated and real-world data. The method is also used to create new versions of a number of popular anomaly detectors for hyperspectral images.
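The anomaly adaptation idea can be illustrated with a minimal sketch. It is not the method developed in this research. It assumes Gaussian background and anomaly classes, a linear blend of the anomaly mean and covariance toward the background controlled by a hypothetical parameter alpha, and a Mahalanobis-distance detector scored by AUC. All function names, distributions, and parameter values are illustrative assumptions.

```python
import numpy as np


def adapt_anomaly_params(mu_a, cov_a, mu_b, cov_b, alpha):
    """Blend anomaly parameters toward the background.

    alpha = 0 keeps the original anomaly; alpha = 1 makes it match the
    background. The linear blend is an assumption made for illustration.
    """
    mu = (1.0 - alpha) * mu_a + alpha * mu_b
    cov = (1.0 - alpha) * cov_a + alpha * cov_b  # convex mix stays PSD
    return mu, cov


def mahalanobis_scores(x, mu, cov):
    """Squared Mahalanobis distance of each row of x from (mu, cov)."""
    diff = x - mu
    solved = np.linalg.solve(cov, diff.T).T
    return np.einsum("ij,ij->i", diff, solved)


def auc(scores_bg, scores_an):
    """AUC from the Mann-Whitney rank statistic (continuous scores, no ties)."""
    scores = np.concatenate([scores_bg, scores_an])
    ranks = scores.argsort().argsort() + 1  # 1-based ranks
    n_bg, n_an = len(scores_bg), len(scores_an)
    rank_sum = ranks[n_bg:].sum()
    return (rank_sum - n_an * (n_an + 1) / 2) / (n_bg * n_an)


def degradation_curve(alphas, n_bg=2000, n_an=200, seed=0):
    """Return (alpha, AUC) pairs as anomalies move toward the background."""
    rng = np.random.default_rng(seed)
    # Hypothetical class parameters chosen only for this illustration.
    mu_b, cov_b = np.zeros(3), np.eye(3)
    mu_a, cov_a = np.full(3, 4.0), 0.25 * np.eye(3)

    background = rng.multivariate_normal(mu_b, cov_b, size=n_bg)
    mu_hat = background.mean(axis=0)
    cov_hat = np.cov(background, rowvar=False)
    s_bg = mahalanobis_scores(background, mu_hat, cov_hat)

    results = []
    for alpha in alphas:
        mu, cov = adapt_anomaly_params(mu_a, cov_a, mu_b, cov_b, alpha)
        anomalies = rng.multivariate_normal(mu, cov, size=n_an)
        s_an = mahalanobis_scores(anomalies, mu_hat, cov_hat)
        results.append((alpha, auc(s_bg, s_an)))
    return results


curve = degradation_curve([0.0, 0.5, 1.0])
```

Plotting the AUC values in curve against alpha for several detectors would give one graphical comparison of how quickly each degrades. The area under such a curve is one possible numerical summary.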