(i) Opprentice approaches the above problem through supervised machine learning.
(ii) The features of the data are the outputs of the detectors (the basic detectors compute the features).
(iii) The labels of the data come from operators’ experience (operators label the data manually).
(iv) Addressing Challenges in Machine Learning:
(1) Label Overhead (the cost of obtaining labels): Opprentice provides a dedicated labeling tool with a simple, convenient interface.
(2) Incomplete Anomaly Cases (information about anomaly cases is incomplete).
(3) Class Imbalance Problem (positive and negative samples are imbalanced).
(4) Irrelevant and Redundant Features.
4. Opprentice’s Design:
Architecture:
Operators label the data, and numerous detector functions serve as feature extractors for the data.
Label Tool:
Operators label the data manually, using the mouse in the labeling software.
Detectors:
(i) Detectors As Feature Extractors: (detectors are used to extract features)
For each detector that takes parameters, we sample its parameters to obtain several fixed detectors; a detector with one specific set of sampled parameters is called a (detector) configuration. A configuration thus acts as a feature extractor:
data point + configuration (detector + sampled parameters) -> feature
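The mapping above can be sketched in Python as follows. This is only an illustrative sketch, not Opprentice's actual implementation: the names `Configuration`, `diff_detector`, `build_configurations`, and `feature_vector` are hypothetical, the detector is a simplified version of the "Diff" detector described later in these notes, and treating its lag as the sampled parameter is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# A detector maps (series, index, parameters) to a severity score.
Detector = Callable[[Sequence[float], int, Dict[str, int]], float]


@dataclass
class Configuration:
    """A detector bound to one sampled parameter set (hypothetical structure)."""
    name: str
    detector: Detector
    params: Dict[str, int]

    def extract(self, series: Sequence[float], t: int) -> float:
        # data point + configuration -> one feature value
        return self.detector(series, t, self.params)


def diff_detector(series: Sequence[float], t: int, params: Dict[str, int]) -> float:
    """Simplified Diff: absolute difference to the point `lag` slots earlier."""
    lag = params["lag"]
    if t < lag:
        return 0.0  # assumption: no history yet means zero severity
    return abs(series[t] - series[t - lag])


def build_configurations(lags: Sequence[int]) -> List[Configuration]:
    # One configuration per sampled parameter value.
    return [Configuration(f"diff_lag{lag}", diff_detector, {"lag": lag}) for lag in lags]


def feature_vector(series: Sequence[float], t: int,
                   configs: Sequence[Configuration]) -> List[float]:
    # Each configuration contributes exactly one feature for the data point at t.
    return [c.extract(series, t) for c in configs]


series = [10.0, 11.0, 10.0, 30.0, 11.0]
configs = build_configurations([1, 2])
print(feature_vector(series, 3, configs))  # [20.0, 19.0]
```

With more detectors and more sampled parameter values, each data point gets one feature per configuration, and these feature vectors together with the operators' labels form the training set for the supervised learner.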
(ii) Choosing Detectors: (detector selection; 14 fairly common detectors are currently used)
Opprentice can find suitable detectors among a broad selection and still achieve relatively high accuracy. It implements 14 widely-used detectors:
“Diff”: it simply measures anomaly severity as the difference between the current point and the point of the last slot, the point of the last day, or the point of the last week.
“
MA of diff