Design and implementation of a multi-model platform for anomaly detection
Abstract
Anomaly detection has become one of the main objectives of companies. It relies on
artificial intelligence, whose proactive approaches make it possible to identify abnormal
behavior within the pool of collected data. These datasets contain a majority (normal)
class; the rest of the data, which does not share the characteristics of normal data, is
reported as anomalous. Existing proactive approaches are generally based on labeled data,
and their performance depends mainly on the labeling effort.
Depending on the sector studied, anomalies can manifest as problems such as structural
defects, errors, intrusions or fraud.
In this context, this report aims to design and implement a proactive solution for the
unsupervised detection of anomalies for the benefit of the OCP Group, through the analysis
and processing of numerous, diverse and unlabeled data.
We first carried out a study and a needs analysis to define the objectives precisely and
thus meet the expectations of the project. We then drew on the existing literature to
explore the different methods of solving the problem, namely Machine Learning algorithms,
GAN-based methods and autoencoder-based methods.
Next, after developing the Machine Learning models (OC-SVM, LOF, Isolation Forest and
KMeans), the GAAL methods and the DAGMM model, we carried out a comparative study to
identify the best-performing models. These models remain applicable to the actual data of
OCP Group entities and make it possible to detect future malicious threats targeting
information assets.
Finally, we designed and implemented a configurable platform in which we integrated the
best-performing models, namely the GAAL methods (MOGAAL and SOGAAL) and DAGMM, which reach
an accuracy of 0.9 on certain datasets. With this platform, we are able to visualize our
data, perform descriptive analyses and apply proactive approaches to unsupervised anomaly
detection.
Keywords: Anomalies, Deep learning, Generative adversarial networks, Active learning,
Autoencoders.