Design and implementation of a multi-model platform for anomaly detection

Abstract

Anomaly detection, which has become one of the main objectives of companies, relies on
artificial intelligence, whose proactive approaches make it possible to identify abnormal
behavior in collected data. Such datasets contain a majority (normal) class; the remaining
data, which do not share the characteristics of the normal data, are reported as anomalies.
Existing proactive approaches generally rely on labeled data, and their performance depends
mainly on the labeling effort.

Depending on the sector studied, anomalies can correspond to problems such as structural
defects, errors, intrusions or fraud.

In this context, this report aims to design and implement a proactive solution for
unsupervised anomaly detection for the benefit of the OCP Group, through the analysis and
processing of numerous, diverse and unlabeled data.

We first carried out a study and a needs analysis to define the objectives precisely and so
meet the expectations for the project. We then drew on the literature to explore the
different methods of solving the problem, namely Machine Learning algorithms, GAN-based
methods and autoencoder-based methods.

Then, after developing the Machine Learning models (OC-SVM, LOF, Isolation Forest and
KMeans), the GAAL methods and the DAGMM model, we carried out a comparative study to
identify the most efficient models. These models can then be applied to the actual data of
OCP Group entities to detect future malicious threats against their information assets.
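
To make the idea of an unsupervised anomaly score concrete, the following is a minimal sketch of the Local Outlier Factor (LOF), one of the Machine Learning models compared above, written in plain Python. The function names (`knn`, `lof_scores`), the neighborhood size `k` and the toy points are assumptions chosen for illustration; they are not the implementation, settings or data used in this report.

```python
import math


def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]


def lof_scores(points, k=3):
    """Compute a LOF score per point; scores well above 1 suggest an outlier.

    Assumes there are more than k points and no exact duplicates
    (duplicates could make a reachability sum equal to zero).
    """
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neigh]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(p, o) = max(k_dist(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


# Toy data (hypothetical): five clustered points and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0)]
scores = lof_scores(points, k=3)
for p, s in zip(points, scores):
    print(p, round(s, 3))
```

On this toy set the isolated point receives the largest score, while the clustered points stay close to 1. No labels are needed to compute the scores, which is what makes the approach unsupervised.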

Finally, we designed and implemented a configurable platform integrating the most efficient
models, namely the GAAL methods (MOGAAL and SOGAAL) and DAGMM, which reach an accuracy of
0.9 on certain datasets. With this platform, we can visualize our data, perform descriptive
analyses and apply proactive approaches to unsupervised anomaly detection.

Keywords: Anomalies, Deep learning, Generative adversarial networks, Active learning,
Autoencoders.