
Intelligent cyber-security data lake: Building a system for centralizing and monitoring heterogeneous logs and detecting anomalies based on deep learning models

Engineer: Youssra SAADEDDINE
Organisation: Atlas Cloud Services
Language: French
Promotion: 2021
Year: 3

Abstract

In an IT infrastructure, applications, network devices, operating systems and any programmable or intelligent device generate thousands of logs daily. Analyzing these logs makes it possible to detect malicious attacks, intruders and security vulnerabilities.

This report summarizes our work, which aimed to build a cybersecurity data lake, based on the ELK Stack, that can centralize and manage large volumes of logs in different formats.

We used Filebeat and Logstash for log collection and preprocessing, Elasticsearch for log indexing and storage, and Kibana for descriptive analysis and dashboard creation. We also generated security alerts using Logstash.
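The preprocessing and alerting steps described above can be illustrated with a minimal sketch in plain Python. It is not the report's Logstash configuration: the two regular expressions, the field names of the common schema, the sample log lines and the alert threshold are all assumptions chosen for illustration. It only shows the general idea of mapping heterogeneous log lines to one JSON-like schema and raising an alert from the normalized events.

```python
import json
import re
from collections import Counter

# Hypothetical patterns for two log formats (assumption: the real pipeline
# would express these as Logstash filters rather than Python regexes).
SSH_FAILED = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) sshd\[\d+\]: "
    r"Failed password for (?:invalid user )?(?P<user>\S+) from (?P<ip>\S+)"
)
ACCESS = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)


def normalize(line):
    """Map a raw log line to a common schema, or None if no pattern matches."""
    m = SSH_FAILED.match(line)
    if m:
        return {"source": "sshd", "event": "login_failed",
                "ip": m["ip"], "user": m["user"], "timestamp": m["ts"]}
    m = ACCESS.match(line)
    if m:
        return {"source": "web", "event": "http_request",
                "ip": m["ip"], "path": m["path"],
                "status": int(m["status"]), "timestamp": m["ts"]}
    return None


def failed_login_alerts(events, threshold=3):
    """Return the sorted IPs with at least `threshold` failed logins."""
    counts = Counter(e["ip"] for e in events if e["event"] == "login_failed")
    return sorted(ip for ip, n in counts.items() if n >= threshold)


# Made-up sample lines (documentation IP ranges), including one unparseable line.
SAMPLE_LINES = [
    "Mar  3 10:15:01 web01 sshd[2211]: Failed password for root from 203.0.113.7 port 52114 ssh2",
    "Mar  3 10:15:03 web01 sshd[2211]: Failed password for invalid user admin from 203.0.113.7 port 52115 ssh2",
    "Mar  3 10:15:05 web01 sshd[2211]: Failed password for root from 203.0.113.7 port 52116 ssh2",
    '198.51.100.4 - - [03/Mar/2021:10:15:02 +0000] "GET /index.html HTTP/1.1" 200 512',
    "this line matches no known format",
]

events = [e for e in map(normalize, SAMPLE_LINES) if e is not None]
alerts = failed_login_alerts(events)

for event in events:
    print(json.dumps(event))  # one JSON document per event, ready for indexing
print("alert IPs:", alerts)
```

In an ELK deployment the normalized documents would be indexed in Elasticsearch and the threshold rule would be defined in the pipeline itself; the sketch keeps everything in memory so that it runs without any external service.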

For anomaly detection, we proposed a model named «AElog», which is based on deep learning techniques, more precisely transformers and CNNs. The results show the reliability of our model, whose predictions reach performance levels above 99%.

Keywords: log management, log analysis, data lake, ELK Stack, deep learning, NLP, anomaly detection, BERT, CNN