Characterization of threats in IoT from an MQTT protocol-oriented dataset


Bibliographic Details
Authors: Muñoz Castañeda, Ángel Luis; Aveleira Mata, José Antonio; Alaiz Moretón, Héctor
Resource type: article
Status: Published version
Publication date: 2023
Country: Spain
Institution: Universidad Rey Juan Carlos
Repository: BULERIA. Repositorio Institucional de la Universidad de León
OAI Identifier: oai:buleria.unileon.es:10612/19281
Online access: https://link.springer.com/article/10.1007/s40747-023-01000-y
https://hdl.handle.net/10612/19281
Access level: open access
Keywords: Computer science
Systems engineering
IoT
MQTT
Machine learning
Feature selection
1203.12 Data Banks
1203.17 Computer Science
1203.04 Artificial Intelligence
Description
Summary: [EN] The cybersecurity of Internet of Things (IoT) environments is currently a major challenge. The analysis of network traffic and the use of automated estimators built with machine learning techniques have proved useful for detecting intrusions in traditional networks. Because IoT networks require new, specific protocols to control communication between the devices involved, the knowledge acquired from studying general networks may not always carry over. The goal of this paper is twofold. On the one hand, we aim to obtain a consistent dataset of the network traffic of an IoT system based on the Message Queue Telemetry Transport (MQTT) protocol while it undergoes certain types of attacks. On the other hand, we want to characterize each of these attacks in terms of the minimum possible number of significant variables allowed by this protocol. The dataset was obtained by studying the MQTT protocol in depth, while the characterization was addressed through a hybrid (filter/wrapper) feature selection algorithm based on the idea behind the minimum-redundancy maximum-relevance (mRMR) algorithm. The dataset, together with the feature selection algorithm, yields a characterization of the different attacks that is optimal in terms of the accuracy of the machine learning models trained on it as well as in terms of its capability to explain their underlying nature. This confirms the consistency of the dataset.
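
The mRMR idea named in the summary can be illustrated with a minimal sketch. This is not the authors' hybrid filter/wrapper algorithm, only a plain greedy filter step on discrete data: at each round it picks the feature whose mutual information with the label (relevance) most exceeds its mean mutual information with the features already chosen (redundancy). The feature names and toy values below are hypothetical and do not come from the paper's dataset.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedily select k feature names by relevance minus mean redundancy."""
    relevance = {
        name: mutual_information(col, labels) for name, col in features.items()
    }
    selected = []
    remaining = set(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: the label is determined by two independent binary
# features, and a third feature is an exact duplicate of one of them.
features = {
    "tcp_flag": [0, 0, 0, 0, 1, 1, 1, 1],
    "tcp_flag_dup": [0, 0, 0, 0, 1, 1, 1, 1],
    "mqtt_msgtype": [0, 0, 1, 1, 0, 0, 1, 1],
}
labels = [0, 0, 1, 1, 2, 2, 3, 3]

selected = mrmr_select(features, labels, k=2)
print(selected)
```

In this toy case all three features are equally relevant on their own, but the duplicate is fully redundant with its original, so the redundancy penalty keeps it out of the selected pair. A wrapper stage, as in a hybrid scheme, would additionally score candidate subsets by training a model on them.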