The cost of training machine learning models over distributed data sources

Bibliographic Details
Authors: Guerra, Elia; Wilhelmi Roca, Francesc; Miozzo, Marco; Dini, Paolo
Resource type: article
Status: Published version
Publication date: 2023
Country: Spain
Institution: Various* (Consorci de Biblioteques Universitàries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya)
Repository: Recercat. Dipòsit de la Recerca de Catalunya
OAI Identifier:oai:dnet:recercat____::902691ad2f5720ab584811d983a0d6b8
Online access: https://hdl.handle.net/10230/73074
http://dx.doi.org/10.1109/OJCOMS.2023.3274394
Access level: open access
Keywords: Blockchain
Decentralized learning
Edge computing
Energy efficiency
Federated learning
Machine learning
Description
Summary: Federated learning is one of the most appealing alternatives to the standard centralized learning paradigm, allowing a heterogeneous set of devices to train a machine learning model without sharing their raw data. However, it requires a central server to coordinate the learning process, thus introducing potential scalability and security issues. In the literature, server-less federated learning approaches such as gossip federated learning and blockchain-enabled federated learning have been proposed to mitigate these issues. In this work, we provide a complete overview of these three techniques and compare them according to an integral set of performance indicators, including model accuracy, time complexity, communication overhead, convergence time, and energy consumption. An extensive simulation campaign allows us to draw a quantitative analysis considering both feedforward and convolutional neural network models. Results show that gossip federated learning and the standard federated solution reach a similar level of accuracy, and that their energy consumption is influenced by the machine learning model adopted, the software library, and the hardware used. In contrast, blockchain-enabled federated learning represents a viable solution for implementing decentralized learning with a higher level of security, at the cost of extra energy usage and data sharing. Finally, we identify open issues in the two decentralized federated learning implementations and provide insights on potential extensions and possible research directions in this new research field.
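
To make the contrast between server-based and server-less aggregation in the summary concrete, the following is a minimal sketch and not the authors' implementation. It assumes each device's model is a flat list of floats. The names `fedavg` and `gossip_round` are hypothetical, and the pairwise-averaging rule is a generic gossip scheme chosen for illustration; the article's gossip protocol may differ.

```python
import random


def fedavg(models, sizes):
    """Server-side aggregation: average client parameter vectors,
    weighting each client by its number of local samples."""
    total = sum(sizes)
    dim = len(models[0])
    return [
        sum(m[i] * n for m, n in zip(models, sizes)) / total
        for i in range(dim)
    ]


def gossip_round(models, rng):
    """Server-less aggregation (illustrative): each node in turn picks one
    random peer, and both replace their parameters with the pair's average.
    Pairwise averaging keeps the network-wide mean unchanged."""
    n = len(models)
    for i in range(n):
        j = rng.choice([k for k in range(n) if k != i])
        merged = [(a + b) / 2 for a, b in zip(models[i], models[j])]
        models[i] = merged
        models[j] = list(merged)
    return models


if __name__ == "__main__":
    # Central server: one weighted average over all clients.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], sizes=[1, 3]))

    # Gossip: repeated local exchanges pull every node toward the mean.
    rng = random.Random(0)
    nodes = [[0.0], [4.0], [8.0], [12.0]]
    for _ in range(50):
        gossip_round(nodes, rng)
    print(nodes)
```

In this sketch the server computes the aggregate in a single step, while the gossip variant needs several rounds of peer-to-peer exchanges before the nodes agree. This is one simplified way to see why the summary lists convergence time and communication overhead among the indicators being compared.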