At IFL 2017 in Bristol, UK, I presented our work-in-progress paper “Purely Functional Federated Learning in Erlang.” An update to this work will be presented as a poster at the upcoming 5th Swedish Workshop on Data Science (SweDS 2017), which will take place from 12 to 13 December 2017 in Gothenburg, Sweden. It is hosted by the University of Gothenburg. The abstract is reproduced below.
Functional Federated Learning in Erlang
Authors: G. Ulm, E. Gustavsson, and M. Jirstrand
A modern connected car produces gigabytes to terabytes of data per day. Collecting data generated by an entire fleet of cars, and processing it centrally on a server farm, is thus not feasible. The problem is that the total amount of data generated by cars, i.e. on edge devices, is too large to be efficiently transmitted to a central server. However, CPUs used in edge devices such as connected cars but also regular smartphones that connect to the cloud, have been getting more and more powerful in recent years. Tapping into this computational resource is one way of addressing the problem of processing big data that is generated by large numbers of edge devices.
One such approach consists of distributed data processing. Using the example of training an Artificial Neural Network, we introduce a framework for distributed data processing. A particular focus is on the implementation language Erlang. Arguably the biggest strength of the functional programming language Erlang is how straightforward it is to implement concurrent and distributed programs with it. Numerical computing, on the other hand, is not necessarily seen as one of its strengths.
The recent introduction of Federated Learning, a concept according to which edge devices are leveraged for decentralized machine learning tasks, while a central server only updates and distributes a global model, provides the motivation for exploring how well Erlang is suited to such a use case. We present a framework for Federated Learning in Erlang, written in a purely functional style. Erlang is used for coordinating data processing tasks but also for performing numerical computations. Initial results show that Erlang is well-suited for that kind of task.
We provide an overview of the general framework and also discuss an existing and fully realized in-house prototypical implementation that performs distributed machine learning tasks according to the Federated Learning paradigm. While we focus on Artificial Neural Networks, our Federated Learning framework is of a more general nature and could also be used with other machine learning algorithms.
The novelty of our work is that we present the first publicly available implementation of a Federated Learning framework; our work is also the first implementation of Federated Learning in a functional programming language, with the added benefit of being purely functional. In addition, we demonstrate that Erlang can not only be leveraged for message passing but that it also performs adequately for practical machine learning tasks.
Our presentation is based on our work-in-progress paper “Purely Functional Federated Learning in Erlang”, which we presented at IFL 2017. The context of this research is our ongoing involvement in the Vinnova-funded project "On-board/off-board distributed data analysis" (OODIDA), which is a joint-project between the Fraunhofer-Chalmers Research Centre for Industrial Mathematics, Chalmers University of Technology, Volvo Car Corporation, Volvo Trucks, and Alkit Communications.