ML-TN-007 — AI at the edge: exploring Federated Learning solutions

Applies to: Machine Learning



History

Version | Date | Notes
1.0.0 | August 2023 | First public release

Introduction

According to Wikipedia, Federated Learning (FL) is defined as a machine learning technique that trains an algorithm via multiple independent sessions, each using its own dataset. This approach stands in contrast to traditional centralized machine learning techniques where local datasets are merged into one training session, as well as to approaches that assume that local data samples are identically distributed.

Federated learning enables multiple actors to build a common, robust machine learning model without sharing data, thus addressing critical issues such as data privacy, data security, data access rights and access to heterogeneous data. Its applications engage industries including defense, telecommunications, Internet of Things, and pharmaceuticals. A major open question is when/whether federated learning is preferable to pooled data learning. Another open question concerns the trustworthiness of the devices and the impact of malicious actors on the learned model.
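
To make the training scheme more concrete, the following is a minimal, framework-agnostic sketch of federated averaging (FedAvg), the aggregation strategy that most FL frameworks implement in some form. The toy model, the fake local training step, and all names are illustrative assumptions and do not belong to any of the frameworks discussed below.

import numpy as np

def local_training(weights, local_dataset, lr=0.01):
    """Stand-in for one client's local training session.

    In a real deployment this would run a few epochs of SGD on the
    device's own data; here it only perturbs the weights so that the
    sketch stays self-contained and runnable.
    """
    rng = np.random.default_rng()
    return weights - lr * rng.normal(size=weights.shape)

def federated_averaging(global_weights, client_datasets, rounds=5):
    """Server loop: broadcast the model, let clients train locally, then
    average the returned parameters (unweighted here for simplicity;
    FedAvg proper weights each client by its local dataset size)."""
    for _ in range(rounds):
        # Raw samples never leave the devices: only parameters are exchanged.
        updates = [local_training(global_weights.copy(), ds) for ds in client_datasets]
        global_weights = np.mean(updates, axis=0)
    return global_weights

if __name__ == "__main__":
    initial_weights = np.zeros(10)             # toy model with 10 parameters
    datasets = [object(), object(), object()]  # three devices, data stays local
    print(federated_averaging(initial_weights, datasets))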

In principle, FL can be an extremely useful technique for addressing critical issues of industrial IoT (IIoT) applications. As such, it is a natural fit for DAVE Embedded Systems' IIoT platform, ToloMEO. This Technical Note (TN) illustrates how DAVE Embedded Systems explored, tested, and characterized some of the most promising open-source FL frameworks available to date. One of these frameworks might equip ToloMEO-compliant products in the future, allowing our customers to implement federated learning systems easily. From the machine learning point of view, therefore, we investigated whether the embedded architectures typically used today for industrial applications are suited to act not only as inference platforms (an issue we already dealt with here) but as training platforms as well.

In brief, the work consisted of the following steps:

  • Selecting the FL frameworks to test.
  • Testing the selected frameworks.
  • Comparing the results to identify the best framework.
  • Investigating the best framework in depth.

A detailed description of the work that led to this Technical Note is available here (TBD).

Choosing the Federated Learning frameworks

When choosing the frameworks to test, we set the following requirements:

  • open-source, permissive licensing
  • TBD

Testing the selected frameworks

Flower

Flower is an open-source federated learning framework designed to be machine-learning-library agnostic: each client can train models built with TensorFlow, PyTorch, or other libraries on its local data, while a central server orchestrates the training rounds and aggregates the resulting model updates.
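
As a rough indication of what a Flower deployment looks like on a device, here is a minimal client sketch based on the publicly documented Flower 1.x Python API. The placeholder model, dataset size, and server address are assumptions made only for illustration; they are not taken from the tests described below.

import flwr as fl
import numpy as np

class EdgeClient(fl.client.NumPyClient):
    """Toy Flower client: in a real setup these methods would wrap the
    device's actual model and its locally collected dataset."""

    def __init__(self):
        self.weights = [np.zeros(10)]     # placeholder model parameters

    def get_parameters(self, config):
        return self.weights

    def fit(self, parameters, config):
        # Local training on on-device data would happen here.
        self.weights = parameters
        num_examples = 1                  # placeholder local dataset size
        return self.weights, num_examples, {}

    def evaluate(self, parameters, config):
        loss, num_examples = 0.0, 1       # placeholder evaluation
        return loss, num_examples, {}

if __name__ == "__main__":
    # The address is an assumption: it would point to the Flower server
    # that coordinates the rounds and aggregates the clients' updates.
    fl.client.start_numpy_client(server_address="127.0.0.1:8080", client=EdgeClient())

On the server side, fl.server.start_server with a ServerConfig specifying the number of rounds is enough to run FedAvg-style aggregation with the default strategy.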

Flower running on SBC ORCA

# of cores | htop screenshot | training log
1 | Flower 1-core htop MX8M+.png | Flower log 1-core MX8M+.png
4 | Flower 4-cpu htop MX8M+.png | Flower log 4-core MX8M+.png
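
A comparison like the one above requires constraining the CPU resources available to the training process. One possible way to do this on Linux, shown here purely as an illustrative assumption (the actual test setup may differ), is to restrict the CPU affinity of the client process before training starts:

import os

# Restrict the current process (and the threads it spawns) to CPU 0 only.
# Passing a larger set, e.g. {0, 1, 2, 3}, would make all four Cortex-A53
# cores of the i.MX 8M Plus available again.
os.sched_setaffinity(0, {0})

print("Allowed CPUs:", sorted(os.sched_getaffinity(0)))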

NVFlare

NVFlare (NVIDIA Federated Learning Application Runtime Environment) is an open-source, domain-agnostic SDK for federated learning developed by NVIDIA. It provides a client/server runtime in which the server schedules training tasks and aggregates the results, while each client executes the assigned tasks on its own local data.
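
To give an idea of NVFlare's programming model, the sketch below outlines a client-side trainer based on the documented Executor interface; the task name and the empty training body are placeholder assumptions.

from nvflare.apis.executor import Executor
from nvflare.apis.fl_context import FLContext
from nvflare.apis.shareable import Shareable, make_reply
from nvflare.apis.signal import Signal
from nvflare.apis.fl_constant import ReturnCode

class EdgeTrainer(Executor):
    """Client-side task handler: the server dispatches tasks (e.g. "train"),
    the executor runs them on local data and returns the result."""

    def execute(self, task_name: str, shareable: Shareable,
                fl_ctx: FLContext, abort_signal: Signal) -> Shareable:
        if task_name == "train":          # the task name is an assumption
            # Local training on on-device data would happen here; the updated
            # weights would be packed into the Shareable returned to the server.
            return make_reply(ReturnCode.OK)
        return make_reply(ReturnCode.TASK_UNKNOWN)

On the server side, NVFlare ships ready-made workflows such as scatter-and-gather, so a basic setup only needs the client-side training code plus a job configuration.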

TBD

Comparing test results

TBD

Deep investigation of NVFlare

TBD

Conclusions and future work

TBD

One important issue that has not been addressed yet is the labeling of new samples. In other words, it was implicitly assumed that the new samples collected by a device are somehow labeled before being used for training. This is a strong assumption, because it implies that each device has a reliable way of assigning labels to the data it collects, either automatically or through human intervention.

Labeling of new samples issue