= Results analysis =
The following figure shows the local <code>training_loss</code> obtained by running the quoted experiments. As can be seen, the loss of the centralized simulation does not keep up with the other experiments, which reach slightly lower loss values; this shows the effectiveness of the FL algorithms compared to a classical ML approach. The same behavior already observed with the previous test bed can also be noticed: at the beginning of each round the loss instantly jumps higher than in the last epoch of the previous round, and then decreases over the following epochs. Another important thing to note is that experiments with an alpha value of 0.1 perform worse than their counterparts with an alpha value of 1.0.
 
[[File:NVFlare-local-training-loss.png|center|thumb|727x727px|NVFlare: local training loss.]]
This factor becomes even more evident in the following chart, which illustrates the server <code>validation_accuracy</code>. With a lower alpha the classes within each client’s dataset are more unbalanced, so models trained on classes with less data have difficulty generalizing correctly. Models become more inclined to predict the dominant classes, reducing accuracy on under-represented classes; the poor representation of some classes makes it difficult for models to learn from them, leading to lower overall accuracy.
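The effect of alpha can be reproduced with a small sketch. Assuming the client datasets were partitioned with the common Dirichlet label-skew scheme (the concentration parameter below corresponds to the α quoted above; the class and client counts are illustrative, not the actual experiment setup), a smaller α yields clients dominated by a few classes:

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, num_clients = 10, 4

for alpha in (0.1, 1.0):
    # One Dirichlet draw per client: the fraction of that client's
    # data belonging to each class (rows sum to 1).
    props = rng.dirichlet([alpha] * num_classes, size=num_clients)
    # A max class share close to 1.0 means a highly skewed client.
    print(f"alpha={alpha}: max class share per client =",
          np.round(props.max(axis=1), 2))
```

With α = 0.1 most of each client’s data typically falls into one or two classes, while α = 1.0 produces much flatter class distributions, matching the accuracy gap seen in the charts.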
[[File:NVFlare-server-validation-accuracy.png|center|thumb|732x732px|NVFlare: server validation_accuracy.]]
Analyzing the individual algorithms, a very similar behavior can be seen between FedAvg and FedProx, which achieve very similar results in terms of both local <code>training_loss</code> and server <code>validation_accuracy</code>. This is mainly due to the fact that the two algorithms differ only by a proximal term, weighted by mu, added in the case of FedProx, which improves the convergence rate. The Scaffold algorithm, on the other hand, has a totally different implementation from its predecessors: it maintains control variates on the server and on each client that correct for client drift during local training, and thus achieves better performance, especially with unbalanced classes (α = 0.1). This can easily be seen in the server <code>validation_accuracy</code> graph.
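The difference between the three local updates can be sketched in a few lines. This is a toy illustration under stated assumptions: a quadratic loss stands in for the real model, and the learning rate, mu, and control-variate values are illustrative, not NVFlare’s implementation (in the SCAFFOLD paper the control variates are estimated from server and client gradients).

```python
import numpy as np

# Toy local loss F_k(w) = 0.5 * ||w - target||^2, gradient w - target.
target = np.array([1.0, -2.0])
grad_fk = lambda w: w - target

def fedavg_step(w, lr=0.05):
    # Plain local SGD step: this is all FedAvg does locally.
    return w - lr * grad_fk(w)

def fedprox_step(w, w_global, mu=0.5, lr=0.05):
    # FedProx adds the proximal term (mu/2)*||w - w_global||^2, whose
    # gradient mu*(w - w_global) pulls w back toward the global model.
    return w - lr * (grad_fk(w) + mu * (w - w_global))

def scaffold_step(w, c_global, c_local, lr=0.05):
    # SCAFFOLD corrects the local gradient with control variates:
    # (c_global - c_local) estimates and cancels this client's drift.
    return w - lr * (grad_fk(w) - c_local + c_global)

w_global = np.zeros(2)
w = w_global.copy()
for _ in range(200):
    w = fedprox_step(w, w_global)
# The proximal term shifts the local optimum from `target` to
# target / (1 + mu), i.e. roughly [0.667, -1.333] here.
print(np.round(w, 3))
```

With mu = 0 the FedProx step reduces exactly to the FedAvg step, which matches the nearly identical curves the two algorithms produce in the charts.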
The successful execution of this more complex use case on NVFlare, involving multiple tested algorithms and diverse data heterogeneity, further underscores the framework’s robust capabilities and suitability for a wide range of scenarios. This result confirms the versatility of NVFlare as an FL framework, making it a reliable choice for real-world scenarios that require the management of heterogeneous and complex data.
= Conclusions and future work =