DAVE Developer's Wiki β

== Data heterogeneity ==
In this advanced project, classes were added that split the dataset among the designated clients, of which there were four in this instance. In addition to dividing the dataset into four subsets, the level of data heterogeneity can be chosen by applying the Dirichlet sampling strategy. This makes it possible to adjust the degree of data heterogeneity dynamically and to set it for all clients at once. In the context of federated learning (FL), data heterogeneity can be defined as follows:
* '''Low Data Heterogeneity''': Low heterogeneity means that the data across different clients is quite similar, or homogeneous, with little variation among the data held by different clients. This leads to nearly balanced classes among clients, that is, each client holds a similar number of samples in each class.
* '''High Data Heterogeneity''': High heterogeneity means that there is significant diversity in the data across different clients or nodes. The subset assigned to each client contains unbalanced classes, i.e. some classes may be over-represented in some clients, while others may be under-represented.
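To allow a clear comparison within the experiments, the upper and lower extremes of the α factor affecting heterogeneity were considered, i.e. 0.1 and 1.0. With Dirichlet sampling, a smaller α produces more skewed class proportions, so α = 0.1 corresponds to high heterogeneity and α = 1.0 to low heterogeneity.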
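The Dirichlet-based split described above can be sketched as follows. This is a minimal illustration and not the project's actual classes: the function names <code>dirichlet_split</code> and <code>class_counts</code>, their parameters, and the per-class partitioning scheme are assumptions chosen for the example. For each class, a vector of client shares is drawn from a Dirichlet distribution with concentration α, and that class's samples are divided among the clients accordingly.

```python
import numpy as np


def dirichlet_split(labels, num_clients=4, alpha=1.0, seed=0):
    """Split sample indices among clients using per-class Dirichlet shares.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]

    for cls in np.unique(labels):
        # Indices of all samples belonging to this class, in random order
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)

        # Share of this class that each client receives (sums to 1)
        proportions = rng.dirichlet(alpha * np.ones(num_clients))

        # Turn the shares into cut points within the shuffled index array
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())

    return [np.array(ix, dtype=int) for ix in client_indices]


def class_counts(labels, parts, num_classes):
    """Return a (clients x classes) matrix of per-class sample counts."""
    labels = np.asarray(labels)
    return np.stack(
        [np.bincount(labels[p], minlength=num_classes) for p in parts]
    )


if __name__ == "__main__":
    # Toy dataset (assumed): 10 classes with 100 samples each
    toy_labels = np.repeat(np.arange(10), 100)
    for a in (0.1, 1.0):
        parts = dirichlet_split(toy_labels, num_clients=4, alpha=a)
        print(f"alpha={a}")
        print(class_counts(toy_labels, parts, num_classes=10))
```

Printing the count matrix for both values of α gives a quick way to inspect how unevenly each class is spread over the four clients. Other partitioning schemes exist (for example, drawing a class distribution per client instead of client shares per class), so the project's own implementation may differ from this sketch.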