FedRand: Protecting Privacy in Federated Learning with Randomized LoRA Updates


Federated Learning (FL) has emerged as a promising method for the decentralized training of AI models. The central server never directly accesses the data of individual clients; it only aggregates the locally trained model parameters. However, this procedure still poses privacy risks, because the parameters shared by the clients can allow inferences about the underlying training data. This is especially true for Vision-Language Models (VLMs), which are, due to their architecture, vulnerable to Membership Inference Attacks (MIAs), in which an attacker attempts to determine whether specific data points were part of the training dataset.
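In its simplest form, a membership inference attack can be sketched as a loss-threshold test: because models typically fit their training data more closely than unseen data, samples with unusually low loss are guessed to be training members. The function name and the loss values below are hypothetical, chosen only to illustrate the idea.

```python
import numpy as np

def loss_threshold_mia(losses, threshold):
    """Simplest membership inference: guess that samples whose model
    loss falls below a threshold were part of the training set."""
    return np.asarray(losses) < threshold

# Hypothetical per-sample losses: training members tend to score lower.
member_losses = [0.10, 0.25, 0.05]      # seen during training
nonmember_losses = [1.20, 0.90, 1.50]   # held out
guesses = loss_threshold_mia(member_losses + nonmember_losses, threshold=0.5)
# The first three samples are flagged as members, the rest as non-members.
```

Real MIAs against VLMs are considerably more sophisticated, but they share this core principle: exploiting the gap in model behavior between member and non-member data.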

A new approach to improving data privacy in FL is FedRand, a framework that avoids disclosing the complete client-side parameter set. At its core, FedRand uses Low-Rank Adaptation (LoRA), a technique for the parameter-efficient fine-tuning of large models. In contrast to conventional FL, where all LoRA parameters are shared with the server, each client in FedRand randomly selects a subset of the LoRA subparameters as non-private, trains them together with the remaining, private subparameters on its local dataset, and then transmits only the non-private subset to the server for aggregation.
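The random split can be sketched as follows. LoRA represents each weight update as a low-rank pair (A, B); as one plausible illustration, assume the client picks, per layer, which half of the pair is shared and which stays private. The selection granularity shown here is an assumption for illustration, not necessarily the exact scheme of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical LoRA parameters: per layer, a low-rank pair (A, B)
# such that the weight update is B @ A, with rank r much smaller than d.
d, r = 8, 2
lora = {f"layer{i}": {"A": rng.normal(size=(r, d)),
                      "B": rng.normal(size=(d, r))}
        for i in range(4)}

def split_lora(lora, rng):
    """Randomly mark one subparameter per layer as non-private (shared
    with the server); the complementary one stays on the client."""
    shared, private = {}, {}
    for name, pair in lora.items():
        shared_key = rng.choice(["A", "B"])
        private_key = "B" if shared_key == "A" else "A"
        shared[name] = {shared_key: pair[shared_key]}
        private[name] = {private_key: pair[private_key]}
    return shared, private

shared, private = split_lora(lora, rng)
```

Because the server never sees both halves of any pair, it cannot reconstruct the full client-side update, which is what reduces the attack surface for membership inference.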

Through this selective sharing of parameters, FedRand reduces the risk of disclosing client-side VLM parameters and thus increases data privacy. Empirical studies show that FedRand improves robustness against MIAs compared to existing approaches without significant losses in model accuracy. The achieved accuracy is comparable to methods that transmit the complete LoRA parameter set.

How FedRand Works in Detail

The process of FedRand can be divided into the following steps:

1. **Initialization:** The server initializes a global model and distributes the initial LoRA parameters to all participating clients.

2. **Local Parameter Selection:** Each client randomly selects a subset of the LoRA subparameters, which are considered non-private parameters. The remaining subparameters are considered private parameters and remain exclusively on the client.

3. **Local Training:** Each client trains both the selected non-private and the private LoRA parameters on its local dataset.

4. **Aggregation:** The clients send the trained non-private LoRA parameters to the server. The server aggregates these parameters to update the global model.

5. **Repetition:** Steps 2 to 4 are repeated over several rounds until the global model converges.
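The steps above can be condensed into a minimal simulation of one communication round. Local training is stood in for by a small random perturbation, and aggregation is a plain parameter average in the spirit of FedAvg; both are simplifying assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_clients, d, r = 3, 6, 2

# Step 1: the server initializes a global LoRA pair.
global_lora = {"A": rng.normal(size=(r, d)), "B": np.zeros((d, r))}

def local_round(global_lora, rng):
    """One client round: randomly pick the shared half (step 2),
    'train' both halves locally (step 3, simulated here by a small
    perturbation in place of real SGD), and return only the shared
    half (step 4)."""
    shared_key = rng.choice(["A", "B"])
    local = {k: v.copy() for k, v in global_lora.items()}
    for k in local:  # stand-in for local gradient steps
        local[k] += 0.01 * rng.normal(size=local[k].shape)
    return shared_key, local[shared_key]

# Step 4 (server side): aggregate the received halves per key.
updates = {"A": [], "B": []}
for _ in range(n_clients):
    key, value = local_round(global_lora, rng)
    updates[key].append(value)
for key, vals in updates.items():
    if vals:  # only keys that some client actually shared this round
        global_lora[key] = np.mean(vals, axis=0)
```

Step 5 simply wraps this round in an outer loop until convergence; the private halves never leave the clients at any point.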

Advantages of FedRand

FedRand offers the following advantages over conventional FL methods:

- **Improved Data Privacy:** The selective sharing of parameters reduces the risk of Membership Inference Attacks.

- **Efficient Training:** The use of LoRA enables efficient training of large language models in the federated setting.

- **Comparable Accuracy:** FedRand achieves model accuracy comparable to methods that transmit the complete parameter set.

Future Research

Although FedRand delivers promising results, open research questions remain. These include the optimal strategy for selecting the shared subparameters and the investigation of how FedRand behaves across different types of datasets and model architectures.

Conclusion

FedRand represents an important step towards more privacy-preserving federated learning. By combining LoRA and randomized parameter selection, FedRand offers an effective mechanism for protecting the privacy of clients without compromising model accuracy. This development is particularly relevant for the use of FL in sensitive areas such as healthcare or the financial sector, where the protection of personal data is of paramount importance.
