# Overlap-FedAvg: A Communication-Efficient Federated Learning Framework with Gradient Compensation - Sichuan University

[![DOI](https://zenodo.org/badge/320731198.svg)](https://zenodo.org/badge/latestdoi/320731198)

This is the official implementation of the Sichuan University research project "Communication-Efficient Federated Learning with Compensated Overlap-FedAvg", published in TPDS 2021. (Primary language: Python; license: MIT.)

Federated learning incurs massive communication overhead, because the data synchronized in each epoch is about the same size as the model itself, resulting in low communication efficiency. To address this, we propose Overlap-FedAvg, a framework that runs the model training phase in parallel with the model uploading and downloading phases, so that communication can be completely hidden behind training. Compared with vanilla FedAvg, Overlap-FedAvg is further developed with a hierarchical computing strategy, a data compensation mechanism, and the Nesterov accelerated gradient (NAG) algorithm. Moreover, Overlap-FedAvg is orthogonal to many other compression methods, so they can be applied together to maximize the utilization of the cluster.

### Paper Abstract

> While petabytes of data are generated each day by a number of independent computing devices, only a few of them can be finally collected and used for deep learning (DL) due to the apprehension of data security and privacy leakage, thus seriously retarding the extension of DL. In such a circumstance, federated learning (FL) was proposed to perform model training by multiple clients' combined data without the dataset sharing within the cluster. Nevertheless, federated learning with periodic model averaging (FedAvg) introduced massive communication overhead as the synchronized data in each iteration is about the same size as the model, thereby leading to low communication efficiency. Consequently, variant proposals focusing on communication round reduction and data compression were proposed to decrease the communication overhead of FL.
> In this paper, we propose Overlap-FedAvg, an innovative framework that loosened the chain-like constraint of federated learning and parallelized the model training phase with the model communication phase (i.e., uploading local models and downloading the global model), so that the latter phase could be totally covered by the former. Compared to vanilla FedAvg, Overlap-FedAvg was further developed with a hierarchical computing strategy, a data compensation mechanism, and a Nesterov accelerated gradient (NAG) algorithm. In particular, Overlap-FedAvg is orthogonal to many other compression methods, so that they could be applied together to maximize the utilization of the cluster. Besides, a theoretical analysis is provided to prove the convergence of the proposed framework. Extensive experiments conducted on both image classification and natural language processing tasks with multiple models and datasets also demonstrate that the proposed framework substantially reduced the communication overhead and boosted the federated learning process.
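To make the core pipelining idea concrete, here is a minimal, self-contained sketch (not the authors' implementation) of how communication can be overlapped with local training. A background thread "communicates" a snapshot of the model while the main thread keeps training, and the local update is then applied on top of the received global model. All names (`local_train`, `communicate`, the toy two-parameter model, the identity "server averaging") are illustrative stand-ins; the paper's data compensation mechanism and NAG step are omitted.

```python
import threading
import time

def local_train(model):
    """Stand-in for one round of local SGD on a client (here: add a fixed update)."""
    time.sleep(0.01)  # simulate compute
    return [w + 0.1 for w in model]

def communicate(model, box):
    """Stand-in for uploading the local model and downloading the global one.
    With a single client, server-side averaging is just the identity."""
    time.sleep(0.01)  # simulate network latency
    box["global"] = list(model)

model = [0.0, 0.0]  # toy "model" with two parameters
box = {}
for rnd in range(3):
    snapshot = list(model)
    # Upload the previous round's model in the background...
    t = threading.Thread(target=communicate, args=(snapshot, box))
    t.start()
    # ...while local training runs in parallel, hiding the communication time.
    local = local_train(snapshot)
    t.join()
    # Apply the local update on top of the freshly received global model.
    # In the paper, a data compensation mechanism additionally corrects for
    # the staleness introduced by this pipelining (omitted here).
    delta = [l - s for l, s in zip(local, snapshot)]
    model = [g + d for g, d in zip(box["global"], delta)]

print(round(model[0], 2))  # -> 0.3 after three rounds of +0.1 updates
```

In a real deployment the background transfer would use a distributed backend (e.g. `torch.distributed` send/recv) rather than a thread with shared memory, but the scheduling pattern is the same: the training step and the transfer of the previous round's parameters run concurrently.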
## Quick Start

### Clone the repository

```
git clone https://github.com/Soptq/Overlap-FedAvg
cd Overlap-FedAvg
```

### Install the environment

```
pip install -r requirements.txt
```

### Prepare the datasets

```
python prepare_data.py
```

### Run the simulated experiments

To simplify the demonstration, the provided Overlap-FedAvg scripts simulate the distributed federated learning environment on a single machine with five GPUs, where GPU:0 is the master node and GPU:1 through GPU:4 are worker nodes. Note, however, that although single-machine simulation makes environment setup much easier, it also drastically shortens network communication time, artificially inflating the cluster's communication efficiency and biasing the experimental results. We therefore recommend, where possible, testing the effectiveness of this method in a real federated learning environment.

Vanilla FedAvg can be run with:

```
python FedAvg.py --world_size 5 --lr 0.001 --batch_size 32 --max_epoch 10 --client_epoch 5 --dataset mnist --model mlp --gpus 0,1,2,3,4
```

Similarly, Overlap-FedAvg can be run with:

```
python AsyncFedAvg.py --world_size 5 --lr 0.001 --batch_size 32 --max_epoch 10 --client_epoch 5 --dataset mnist --model mlp --gpus 0,1,2,3,4
```

## Citation

```
@article{zhou2020communication,
  title={Communication-Efficient Federated Learning with Compensated Overlap-FedAvg},
  author={Zhou, Yuhao and Ye, Qing and Lv, Jiancheng},
  journal={IEEE Transactions on Parallel and Distributed Systems},
  publisher={IEEE}
}
```

## License

```
MIT License

Copyright (c) 2020 Soptq

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
```