# Federated Averaging (FedAvg) in PyTorch

[![arXiv](https://img.shields.io/badge/arXiv-1602.05629-f9f107.svg)](https://arxiv.org/abs/1602.05629)

An unofficial PyTorch implementation of the `FederatedAveraging` (`FedAvg`) algorithm proposed in the paper [Communication-Efficient Learning of Deep Networks from Decentralized Data](https://arxiv.org/abs/1602.05629). (Implemented with Python 3.9.2.)

## Implementation points

* The models ('2NN' and 'CNN' from the paper) are implemented to match the parameter counts reported in the paper (see the parameter-count sketch in the appendix below).
  * 2NN: `TwoNN` class in `models.py`; 199,210 parameters
  * CNN: `CNN` class in `models.py`; 1,663,370 parameters
* The non-IID data split is implemented exactly as in the paper; each client holds samples of at least two digits when the `MNIST` dataset is used (see the split sketch in the appendix below).
* Multiprocessing is used for the _client update_ and _client evaluation_ steps.
* TensorBoard is supported for log tracking.

## Requirements

* See `requirements.txt`.

## Configurations

* See `config.yaml`.

## Run

* `python3 main.py`

## Results

### MNIST

* Number of clients: 100 (K = 100)
* Fraction of sampled clients: 0.1 (C = 0.1)
* Number of rounds: 500 (R = 500)
* Number of local epochs: 10 (E = 10)
* Batch size: 10 (B = 10)
* Optimizer: `torch.optim.SGD`
* Criterion: `torch.nn.CrossEntropyLoss`
* Learning rate: 0.01
* Momentum: 0.9
* Initialization: Xavier

Table 1. Final accuracy and best accuracy

| Model | Final Accuracy, IID (Round) | Best Accuracy, IID (Round) | Final Accuracy, non-IID (Round) | Best Accuracy, non-IID (Round) |
| ----- | ----- | ---- | ---- | ---- |
| 2NN | 98.38% (500) | 98.45% (483) | 97.50% (500) | 97.65% (475) |
| CNN | 99.31% (500) | 99.34% (197) | 98.73% (500) | 99.28% (493) |

Table 2. Final loss and least loss

| Model | Final Loss, IID (Round) | Least Loss, IID (Round) | Final Loss, non-IID (Round) | Least Loss, non-IID (Round) |
| ----- | ----- | ---- | ---- | ---- |
| 2NN | 0.09296 (500) | 0.06956 (107) | 0.09075 (500) | 0.08257 (475) |
| CNN | 0.04781 (500) | 0.02497 (86) | 0.04533 (500) | 0.02413 (366) |

Figure 1. MNIST 2NN model accuracy (IID: top / non-IID: bottom)

![MNIST 2NN accuracy, IID](https://user-images.githubusercontent.com/33894768/117546686-95b8c880-b066-11eb-817c-e878a338d28e.png)
![MNIST 2NN accuracy, non-IID](https://user-images.githubusercontent.com/33894768/117534148-34bfcf00-b02b-11eb-9b2d-f9a33d05242e.png)

Figure 2. MNIST CNN model accuracy (IID: top / non-IID: bottom)

![MNIST CNN accuracy, IID](https://user-images.githubusercontent.com/33894768/117534156-3b4e4680-b02b-11eb-9f27-ce4a10e7cd6b.png)
![MNIST CNN accuracy, non-IID](https://user-images.githubusercontent.com/33894768/117542232-c2fb7b80-b052-11eb-90c6-725c94fe0109.png)

## TODO

- [ ] Run the CIFAR-10 experiment and the large-scale LSTM experiment (Shakespeare dataset)
- [ ] Learning rate scheduling
- [ ] More experiments with other hyperparameter settings (e.g., different combinations of B, E, K, and C)
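
## Appendix: reference sketches

The snippets below are minimal sketches that illustrate the implementation points above. They are not the repository's actual code (the real implementations live in `models.py` and `main.py`), and the names and signatures used here are hypothetical stand-ins.

The 199,210-parameter count for the 2NN follows from the 784-200-200-10 MLP described in the paper, with Xavier initialization as listed under Results:

```python
import torch.nn as nn

# Hypothetical stand-in for the `TwoNN` class in models.py:
# a 784-200-200-10 MLP with ReLU activations, as in the paper.
class TwoNN(nn.Module):
    def __init__(self, in_features=784, hidden=200, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.fc3 = nn.Linear(hidden, num_classes)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = x.view(x.size(0), -1)      # flatten 28x28 MNIST images
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        return self.fc3(x)             # logits; CrossEntropyLoss applies log-softmax

def init_weights(m):
    # Xavier initialization as listed under Results (uniform variant assumed).
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)

if __name__ == "__main__":
    model = TwoNN()
    model.apply(init_weights)
    n_params = sum(p.numel() for p in model.parameters())
    # (784*200 + 200) + (200*200 + 200) + (200*10 + 10) = 199,210
    print(n_params)
```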
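
The paper's non-IID split sorts the training set by digit label, cuts it into 200 shards of 300 examples, and assigns each of the 100 clients two shards. The sketch below (function name, defaults, and seed are assumptions) reproduces that shard scheme; it does not enforce the repository's additional guarantee that each client holds at least two distinct digits:

```python
import numpy as np

def mnist_noniid_shards(labels, num_clients=100, shards_per_client=2, seed=42):
    """Shard-based non-IID split from the FedAvg paper: sort indices by
    label, cut them into num_clients * shards_per_client shards, and give
    each client `shards_per_client` randomly chosen shards."""
    rng = np.random.default_rng(seed)
    num_shards = num_clients * shards_per_client
    shard_size = len(labels) // num_shards          # 60,000 / 200 = 300 for MNIST
    sorted_idx = np.argsort(labels)                 # group indices digit by digit
    shard_order = rng.permutation(num_shards)       # shuffle shard-to-client assignment
    client_indices = {}
    for c in range(num_clients):
        shards = shard_order[c * shards_per_client:(c + 1) * shards_per_client]
        client_indices[c] = np.concatenate(
            [sorted_idx[s * shard_size:(s + 1) * shard_size] for s in shards]
        )
    return client_indices
```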
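
A single local _client update_ with the hyperparameters listed under Results (E = 10 local epochs over B = 10-sized batches, SGD with learning rate 0.01 and momentum 0.9, cross-entropy loss) can be sketched as follows; the signature and return values are assumptions:

```python
import torch

def client_update(model, loader, epochs=10, lr=0.01, momentum=0.9, device="cpu"):
    """Run E local epochs of SGD on one client's data and return the
    updated weights together with the local dataset size (used as the
    aggregation weight on the server)."""
    model.to(device).train()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
    criterion = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    return model.state_dict(), len(loader.dataset)
```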
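
The server-side `FederatedAveraging` step then replaces the global weights with the dataset-size-weighted average of the sampled clients' weights, w ← Σ_k (n_k / n) · w_k. A sketch with assumed names; it presumes all `state_dict` entries are float tensors, which holds for the 2NN and CNN here:

```python
def fedavg_aggregate(global_model, client_states, client_sizes):
    """Set the global weights to the size-weighted average of the
    sampled clients' state_dicts: w <- sum_k (n_k / n) * w_k."""
    total = float(sum(client_sizes))
    avg_state = {
        key: sum((n_k / total) * state[key]
                 for state, n_k in zip(client_states, client_sizes))
        for key in client_states[0]
    }
    global_model.load_state_dict(avg_state)
    return global_model
```

With these pieces, one communication round is: sample C·K = 10 clients, run `client_update` on each (in parallel, per the multiprocessing point above), and feed the returned weights and dataset sizes to `fedavg_aggregate`.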