# Adabelief-Optimizer
**Repository Path**: frontxiang/Adabelief-Optimizer
## Basic Information
- **Project Name**: Adabelief-Optimizer
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: BSD-2-Clause
- **Default Branch**: update_0.2.0
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-11-22
- **Last Updated**: 2023-11-22
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
Adam and AdaBelief are summarized in Algo.1 and Algo.2, where all operations are
element-wise, with differences marked in blue. Note that no extra parameters are introduced in AdaBelief. For simplicity,
we omit the bias correction step. Specifically, in Adam, the update
direction is $m_t / \sqrt{v_t}$, where $v_t$ is the EMA (Exponential Moving Average) of $g_t^2$;
in AdaBelief, the update direction is $m_t / \sqrt{s_t}$, where $s_t$ is the EMA of $(g_t - m_t)^2$.
Intuitively, viewing $m_t$ as the prediction of $g_t$, AdaBelief takes a
large step when the observation $g_t$ is close to the prediction $m_t$, and a small step when the observation greatly deviates
from the prediction.
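
To make the contrast concrete, here is a minimal sketch of a single parameter update, not the repository's actual implementation: the function name, tensor arguments, and default hyperparameters are illustrative assumptions, and bias correction is omitted as above. The only line that differs from an Adam-style update is the one that tracks the EMA of $(g_t - m_t)^2$ instead of $g_t^2$.

```python
import torch

def adabelief_step(param, grad, m, s, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Simplified single AdaBelief update step (bias correction omitted)."""
    # m_t: EMA of the gradient, i.e. the "prediction" of g_t (same as Adam's first moment)
    m.mul_(beta1).add_(grad, alpha=1 - beta1)
    # s_t: EMA of (g_t - m_t)^2 -- the deviation of the observed gradient from its
    # prediction. Adam would instead track the EMA of g_t^2 at this point.
    diff = grad - m
    s.mul_(beta2).addcmul_(diff, diff, value=1 - beta2)
    # Update direction m_t / sqrt(s_t): a large step when g_t is close to m_t,
    # a small step when the observation deviates strongly from the prediction.
    param.addcdiv_(m, s.sqrt().add_(eps), value=-lr)
```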
## Reproduce results in the paper
#### (Comparison with 8 other optimizers: SGD, Adam, AdaBound, RAdam, AdamW, Yogi, MSVAG, Fromage)
See the folder ``PyTorch_Experiments``; in each subfolder, execute ```sh run.sh```. For visualization, see ```readme.txt``` in each subfolder, or
refer to the accompanying Jupyter notebooks.
### Results on Image Recognition