# torch_adabob

**Repository Path**: frontxiang/torch_adabob

## Basic Information

- **Project Name**: torch_adabob
- **Description**: Code for the Pattern Recognition (2025) paper "Dynamic Bound Adaptive Gradient Methods with Belief in Observed Gradients".
- **Primary Language**: Python
- **License**: BSD-3-Clause
- **Default Branch**: master
- **Homepage**: https://orcid.org/0000-0001-6810-8446

## README

# Pattern Recognition Paper: Dynamic Bound Adaptive Gradient Methods with Belief in Observed Gradients

Qian Xiang, Xiaodan Wang, Lei Lei, and Yafei Song

## Abstract

In the realm of deep neural network training, both AdaBound and AdaBelief have emerged as significant advancements among adaptive gradient methods. AdaBelief, by incorporating curvature information through its belief in observed gradients, effectively addresses the issue of noisy gradients and thereby enhances optimization stability. However, we found that AdaBelief is sensitive to abrupt gradient outliers, due to over-smoothed second-moment estimation, and lacks convergence guarantees in convex settings, owing to non-monotonic stepsize sequences. To address these challenges, we introduce AdaBoB (AdaBound with Belief in Observed Gradients), an optimization algorithm that integrates the adaptive learning rate bounds of AdaBound with the curvature-informed belief mechanism of AdaBelief. By combining these two approaches, AdaBoB leverages the strengths of both methods to achieve robust convergence and generalization. Specifically, the curvature-informed belief mechanism mitigates the impact of noisy gradients, while the dynamic bounds keep the stepsize within a theoretically sound range, satisfying the convergence conditions while suppressing outlier-induced instability.
We provide a comprehensive theoretical analysis that establishes AdaBoB's convergence in both convex and non-convex optimization settings. Additionally, extensive comparative experiments demonstrate that AdaBoB outperforms state-of-the-art methods across various deep learning tasks. The code is released at https://gitee.com/frontxiang/torch_adabob.

![AdaBoB Stepsize Visualization](figs/fig_bounds.jpg)

*Figure 1: Stepsize dynamics of AdaBoB vs. AdaBelief across tasks. AdaBoB maintains bounded stepsizes while leveraging gradient belief.*

## Introduction

AdaBoB is a novel optimization algorithm that synergizes **dynamic learning rate bounds** (from AdaBound) with **curvature-informed gradient belief** (from AdaBelief). Designed for deep learning tasks, it addresses two key limitations:

1. AdaBelief's sensitivity to gradient outliers, due to over-smoothed second-moment estimation
2. Non-convergence risks in convex settings, arising from non-monotonic stepsizes

## Key Features

- ✅ **Dynamic Bounding**: Ensures stepsizes stay within theoretically sound ranges
- 🔍 **Gradient Belief**: Uses `(g_t - m_t)^2` for curvature-aware adaptation
- 📈 **Convergence Guarantees**: Provable convergence for both convex and non-convex optimization
- ⚡ **Low Complexity**: Only requires subtraction operations for the belief calculation

## Algorithm Comparison

| Method     | Convergence | Curvature Info | Outlier Robustness | Complexity |
|------------|-------------|----------------|--------------------|------------|
| AdaBound   | Yes         | No             | Yes                | Low        |
| AdaBelief  | No          | Yes            | No                 | Low        |
| **AdaBoB** | **Yes**     | **Yes**        | **Yes**            | **Low**    |

*Table 1: Comparison with state-of-the-art optimizers*

## Theoretical Guarantees

- **Convex Case**: Regret bound of $O(\sqrt{T})$ over $T$ iterations
- **Non-Convex Case**: Convergence rate of $O(\frac{\log T}{\sqrt{T}})$

## Experimental Results

### Performance on Non-Convex Functions

![Non-Convex Optimization](figs/fig_emp.png)

*Figure 2: AdaBoB escapes local minima and
achieves global optimization in synthetic non-convex landscapes.*

### Benchmark Results

| Dataset    | AdaBoB | AdaBound | AdaBelief |
|------------|--------|----------|-----------|
| CIFAR-10   | 95.45% | 95.23%   | 92.20%    |
| ImageNet   | 54.67% | 54.41%   | -         |
| HRRP Radar | 97.11% | 97.00%   | 96.58%    |

*Table 2: Top-1 accuracy across different tasks*

## Usage

### Example

```python
from ljp.optimizers.adabob import AdaBoB

optimizer = AdaBoB(model.parameters(), lr=0.001, betas=(0.9, 0.999),
                   final_lr=0.1, gamma=1e-3)

for input, target in dataset:
    optimizer.zero_grad()
    output = model(input)
    loss = loss_fn(output, target)  # loss_fn: your criterion, e.g. nn.CrossEntropyLoss()
    loss.backward()
    optimizer.step()
```

## Citations

**IF YOU USE THIS CODE, PLEASE CITE**:

Qian Xiang, Xiaodan Wang, Lei Lei, and Yafei Song. 2025. 'Dynamic bound adaptive gradient methods with belief in observed gradients', Pattern Recognition, 168: 111819. https://doi.org/10.1016/j.patcog.2025.111819

```plaintext
@article{xiang2025adabob,
  title={Dynamic Bound Adaptive Gradient Methods with Belief in Observed Gradients},
  author={Xiang, Qian and Wang, Xiaodan and Song, Yafei and Lei, Lei},
  journal={Pattern Recognition},
  year={2025},
  volume={168},
  doi={10.1016/j.patcog.2025.111819}
}
```

### Other related papers:

- **Qian Xiang**, Xiaodan Wang, Yafei Song, and Lei Lei. 2025. 'ISONet: Reforming 1DCNN for aero-engine system inter-shaft bearing fault diagnosis via input spatial over-parameterization', Expert Systems with Applications, 277: 127248. https://doi.org/10.1016/j.eswa.2025.127248 Code: https://gitee.com/frontxiang/torch_isonet
- **Qian Xiang**, Xiaodan Wang, Jie Lai, Lei Lei, Yafei Song, Jiaxing He, and Rui Li. 2024. 'Quadruplet depth-wise separable fusion convolution neural network for ballistic target recognition with limited samples', Expert Systems with Applications, 235: 121182. https://doi.org/10.1016/j.eswa.2023.121182
- **Qian Xiang**, Xiaodan Wang, Yafei Song, Lei Lei, Rui Li, and Jie Lai. 2021.
  'One-dimensional convolutional neural networks for high-resolution range profile recognition via adaptively feature recalibrating and automatically channel pruning', International Journal of Intelligent Systems, 36: 332-361. https://onlinelibrary.wiley.com/doi/abs/10.1002/int.22302
- **Qian Xiang**, Xiaodan Wang, Jie Lai, Yafei Song, Rui Li, and Lei Lei. 2023. 'Group-Fusion One-Dimensional Convolutional Neural Network for Ballistic Target High-Resolution Range Profile Recognition with Layer-Wise Auxiliary Classifiers', International Journal of Computational Intelligence Systems, 16: 190. https://doi.org/10.1007/s44196-023-00372-w
- **Qian Xiang**, Xiaodan Wang, Jie Lai, Yafei Song, Rui Li, and Lei Lei. 2022. 'Multi-scale group-fusion convolutional neural network for high-resolution range profile target recognition', IET Radar, Sonar & Navigation, 16: 1997-2016. https://doi.org/10.1049/rsn2.12312
- **Qian Xiang**, Xiaodan Wang, Xuan Wu, Jie Lai, Jiaxing He, and Yafei Song. 2023. 'CsiTransformer: A Limited-sample 6G Channel State Information Feedback Model'. In 2023 IEEE 6th International Conference on Pattern Recognition and Artificial Intelligence (PRAI 2023), 1160-1166. https://doi.org/10.1109/PRAI59366.2023.10331944
- **Qian Xiang**, Xiaodan Wang, Jie Lai, Yafei Song, Jiaxing He, and Lei Lei. 2022. '5G Network Reference Signal Receiving Power Prediction Based on Multilayer Perceptron'. In 2022 China Automation Congress (CAC 2022), 19-24. https://doi.org/10.1109/CAC57257.2022.10055904
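
## Update-Rule Sketch

For intuition, the mechanism described above — an AdaBelief-style second moment built from the belief residual `(g_t - m_t)^2`, with the resulting stepsize clipped to AdaBound-style dynamic bounds — can be written as a minimal scalar sketch. This is an illustrative reconstruction from the descriptions in this README, not the repository's actual implementation: the bound schedules and all hyperparameter defaults are assumptions borrowed from AdaBound's conventions, and the published algorithm may differ in details such as bias correction and the exact bound functions.

```python
import math

def adabob_step(theta, g, m, s, t, lr=1e-3, beta1=0.9, beta2=0.999,
                final_lr=0.1, gamma=1e-3, eps=1e-8):
    """One scalar AdaBoB-style update (illustrative sketch, not the released code)."""
    # First moment: EMA of gradients, as in Adam/AdaBelief
    m = beta1 * m + (1 - beta1) * g
    # Second moment uses the *belief* residual (g - m)^2, not g^2
    s = beta2 * s + (1 - beta2) * (g - m) ** 2
    # Bias corrections
    m_hat = m / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    # Unclipped adaptive stepsize
    step = lr / (math.sqrt(s_hat) + eps)
    # AdaBound-style dynamic bounds that tighten toward final_lr as t grows
    lower = final_lr * (1 - 1 / (gamma * t + 1))
    upper = final_lr * (1 + 1 / (gamma * t))
    step = min(max(step, lower), upper)
    # Parameter update along the (bias-corrected) first moment
    theta = theta - step * m_hat
    return theta, m, s
```

The clipping step is what distinguishes this from plain AdaBelief: when a near-zero belief residual would blow the raw stepsize up, the upper bound caps it, and as `t` grows both bounds contract toward `final_lr`, giving the monotone stepsize range the convergence analysis relies on.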