diff --git a/RELEASE.md b/RELEASE.md
index a05197ced1fcfc8ca5399b087d034818b7c4a89d..3e205d319523f4205699891e16f23d882fbc6c27 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -2,43 +2,43 @@
## DeepSparkHub 25.03 Release Notes
-### Features and Enhancements
+### Models and Algorithms
-#### Models and Algorithms
-● Added 9 new large-model training examples, covering the MoE-LLaVA, DeepSpeed, and LLaMA-Factory toolboxes
+* Added 9 new large-model training examples, covering the DeepSpeed, MoE-LLaVA, and LLaMA-Factory toolboxes
| Large Models |
|--------------|
- | MoE-LLaVA-Phi2-2.7B(MoE-LLaVA) |
- | MoE-LLaVA-Qwen-1.8B(MoE-LLaVA) |
- | MoE-LLaVA-StableLM-1.6B(MoE-LLaVA) |
+ | GLM-4 |
+ | MiniCPM(DeepSpeed) |
+ | Phi-3 |
- | Yi_6B(DeepSpeed) |
- | Yi-1.5_6B(DeepSpeed) |
- | Yi-VL-6B(LLaMA-Factory) |
+ | MoE-LLaVA-Phi2-2.7B |
+ | MoE-LLaVA-Qwen-1.8B |
+ | MoE-LLaVA-StableLM-1.6B |
- | GLM-4 |
- | MiniCPM(DeepSpeed) |
- | Phi-3 |
+ | Yi-6B (DeepSpeed) |
+ | Yi-1.5-6B (DeepSpeed) |
+ | Yi-VL-6B (LLaMA-Factory) |
-● Updated category names such as cv/multi_object_tracking, cv/gnn, and cv/face_recognition.
-● Adjusted the category paths of models such as kan, graph wavenet, and hashnerf.
-● Removed redundant code from convnext, co-detr, centernet, and other models to align with the community versions.
-● Updated the READMEs of related models, adding the IXUCA SDK versions each model supports.
-● Updated the code of ATSS, Cascade R-CNN, CornerNet, and other models to adapt to MMDetection community version v3.3.0.
-● Added automated CI scripts for cv/classification and cv/detection.
-● Synchronized the tacotron2 model code.
+### Bug Fixes
+
+* Synchronized the latest code of the Tacotron2 PyTorch model.
+* Removed redundant code from ConvNeXt, Co-DETR, CenterNet, and other models, and aligned them with the community versions.
+* Updated the MMDetection toolbox to v3.3.0 and synchronized the code of ATSS, Cascade R-CNN, CornerNet, and other models.
+* Added automated CI scripts for cv/classification and cv/detection.
+* Updated the README format of all models, adding the IXUCA SDK versions each model supports.
+
+### Version Compatibility
-#### Version Compatibility
DeepSparkHub 25.03 corresponds to version 4.2.0 of the Iluvatar CoreX software stack.
-#### Contributors
+### Contributors
Thanks to the following community contributors.
@@ -46,7 +46,6 @@ DeepSparkHub 25.03 corresponds to version 4.2.0 of the Iluvatar CoreX software stack.
Contributions to the DeepSparkHub project in any form are welcome.
-
## DeepSparkHub 24.12 Release Notes
### Features and Enhancements
diff --git a/reinforcement_learning/q-learning-networks/dqn/paddlepaddle/README.md b/reinforcement_learning/q-learning-networks/dqn/paddlepaddle/README.md
index d9423ebcda369233e6a4f73680aae81eed907bf9..f8b45af44ff62b3f687c2a314d497f55cb714b04 100644
--- a/reinforcement_learning/q-learning-networks/dqn/paddlepaddle/README.md
+++ b/reinforcement_learning/q-learning-networks/dqn/paddlepaddle/README.md
@@ -1,11 +1,16 @@
# DQN
-## Model description
+## Model Description
-The classic DQN algorithm in reinforcement learning is a value-based rather than a policy-based method. DQN does not
-learn a policy, but a critic. Critic does not directly take action, but evaluates the quality of the action.
+DQN (Deep Q-Network) is a foundational reinforcement learning algorithm that combines Q-Learning with deep neural
+networks. As a value-based method, it uses a critic network to estimate action quality in high-dimensional state spaces.
+DQN introduces experience replay and a periodically updated target network to stabilize training. This approach
+revolutionized AI capabilities in complex environments, achieving human-level performance in Atari games and forming the
+basis for advanced decision-making systems in robotics and game AI.
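+
+The two stabilization ideas above can be summarized in a few lines. The snippet below is a purely illustrative sketch
+(names such as `replay_buffer`, `sample_batch`, and `td_targets` are not taken from PARL or from this example's
+`train.py`); it shows uniform experience replay and the Bellman target computed from a frozen target network.
+
+```python
+import random
+from collections import deque
+
+import numpy as np
+
+GAMMA = 0.99                          # discount factor (illustrative value)
+BATCH_SIZE = 32
+replay_buffer = deque(maxlen=10_000)  # experience replay: stores (s, a, r, s_next, done) tuples
+
+def sample_batch():
+    """Uniformly sample past transitions to break temporal correlation."""
+    batch = random.sample(replay_buffer, BATCH_SIZE)
+    states, actions, rewards, next_states, dones = map(np.array, zip(*batch))
+    return states, actions, rewards, next_states, dones
+
+def td_targets(q_target_next, rewards, dones):
+    """Bellman targets y = r + gamma * max_a' Q_target(s', a'), using the frozen target network's Q-values."""
+    return rewards + GAMMA * (1.0 - dones) * q_target_next.max(axis=1)
+```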
-## Step 1: Installation
+## Model Preparation
+
+### Install Dependencies
```bash
git clone https://github.com/PaddlePaddle/PARL.git
@@ -15,29 +20,26 @@ pip3 install matplotlib
pip3 install urllib3==1.26.6
```
-## Step 2: Training
+## Model Training
```bash
-# 1 GPU
+# 1 GPU Training
python3 train.py
-```
-## Step 3: Evaluating
-
-```bash
+# Evaluation
mv ../../../evaluate.py ./
python3 evaluate.py
```
-## Result
+## Model Results
-Performance of DQN playing CartPole-v0
+Performance of DQN playing CartPole-v0.
-| GPUs | Reward |
-|---------|--------|
-| BI-V100 | 200.0 |
+| Model | GPU | Reward |
+|-------|---------|--------|
+| DQN | BI-V100 | 200.0 |
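+
+For context, CartPole-v0 episodes are capped at 200 steps, so 200.0 is the maximum undiscounted return. A minimal
+evaluation loop of the kind `evaluate.py` performs might look like the sketch below (the `agent` object and its greedy
+`act` method are placeholders rather than this example's actual API, and the classic `gym` step interface is assumed).
+
+```python
+import gym
+import numpy as np
+
+def evaluate(agent, episodes=5):
+    """Average undiscounted return over a few greedy episodes."""
+    env = gym.make("CartPole-v0")
+    returns = []
+    for _ in range(episodes):
+        obs, done, total = env.reset(), False, 0.0
+        while not done:
+            action = agent.act(obs)                  # greedy action from the trained Q-network (placeholder API)
+            obs, reward, done, _ = env.step(action)  # classic 4-tuple gym step API
+            total += reward
+        returns.append(total)
+    return float(np.mean(returns))
+```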
## Reference
- [PARL](https://github.com/PaddlePaddle/PARL)
-- [paper](http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html)
+- [Paper](http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html)