# PaddleOCR

**Repository Path**: superchao1982/PaddleOCR

## Basic Information

- **Project Name**: PaddleOCR
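- **Description**: PaddleOCR features:
    Ultra-lightweight Chinese OCR, with a total model size of only 8.6M
        A single model supports mixed Chinese, English, and digit recognition, vertical text recognition, and long text recognition
        Detection model DB (4.1M) + recognition model CRNN (4.5M)
    Multiple text detection training algorithms: EAST, DB
    Multiple text recognition training algorithms: Rosetta, CRNN, STAR-Net, RARE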
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 2
- **Created**: 2025-11-18
- **Last Updated**: 2025-11-18

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

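# 🚀 **PaddleOCR: A Complete Guide to the Optical Character Recognition Workflow**

> 🔍 An end-to-end OCR toolkit built on PaddlePaddle, supporting multilingual recognition, text detection, and text recognition

## 🛠️ **I. Environment Setup**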

### 📥 1. Install the PaddleOCR Framework
```bash
# Switch to the project directory
%cd ~/Project

# Clone the official PaddleOCR repository (release 2.1)
!git clone -b release/2.1 https://github.com/PaddlePaddle/PaddleOCR.git

# Create the inference model directory inside the repository,
# which is where the quick prediction step looks for the models
!mkdir -p PaddleOCR/inference

# Download and extract the ultra-lightweight Chinese OCR detection model
!cd PaddleOCR/inference && wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && rm ch_ppocr_mobile_v2.0_det_infer.tar

# Download and extract the ultra-lightweight Chinese OCR recognition model
!cd PaddleOCR/inference && wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && rm ch_ppocr_mobile_v2.0_rec_infer.tar

# Download and extract the ultra-lightweight Chinese OCR text direction classifier model
!cd PaddleOCR/inference && wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar && tar xf ch_ppocr_mobile_v2.0_cls_infer.tar && rm ch_ppocr_mobile_v2.0_cls_infer.tar
```

### 📦 2. Alternative Model Download
```bash
# 🎯 Alternative download path (makes sure the model files are complete)
%cd ~/Project/PaddleOCR
!mkdir -p inference
# Each `!` command runs in its own subshell, so `!cd` does not persist; use %cd instead
%cd inference

# Detection model
!wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && rm ch_ppocr_mobile_v2.0_det_infer.tar

# Recognition model
!wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && rm ch_ppocr_mobile_v2.0_rec_infer.tar

# Direction classifier model
!wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar && tar xf ch_ppocr_mobile_v2.0_cls_infer.tar && rm ch_ppocr_mobile_v2.0_cls_infer.tar
```
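The same download steps can also be written in plain Python, which is handy outside a notebook. The following is a minimal sketch, not part of PaddleOCR: the helper names `model_url` and `fetch_model` are hypothetical, the base URL and archive names are copied from the `wget` commands above, and it assumes each archive unpacks into a folder named after the archive.

```python
import tarfile
import urllib.request
from pathlib import Path

# Base URL and model names are taken from the wget commands above.
BASE_URL = "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch"
MODEL_NAMES = [
    "ch_ppocr_mobile_v2.0_det_infer",
    "ch_ppocr_mobile_v2.0_rec_infer",
    "ch_ppocr_mobile_v2.0_cls_infer",
]


def model_url(name):
    """Return the download URL of one inference-model archive."""
    return f"{BASE_URL}/{name}.tar"


def fetch_model(name, dest):
    """Download and unpack one model into dest, skipping it if already extracted."""
    dest = Path(dest)
    target = dest / name  # assumption: the archive unpacks into a folder of this name
    if target.is_dir():
        return target
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"{name}.tar"
    urllib.request.urlretrieve(model_url(name), archive)
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    archive.unlink()  # mirrors the `rm` step of the shell version
    return target
```

Calling `fetch_model(name, "inference")` for each entry in `MODEL_NAMES` from the repository root should reproduce the directory layout of the shell commands, and re-running it skips models that are already in place.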

### 🔧 3. Install Dependencies
```bash
# 📚 Install the packages PaddleOCR depends on
%cd ~/Project/PaddleOCR
!pip install -r requirements.txt -i https://mirror.baidu.com/pypi/simple
```

### 🎯 4. Quick Single-Image Prediction
```python
# 🔍 Quick OCR prediction demo
%cd ~/Project/PaddleOCR

import matplotlib.pyplot as plt
from PIL import Image
%matplotlib inline

def show_img(img_path, figsize=(10, 10)):
    """
    🖼️ Display an image.
    Args:
        img_path: path to the image file
        figsize: figure size in inches
    """
    img = Image.open(img_path)
    plt.figure("OCR test image", figsize=figsize)
    plt.imshow(img)
    plt.axis('off')
    plt.show()

# Show the test image
show_img("./doc/imgs/french_0.jpg")

# 🚀 Run OCR (model paths are relative to ~/Project/PaddleOCR;
# --use_angle_cls=True enables the direction classifier passed via --cls_model_dir)
!python3 tools/infer/predict_system.py \
    --image_dir="./doc/imgs/french_0.jpg" \
    --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer" \
    --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer" \
    --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer" \
    --use_angle_cls=True

print("✅ OCR complete! Results are saved in the ./inference_results folder")
```
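When the prediction script is called from a regular Python program rather than a notebook cell, it can help to assemble the command as an argument list and check the model folders first. The sketch below is one possible way to do that; the helper names (`missing_model_dirs`, `build_predict_command`, `run_predict`) are hypothetical, while the script path and flags are the ones used in the cell above.

```python
import subprocess
from pathlib import Path

# Folder names of the three inference models downloaded during setup.
MODEL_DIRS = {
    "det": "ch_ppocr_mobile_v2.0_det_infer",
    "rec": "ch_ppocr_mobile_v2.0_rec_infer",
    "cls": "ch_ppocr_mobile_v2.0_cls_infer",
}


def missing_model_dirs(model_root):
    """Return the model folders that are not present under model_root."""
    root = Path(model_root)
    return [name for name in MODEL_DIRS.values() if not (root / name).is_dir()]


def build_predict_command(image_path, model_root="inference"):
    """Assemble the predict_system.py call as an argument list (no shell quoting needed)."""
    root = Path(model_root)
    cmd = ["python3", "tools/infer/predict_system.py", f"--image_dir={image_path}"]
    for kind, name in MODEL_DIRS.items():
        cmd.append(f"--{kind}_model_dir={(root / name).as_posix()}")
    return cmd


def run_predict(image_path, model_root="inference", repo_dir="."):
    """Run the prediction script from repo_dir, failing early if a model is missing."""
    missing = missing_model_dirs(Path(repo_dir) / model_root)
    if missing:
        raise FileNotFoundError(f"missing model folders: {missing}")
    subprocess.run(build_predict_command(image_path, model_root), cwd=repo_dir, check=True)
```

For example, `run_predict("./doc/imgs/french_0.jpg", repo_dir="PaddleOCR")` would raise a clear error if a model archive was never extracted, instead of failing inside the script.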

## 🎯 **II. Training the Text Detection Model**

### 📋 Dataset Preparation Guide
[📁 Notes on preparing the dataset and configuration files](https://gitee.com/downeytian_jr/PaddleOCR/blob/master/help/rec_train.md) 🔗
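As a rough illustration of what a detection label file looks like, the sketch below assumes PaddleOCR's usual detection annotation format: one line per image, holding the image path, a tab, and a JSON list of boxes with `transcription` and `points` fields. The helper names and the sample values are hypothetical; check them against your own label files.

```python
import json


def format_det_label(image_path, boxes):
    """Serialise one image's annotations as: image path, tab, JSON list of boxes."""
    return f"{image_path}\t{json.dumps(boxes, ensure_ascii=False)}"


def parse_det_label(line):
    """Split a label line and check that every box has text and at least four points."""
    image_path, annotation = line.rstrip("\n").split("\t", 1)
    boxes = json.loads(annotation)
    for box in boxes:
        if "transcription" not in box or len(box.get("points", [])) < 4:
            raise ValueError(f"malformed box for {image_path}: {box}")
    return image_path, boxes


# Hypothetical sample: one quadrilateral, corners listed clockwise from the top-left.
sample = format_det_label(
    "img_001.jpg",
    [{"transcription": "205/55R16", "points": [[10, 20], [110, 20], [110, 50], [10, 50]]}],
)
path, boxes = parse_det_label(sample)
print(path, len(boxes))
```

Running `parse_det_label` over every line of `train_label.txt` and `test_label.txt` before training is a cheap way to catch malformed annotations early.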

### 🏋️‍♂️ 1. Download the Pretrained Models
```python
# 📥 Download pretrained backbone weights
%cd ~/Project/PaddleOCR

# MobileNetV3 pretrained model
!wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV3_large_x0_5_pretrained.tar
!cd pretrain_models/ && tar xf MobileNetV3_large_x0_5_pretrained.tar

# ResNet50 pretrained model
!wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_pretrained.tar
!cd pretrain_models/ && tar xf ResNet50_vd_ssld_pretrained.tar

print("🎉 Pretrained models downloaded!")
```

### 🚀 2. Train the Text Detection Model
```python
# 🔥 Train the DB text detection model with a MobileNetV3 backbone
%cd ~/Project

!python3 PaddleOCR/tools/train.py -c PaddleOCR/configs/det/det_mv3_db.yml -o \
    Global.epoch_num=1200 \
    Global.eval_batch_step="[0,500]" \
    Global.save_epoch_step=10 \
    Global.save_model_dir="./output/Tyre_Defects_detection/det" \
    Global.save_res_path="./output/Tyre_Defects_detection/det/predicts_db.txt" \
    Train.dataset.data_dir='./train_data/Tyre_Defects_detection/' \
    Train.dataset.label_file_list=['./train_data/Tyre_Defects_detection/train_label.txt'] \
    Eval.dataset.data_dir='./train_data/Tyre_Defects_detection/' \
    Eval.dataset.label_file_list=['./train_data/Tyre_Defects_detection/test_label.txt']

print("🎊 Text detection model training complete!")
```
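The long list of `-o` overrides is easy to mistype, and the bracketed list values depend on shell quoting. One option is to keep the overrides in a Python dict and render them into `KEY=VALUE` tokens. This is a minimal sketch with a hypothetical helper, `build_overrides`; it assumes that `train.py` parses each value as YAML, so a list is written in flow style such as `[0,500]`.

```python
def build_overrides(options):
    """Turn a flat dict into the KEY=VALUE tokens that follow `-o`."""
    tokens = []
    for key, value in options.items():
        if isinstance(value, (list, tuple)):
            # Assumption: list values are read as YAML flow sequences, e.g. [0,500]
            rendered = "[" + ",".join(str(item) for item in value) + "]"
        else:
            rendered = str(value)
        tokens.append(f"{key}={rendered}")
    return tokens


overrides = build_overrides({
    "Global.epoch_num": 1200,
    "Global.eval_batch_step": [0, 500],
    "Train.dataset.label_file_list": ["./train_data/Tyre_Defects_detection/train_label.txt"],
})
command = ["python3", "PaddleOCR/tools/train.py",
           "-c", "PaddleOCR/configs/det/det_mv3_db.yml", "-o", *overrides]
print(" ".join(command))
```

Passing `command` to `subprocess.run(command, check=True)` hands each token to the script unchanged, so no shell escaping of brackets or quotes is involved.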

### 🔍 3. Text Detection Inference
```python
# 🔮 Detect text regions with the trained detection model
%cd ~/Project

# Show the test image
show_img("./test/test2.jpg")

# Run text detection
!python3 PaddleOCR/tools/infer_det.py -c PaddleOCR/configs/det/det_mv3_db.yml -o \
    Global.infer_img="./test/test2.jpg" \
    Global.checkpoints="./output/Tyre_Defects_detection/det/latest"

print("✅ Text detection complete!")
```
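`Global.checkpoints` takes a checkpoint prefix such as `.../det/latest` rather than a file name, so a typo in the path only surfaces once the script starts. The sketch below checks for a usable checkpoint beforehand. The helper names are hypothetical, and it assumes that training writes the weights as `<prefix>.pdparams` and also keeps a `best_accuracy` checkpoint next to `latest`; verify both against your output folder.

```python
from pathlib import Path


def checkpoint_ready(prefix):
    """Return True if the weights file for a checkpoint prefix exists."""
    # Assumption: weights are stored as '<prefix>.pdparams'
    return Path(f"{prefix}.pdparams").is_file()


def pick_checkpoint(save_dir, candidates=("best_accuracy", "latest")):
    """Return the first checkpoint prefix found in save_dir, or None if there is none."""
    for name in candidates:
        prefix = Path(save_dir) / name
        if checkpoint_ready(prefix):
            return prefix.as_posix()
    return None


prefix = pick_checkpoint("./output/Tyre_Defects_detection/det")
print(prefix if prefix else "no checkpoint found yet")
```

The returned prefix can be passed directly as the value of `Global.checkpoints`.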

## ✍️ **III. Training the Text Recognition Model**

### 📋 Dataset Preparation Guide
[📁 Notes on preparing the dataset and configuration files](https://gitee.com/downeytian_jr/PaddleOCR/blob/master/help/det_train.md) 🔗
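For orientation, a recognition label file is typically much simpler than a detection one. The sketch below assumes the common PaddleOCR layout of one sample per line, with the image path and its text separated by a tab. The helper names and sample lines are hypothetical.

```python
def parse_rec_labels(lines):
    """Parse 'image path<TAB>text' lines into (path, text) pairs, skipping blank lines."""
    samples = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue
        parts = line.split("\t", 1)
        if len(parts) != 2 or not parts[1]:
            raise ValueError(f"line {number}: expected '<image path><TAB><text>'")
        samples.append((parts[0], parts[1]))
    return samples


def charset(samples):
    """Collect the distinct characters used in the labels, sorted."""
    return sorted(set("".join(text for _, text in samples)))


# Hypothetical label lines; in practice, read them from rec_gt_train.txt.
samples = parse_rec_labels(["a.jpg\tAB12\n", "\n", "b.jpg\tB2C\n"])
print(len(samples), "".join(charset(samples)))
```

Comparing the output of `charset` with the character dictionary used for training is a quick way to spot labels containing characters the model cannot predict.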

### 📥 1. Download the Pretrained Recognition Model
```python
# 🎯 Download the pretrained text recognition model
!wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar
!cd pretrain_models && tar -xf rec_mv3_none_bilstm_ctc.tar && rm -rf rec_mv3_none_bilstm_ctc.tar

print("🎉 Pretrained text recognition model downloaded!")
```

### 🚀 2. Train the Text Recognition Model
```python
# 🔥 Train the text recognition model, starting from the ICDAR15 recognition config
%cd ~/Project

!python3 PaddleOCR/tools/train.py -c PaddleOCR/configs/rec/rec_icdar15_train.yml -o \
    Global.eval_batch_step="[0,500]" \
    Global.save_epoch_step=10 \
    Global.save_model_dir='./output/Tyre_Defects_detection/rec/' \
    Train.dataset.data_dir='./train_data/Tyre_Defects_detection' \
    Train.dataset.label_file_list=['./train_data/Tyre_Defects_detection/rec_gt_train.txt'] \
    Eval.dataset.data_dir='./train_data/Tyre_Defects_detection' \
    Eval.dataset.label_file_list=['./train_data/Tyre_Defects_detection/rec_gt_test.txt'] \
    Optimizer.lr.learning_rate=0.0001

print("🎊 Text recognition model training complete!")
```

### 🔍 3. Text Recognition Inference
```python
# 🔮 Recognize text with the trained recognition model
%cd ~/Project

# Show the test image
show_img("./test/test1.jpg")

# Run text recognition
!python3 PaddleOCR/tools/infer_rec.py -c PaddleOCR/configs/rec/rec_icdar15_train.yml -o \
    Global.infer_img="./test/test1.jpg" \
    Global.checkpoints="./output/Tyre_Defects_detection/rec/latest"

print("✅ Text recognition complete!")
```

---

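## 💡 **Tips and Best Practices**

### 🎯 **Model Selection**
- **📱 Mobile deployment**: use the ch_ppocr_mobile series (lightweight)
- **💻 Server deployment**: use the ch_ppocr_server series (higher accuracy)
- **🔄 Custom training**: fine-tune from a pretrained model

### ⚡ **Performance Optimization**
- **🎛️ Batch inference**: process multiple images per batch to improve throughput
- **🔧 Model quantization**: compress the model with PaddleSlim
- **🚀 Multithreading**: enable multiple threads to speed up data preprocessing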
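
The batch inference tip can be sketched independently of any OCR API: collect the image files in a folder and split them into fixed-size groups, then hand each group to whatever prediction call you use. The helper names and the suffix list below are illustrative assumptions.

```python
from pathlib import Path

# Assumed set of image extensions; adjust it to your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(folder):
    """Return the image files directly inside folder, sorted by name."""
    return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def chunk(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Example with placeholder names instead of real files:
for batch in chunk(["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"], 2):
    print(batch)
```

A smaller `batch_size` also lowers peak memory use when memory is tight.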

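### 🛠️ **Troubleshooting**
- **📊 Out of memory**: reduce batch_size or the input image size
- **🔍 Inaccurate recognition**: check the quality of the training data and the accuracy of the annotations
- **⚙️ Environment configuration**: make sure the CUDA and cuDNN versions are compatible

### 🌟 **Extended Features**
- **🌍 Multilingual support**: recognition for 80+ languages
- **📐 Table recognition**: a dedicated table structure recognition model
- **🎨 Formula recognition**: recognition of LaTeX math formulas
- **🖼️ Layout analysis**: detection of document layout elements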

---

<div align="center">

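## 🎉 **Congratulations! You have worked through the full PaddleOCR development workflow!**

**From environment setup to custom training, you can now:**
- ✅ Deploy an OCR service quickly
- ✅ Train domain-specific models
- ✅ Tune recognition accuracy and speed
- ✅ Integrate OCR into real applications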

</div>

---

**📚 More resources:**
- [Official PaddleOCR documentation](https://github.com/PaddlePaddle/PaddleOCR)
- [Model zoo and pretrained weights](https://paddleocr.bj.bcebos.com/)
- [Community forum and technical support](https://ai.baidu.com/forum/topic/list/168)

**🔄 Continuously updated...**