# Draft **Repository Path**: cuicheng01/draft ## Basic Information - **Project Name**: Draft - **Description**: No description available - **Primary Language**: Unknown - **License**: Apache-2.0 - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 1 - **Created**: 2025-05-19 - **Last Updated**: 2025-05-19 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README

PaddleOCR Banner

English | [简体中文](./readme_c.md)| [日本語](./README_ja.md) [![stars](https://img.shields.io/github/stars/PaddlePaddle/PaddleOCR?color=ccf)](https://github.com/PaddlePaddle/PaddleOCR) [![license](https://img.shields.io/badge/License-Apache%202-dfd)](./LICENSE) [![Downloads](https://img.shields.io/pypi/dm/paddleocr)](https://pypi.org/project/PaddleOCR/) [![Discord](https://img.shields.io/badge/Chat-on%20discord-7289da.svg?sanitize=true)](https://discord.gg/z9xaRVjdbD) [![X (formerly Twitter) URL](https://img.shields.io/twitter/follow/PaddlePaddle)](https://x.com/PaddlePaddle) ![python](https://img.shields.io/badge/python-3.8+-aff.svg) ![os](https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg) [![Website](https://img.shields.io/badge/Website-PaddleOCR-blue?logo=)](https://www.paddleocr.ai/) [![AI Studio](https://img.shields.io/badge/Demo-AI%20Studio-green)](https://aistudio.baidu.com/community/app/91660/webUI) [![HuggingFace](https://img.shields.io/badge/Demo_on_HuggingFace-yellow.svg?logo=&labelColor=white)](https://huggingface.co/spaces/PaddlePaddle/PaddleOCR) [![ModelScope](https://img.shields.io/badge/Demo_on_ModelScope-purple?logo=&labelColor=white)](https://www.modelscope.cn/organization/PaddlePaddle) [![Paper](https://img.shields.io/badge/Paper-arXiv-green)](https://arxiv.org/pdf/2206.03001)

## 🚀 Introduction

Built on years of foundational research and real-world industry practice, PaddleOCR offers state-of-the-art solutions including the [PP-OCR](https://github.com/PaddlePaddle/PaddleOCR/blob/v2.7.0/doc/doc_ch/ppocr_introduction.md) series of models, the document parsing system [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/v2.7.0/ppstructure/README_ch.md), and the key information extraction tool [PP-ChatOCR](https://aistudio.baidu.com/aistudio/projectdetail/6488689), all powered by [PaddlePaddle](https://github.com/PaddlePaddle/Paddle). Our models and tools are continuously updated to ensure **high accuracy**, **flexibility**, and **ease of use**. Additionally, users can annotate their own images using [PPOCRLabelv2](https://github.com/PFCCLab/PPOCRLabel) and [fine-tune](https://github.com/PaddlePaddle/PaddleOCR/blob/v2.7.0/doc/doc_ch/finetune.md) models with just a single command.

PaddleOCR Demo

You can jump straight to the [Quick Start](#-quick-start), find comprehensive documentation in the [PaddleOCR Docs](https://paddlepaddle.github.io/PaddleOCR/main/index.html), get support via [GitHub Issues](https://github.com/PaddlePaddle/PaddleOCR/issues), and explore our [OCR courses on AIStudio](https://aistudio.baidu.com/course/introduce/25207).

## 🌐 Architecture Overview

**PaddleOCR** is a modular OCR toolkit that offers ready-to-use models and solutions for OCR and document parsing. The latest offerings include:

- 🖼️[PP-OCRv5](): High-Precision Text Recognition Model for All Scenarios - Instant Text from Images/PDFs.
- 🧮[PP-StructureV3](): High-Precision Document Parsing Solution - Unleash SOTA Images/PDFs Parsing for Real-World Scenarios!
- 📈[PP-ChatOCRv4](): Intelligent Key Information Extraction Solution - Extract Key Information, not just text, from Images/PDFs.

PaddleOCR Architecture

## 📣 Recent updates

🔥🔥2025.05.30: Release of **PaddleOCR v3.0**, including:

- **PP-OCRv5**: High-Precision Text Recognition Model for All Scenarios - Instant Text from Images/PDFs.
    1. 🌐 Simultaneous support for **5** types of text - Seamlessly process **Simplified Chinese, Traditional Chinese, Simplified Chinese Pinyin, English** and **Japanese** within a single model.
    2. 🎯 Elevated overall text recognition accuracy - Achieves SOTA precision across diverse use cases.
    3. ✍️ Revolutionized **handwritten text recognition** - Delivers breakthrough performance for irregular, cursive, and complex scripts.

- **PP-StructureV3**: High-Precision Document Parsing Solution - Unleash SOTA Images/PDFs Parsing for Real-World Scenarios!
    1. 🧮 Enables multi-scenario high-precision PDF parsing, achieving SOTA accuracy on the OmniDocBench benchmark among open-source solutions.
    2. ⚡ Supports **multi-GPU parallel inference** and multi-GPU instance service deployment:
        - High-precision configuration achieves **XXX** QPS on 4×V100 GPUs.
        - High-efficiency configuration achieves **XXX** QPS on 4×V100 GPUs.
    3. 🧠 Advanced capabilities include **seal recognition, chart-to-table conversion, table recognition with nested formulas/images, vertical text document parsing, and complex table structure analysis**.

- **PP-ChatOCRv4**: Intelligent Key Information Extraction Solution - Extract Key Information, not just text, from Images/PDFs.
    1. 🔥 Delivers high-accuracy key information extraction from document files, including PDF and PNG formats, surpassing PP-ChatOCRv3 by **15.7%** in accuracy.
    2. 🤝 Integrates with [PP-DocBeeV2](https://github.com/PaddlePaddle/PaddleMIX/tree/develop/paddlemix/examples/ppdocbee) to support extracting key information from charts and images within documents.
    3. 💻 Supports local offline deployment of LLMs/MLLMs, and allows large language models deployed via tools such as [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP), Ollama, and vLLM to be integrated into PP-ChatOCRv4.

The history of updates

- 🔥🔥2025.03.07: Release of **PaddleOCR v2.10**, including:
    - **12 new self-developed models:**
        - **[Layout Detection series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/layout_detection.html)** (3 models): PP-DocLayout-L, M, and S. These models detect 23 common layout types across diverse document formats (papers, reports, exams, books, magazines, contracts, etc.) in English and Chinese. They achieve up to **90.4% mAP@0.5**, and the lightweight variant can process over 100 pages per second.
        - **[Formula Recognition series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/formula_recognition.html)** (2 models): PP-FormulaNet-L and S. These models support recognition of 50,000+ LaTeX expressions, handling both printed and handwritten formulas. PP-FormulaNet-L offers **6% higher accuracy** than comparable models; PP-FormulaNet-S is 16x faster while maintaining similar accuracy.
        - **[Table Structure Recognition series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_structure_recognition.html)** (2 models): SLANeXt_wired and SLANeXt_wireless. These newly developed models deliver a **6% accuracy improvement** over SLANet_plus in complex table recognition.
        - **[Table Classification](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_classification.html)** (1 model): PP-LCNet_x1_0_table_cls, an ultra-lightweight classifier for wired and wireless tables.

[Learn more](https://paddlepaddle.github.io/PaddleOCR/latest/en/update.html)

## ⚡ Quick Start

### 1. Run online demo without installation

[![Website](https://img.shields.io/badge/Website-PaddleOCR-blue?logo=)](https://www.paddleocr.ai/) [![AI Studio](https://img.shields.io/badge/Demo-AI%20Studio-green)](https://aistudio.baidu.com/community/app/91660/webUI) [![HuggingFace](https://img.shields.io/badge/Demo_on_HuggingFace-yellow.svg?logo=&labelColor=white)](https://huggingface.co/spaces/PaddlePaddle/PaddleOCR) [![ModelScope](https://img.shields.io/badge/Demo_on_ModelScope-purple?logo=&labelColor=white)](https://www.modelscope.cn/organization/PaddlePaddle)

### 2. Installation

First, install PaddlePaddle by following the official [Installation Guide](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/develop/install/pip/linux-pip_en.html). Then install the PaddleOCR toolkit.

#### 2.1 CPU Environment

```bash
# 1. Install PaddlePaddle
pip install paddlepaddle

# 2. Install PaddleOCR
pip install paddleocr

# 3. Self-check after installation is complete
paddleocr --version
```

#### 2.2 NVIDIA GPU Environment

```bash
# 1. Install the CUDA 11.8 version of paddlepaddle-gpu
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

# Or install the CUDA 12.6 version of paddlepaddle-gpu
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/

# 2. Install PaddleOCR
pip install paddleocr

# 3. Self-check after installation is complete
paddleocr --version
```

#### 2.3 More AI Accelerators

[Huawei Ascend](README_en.md) | [Kunlunxin](README.md) | More to be added

### 3. Run inference by CLI

```bash
# Run PP-OCRv5 inference
paddleocr ocr -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png

# Run PP-StructureV3 inference
paddleocr PP-StructureV3 -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_structure_v3_demo.png

# Run PP-ChatOCRv4 inference
# (the key 驾驶室准乘人数 means "number of passengers permitted in the cab")
paddleocr pp_chatocrv4_doc -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png -k 驾驶室准乘人数 --qianfan_api_key your_api_key

# Get more information about "paddleocr ocr"
paddleocr ocr --help
```

### 4. Run inference by API

#### 4.1 PP-OCRv5 Example

```python
from paddleocr import PaddleOCR

# Initialize PaddleOCR instance
ocr = PaddleOCR()

# Run OCR inference on a sample image
result = ocr.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png")

# Visualize the results and save the JSON results
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")
```
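
Once results have been saved, a common next step is to keep only the text lines recognized with high confidence. The sketch below is illustrative only: it assumes a result can be represented as a dictionary with parallel `rec_texts` and `rec_scores` lists, which is an assumption about the saved JSON layout that you should verify against the files in `output`. The helper name `confident_lines`, the threshold, and the sample data are all hypothetical.

```python
from typing import Any, Dict, List


def confident_lines(result: Dict[str, Any], min_score: float = 0.8) -> List[str]:
    """Return the text lines whose recognition score is at least min_score.

    Assumes `result` holds parallel lists under "rec_texts" and "rec_scores"
    (an assumed layout; check it against your own saved JSON first).
    """
    texts = result.get("rec_texts", [])
    scores = result.get("rec_scores", [])
    return [text for text, score in zip(texts, scores) if score >= min_score]


# Hand-written sample data, not real model output.
sample = {
    "rec_texts": ["Boarding Pass", "GATE 12", "???"],
    "rec_scores": [0.98, 0.91, 0.42],
}
print(confident_lines(sample))
```

Keeping the filter as a small pure function makes it easy to apply to each saved result in turn, whatever the threshold you settle on.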

#### 4.2 PP-StructureV3 Example

```python
from pathlib import Path
from paddleocr import PPStructureV3

pipeline = PPStructureV3()

# For Image
output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_structure_v3_demo.png")

# Visualize the results and save the JSON results
for res in output:
    res.print()
    res.save_to_json(save_path="output")
    res.save_to_markdown(save_path="output")

# For PDF File
input_file = "./your_pdf_file.pdf"
output_path = Path("./output")

output = pipeline.predict(input_file)

markdown_list = []
markdown_images = []

for res in output:
    md_info = res.markdown
    markdown_list.append(md_info)
    markdown_images.append(md_info.get("markdown_images", {}))

markdown_texts = pipeline.concatenate_markdown_pages(markdown_list)

mkd_file_path = output_path / f"{Path(input_file).stem}.md"
mkd_file_path.parent.mkdir(parents=True, exist_ok=True)

with open(mkd_file_path, "w", encoding="utf-8") as f:
    f.write(markdown_texts)

for item in markdown_images:
    if item:
        for path, image in item.items():
            file_path = output_path / path
            file_path.parent.mkdir(parents=True, exist_ok=True)
            image.save(file_path)
```
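
The example above parses one image and one PDF. To process a whole folder, you first need a list of input files. The following is a minimal sketch using only the standard library; the helper name `collect_inputs` and the set of file suffixes are assumptions chosen for illustration, not part of PaddleOCR.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

# Assumed set of suffixes for illustration; adjust to the formats you use.
SUPPORTED_SUFFIXES = (".pdf", ".png", ".jpg", ".jpeg")


def collect_inputs(root: Path, suffixes: Iterable[str] = SUPPORTED_SUFFIXES) -> List[Path]:
    """Recursively collect files under root whose suffix matches, in sorted order."""
    wanted = {suffix.lower() for suffix in suffixes}
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in wanted
    )


# Demonstration on a throwaway directory containing empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "scans").mkdir()
    for name in ("b.pdf", "a.PNG", "notes.txt", "scans/c.jpg"):
        (root / name).write_bytes(b"")
    found = [path.relative_to(root).as_posix() for path in collect_inputs(root)]

print(found)
```

Each collected path could then be passed as a string to `pipeline.predict(...)` in a loop, in the same way `input_file` is used above.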

#### 4.3 PP-ChatOCRv4 Example

```python
from paddleocr import PPChatOCRv4Doc

chat_bot_config = {
    "module_name": "chat_bot",
    "model_name": "ernie-3.5-8k",
    "base_url": "https://qianfan.baidubce.com/v2",
    "api_type": "openai",
    "api_key": "api_key",  # your api_key
}

retriever_config = {
    "module_name": "retriever",
    "model_name": "embedding-v1",
    "base_url": "https://qianfan.baidubce.com/v2",
    "api_type": "qianfan",
    "api_key": "api_key",  # your api_key
}

mllm_chat_bot_config = {
    "module_name": "chat_bot",
    "model_name": "PP-DocBee",
    "base_url": "http://127.0.0.1:8080/",  # your local mllm service url
    "api_type": "openai",
    "api_key": "api_key",  # your api_key
}

pipeline = PPChatOCRv4Doc()

visual_predict_res = pipeline.visual_predict(
    input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_common_ocr=True,
    use_seal_recognition=True,
    use_table_recognition=True,
)

visual_info_list = []
for res in visual_predict_res:
    visual_info_list.append(res["visual_info"])
    layout_parsing_result = res["layout_parsing_result"]

vector_info = pipeline.build_vector(
    visual_info_list, flag_save_bytes_vector=True, retriever_config=retriever_config
)

mllm_predict_res = pipeline.mllm_pred(
    input="vehicle_certificate-1.png",
    key_list=["驾驶室准乘人数"],
    mllm_chat_bot_config=mllm_chat_bot_config,
)
mllm_predict_info = mllm_predict_res["mllm_res"]

chat_result = pipeline.chat(
    key_list=["驾驶室准乘人数"],
    visual_info=visual_info_list,
    vector_info=vector_info,
    mllm_predict_info=mllm_predict_info,
    chat_bot_config=chat_bot_config,
    retriever_config=retriever_config,
)
print(chat_result)
```
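
The configuration dictionaries above embed the API key as a string literal. One way to keep the key out of source code is to read it from an environment variable. This is a minimal sketch: the helper name `chat_bot_config_from_env` and the variable name `QIANFAN_API_KEY` are hypothetical choices, while the remaining fields are copied from the `chat_bot_config` example.

```python
import os
from typing import Dict, Mapping


def chat_bot_config_from_env(
    env: Mapping[str, str] = os.environ,
    var_name: str = "QIANFAN_API_KEY",  # hypothetical variable name
) -> Dict[str, str]:
    """Build a chat_bot config dict, taking the API key from the environment."""
    api_key = env.get(var_name)
    if not api_key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return {
        "module_name": "chat_bot",
        "model_name": "ernie-3.5-8k",
        "base_url": "https://qianfan.baidubce.com/v2",
        "api_type": "openai",
        "api_key": api_key,
    }


# A plain dict stands in for the real environment in this demonstration.
config = chat_bot_config_from_env({"QIANFAN_API_KEY": "demo-key"})
print(config["model_name"])
```

Accepting the environment as a parameter keeps the helper easy to test; in normal use you would call it with no arguments so that it reads `os.environ`.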

## 📚 Get Started From OCR Courses

- [AI Fast Track 2020: PaddleOCR (AI快车道2020-PaddleOCR)](https://aistudio.baidu.com/course/introduce/1519)

## 😃 Awesome Projects Leveraging PaddleOCR

💗 PaddleOCR wouldn’t be where it is today without its incredible community! A massive 🙌 thank you 🙌 to all our longtime partners, new collaborators, and everyone who’s poured their passion into PaddleOCR, whether we’ve named you or not. Your support fuels our fire! 🔥

| Project Name | Description |
| ------------ | ----------- |
| [RAGFlow](https://github.com/infiniflow/ragflow) | RAG engine based on deep document understanding. |
| [MinerU](https://github.com/opendatalab/MinerU) | Multi-type document to Markdown conversion tool. |
| [Umi-OCR](https://github.com/hiroi-sora/Umi-OCR) | Free, open-source, batch offline OCR software. |
| [OmniParser](https://github.com/microsoft/OmniParser) | Screen parsing tool for pure-vision-based GUI agents. |
| [QAnything](https://github.com/netease-youdao/QAnything) | Question and answer based on anything. |
| [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) | A powerful open-source toolkit designed to efficiently extract high-quality content from complex and diverse PDF documents. |
| [Dango-Translator](https://github.com/PantsuDango/Dango-Translator) | Recognizes text on the screen, translates it, and shows the translation results in real time. |
| [Learn more projects](./awesome_projects.md) | [More projects based on PaddleOCR](./awesome_projects.md) |

## 👩‍👩‍👧‍👦 Community

* 👫 Join the [PaddlePaddle Community](https://github.com/PaddlePaddle/community), where you can engage with [PaddlePaddle developers](https://www.paddlepaddle.org.cn/developercommunity), researchers, and enthusiasts from around the world.
* 🎓 Learn from experts through workshops, tutorials, and Q&A sessions [hosted by AI Studio](https://aistudio.baidu.com/learn/center).
* 🏆 Participate in [hackathons, challenges, and competitions](https://aistudio.baidu.com/competition) to showcase your skills and win exciting prizes.
* 📣 Stay updated with the latest news, announcements, and events by following our [Twitter](https://x.com/PaddlePaddle) and [WeChat](https://mp.weixin.qq.com/s/MAdo7fZ6dfeGcCQUtRP2ag).

Let’s build the future of AI together! 🚀

## 📄 License

This project is released under the [Apache License Version 2.0](./LICENSE).