# PaddleOCR2Pytorch

**Repository Path**: dlml2/PaddleOCR2Pytorch

## Basic Information

- **Project Name**: PaddleOCR2Pytorch
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-08-15
- **Last Updated**: 2025-08-20

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# [PaddleOCR2Pytorch](https://github.com/frotms/PaddleOCR2Pytorch)

English | [简体中文](README.md)

## Introduction

Converting PaddleOCR to PyTorch.

This repository aims to

- learn PaddleOCR
- use models in PyTorch that were trained in Paddle
- give a guideline for Paddle2PyTorch

## Notice

`PytorchOCR` models are converted from `>= PaddleOCRv2.0`.

**Recent updates**

- 2025.05.25 **[PP-OCRv5](https://github.com/PaddlePaddle/PaddleOCR/blob/main/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.md)**: High-Accuracy Text Recognition Model for All Scenarios - Instant Text from Images/PDFs.
    1. 🌐 Single-model support for **five** text types - Seamlessly process **Simplified Chinese, Traditional Chinese, Simplified Chinese Pinyin, English** and **Japanese** within a single model.
    2. ✍️ Improved **handwriting recognition**: Significantly better at complex cursive scripts and non-standard handwriting.
    3. 🎯 **13-point accuracy gain** over PP-OCRv4, achieving state-of-the-art performance across a variety of real-world scenarios.
- 2024.02.20 [PP-OCRv4](./doc/doc_ch/PP-OCRv4_introduction.md), with mobile and server versions
    - PP-OCRv4-mobile: At comparable speed, accuracy on Chinese scenes improves by 4.5% over PP-OCRv3, accuracy on English scenes improves by 10%, and the average recognition accuracy of the 80-language multilingual model improves by more than 8%.
    - PP-OCRv4-server: The highest-accuracy OCR model at the time of release. Detection accuracy improves by 4.9% on Chinese and English scenes, and recognition accuracy improves by 2%.
- 2023.04.16 Handwritten Mathematical Expression Recognition [CAN](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_can_en.md)
- 2023.04.07 Image Super-Resolution [Text Telescope](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_sr_telescope_en.md)
- 2022.10.17 Text Recognition: [ViTSTR](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_vitstr_en.md)
- 2022.10.07 Text Detection: [DB++](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_det_db_en.md)
- 2022.07.24 text detection algorithm (FCENet)
- 2022.07.16 text recognition algorithm (SVTR)
- 2022.06.19 text recognition algorithm (SAR)
- 2022.05.29 [PP-OCRv3](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.5/doc/doc_en/ppocr_introduction_en.md#pp-ocrv3): At comparable speed, accuracy on Chinese scenes improves by a further 5% over PP-OCRv2, accuracy on English scenes improves by 11%, and the average recognition accuracy of the 80-language multilingual models improves by more than 5%.
- 2022.05.14 PP-OCRv3 text detection model
- 2022.04.17 1 text recognition algorithm (NRTR)
- 2022.03.20 1 text detection algorithm (PSENet)
- 2021.09.11 PP-OCRv2. The inference speed of PP-OCRv2 is 220% higher than that of PP-OCR server on CPU. The F-score of PP-OCRv2 is 7% higher than that of PP-OCR mobile.
- 2021.06.01 update SRN
- 2021.04.25 update AAAI 2021 end-to-end algorithm PGNet
- 2021.04.24 update RARE
- 2021.04.12 update STARNET
- 2021.04.08 update DB, SAST, EAST, ROSETTA, CRNN
- 2021.04.03 update more than 25 multilingual recognition models ([models list](./doc/doc_en/models_list_en.md)), including English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, Arabic, and so on. Models for more languages will continue to be added ([Develop Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048)).
- 2021.01.10 upload Chinese and English general OCR models.

## Features

- PTOCR series of high-quality pre-trained models, comparable to commercial products
- Ultra lightweight PP-OCR series models: detection + direction classifier + recognition
- Ultra lightweight ptocr_mobile series models
- General ptocr_server series models
- Support Chinese, English, and digit recognition, vertical text recognition, and long text recognition
- Support multi-language recognition: Korean, Japanese, German, French, etc.

## [Model List](./doc/doc_en/models_list_en.md) (updating)

PyTorch models in BaiduPan: https://pan.baidu.com/s/1r1DELT8BlgxeOP2RqREJEg code: 6clx

PaddleOCR models in BaiduPan: https://pan.baidu.com/s/1getAprT2l_JqwhjwML0g9g code: lmv7

If you want more models, including multilingual models, please refer to [PTOCR series](./doc/doc_en/models_list_en.md).

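
Moving weights between the Paddle and PyTorch model sets listed above involves a few well-known layout differences. The sketch below is a minimal illustration under stated assumptions, not this repository's converter: it assumes Paddle stores BatchNorm statistics as `_mean` / `_variance` (PyTorch uses `running_mean` / `running_var`) and stores fully connected weights as `[in_features, out_features]` (PyTorch uses `[out_features, in_features]`). The function names, the `linear_weight_names` parameter, and the use of plain nested lists in place of real tensors are all hypothetical choices made to keep the example dependency-free.

```python
from typing import Dict, Iterable, List


def paddle_to_torch_name(name: str) -> str:
    """Rename Paddle BatchNorm statistics to their PyTorch buffer names."""
    if name.endswith("._mean"):
        return name[: -len("._mean")] + ".running_mean"
    if name.endswith("._variance"):
        return name[: -len("._variance")] + ".running_var"
    return name


def transpose_2d(matrix: List[List[float]]) -> List[List[float]]:
    """Transpose a 2-D weight stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def convert_state_dict(
    paddle_state: Dict[str, list],
    linear_weight_names: Iterable[str],
) -> Dict[str, list]:
    """Return a new dict with PyTorch-style names and layouts.

    `linear_weight_names` (hypothetical parameter) lists the Paddle keys
    that hold fully connected weights; only those are transposed, since a
    key name alone does not reveal the layer type.
    """
    linear = set(linear_weight_names)
    torch_state = {}
    for name, value in paddle_state.items():
        if name in linear:
            # [in_features, out_features] -> [out_features, in_features]
            value = transpose_2d(value)
        torch_state[paddle_to_torch_name(name)] = value
    return torch_state


if __name__ == "__main__":
    # Toy "checkpoint": one BatchNorm layer and one 2-in / 3-out linear layer.
    toy_state = {
        "bn.weight": [1.0],
        "bn._mean": [0.0],
        "bn._variance": [1.0],
        "fc.weight": [[1, 2, 3], [4, 5, 6]],
        "fc.bias": [0, 0, 0],
    }
    converted = convert_state_dict(toy_state, ["fc.weight"])
    for key, value in converted.items():
        print(key, value)
```

In a real conversion the values would be arrays read from a Paddle checkpoint and turned into tensors before calling `load_state_dict` on the PyTorch model. PyTorch BatchNorm layers also carry a `num_batches_tracked` buffer with no Paddle counterpart, so loading may need `strict=False` or an explicit default for that key.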
## Tutorials

- [Installation](./doc/doc_en/installation_en.md)
- [Inferences](./doc/doc_en/inference_en.md)
- [PP-OCR Pipeline](#PP-OCR-Pipeline)
- [Visualization](#Visualization)
- [Reference documents](./doc/doc_en/reference_en.md)
- [FAQ](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_en/FAQ_en.md)
- [References](#References)

## TODO

- [ ] PP-OCRv5: [Document Image Orientation Classification Module: PP-LCNet_x1_0_doc_ori](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/module_usage/doc_img_orientation_classification.html), [Text Image Rectification Module: UVDoc](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/module_usage/text_image_unwarping.html), [Text Line Orientation Classification Module: PP-LCNet_x0_25_textline_ori](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/module_usage/text_line_orientation_classification.html)
- [ ] [General Document-Parsing Solution](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/PP-StructureV3.html) [PP-StructureV3](./docs/version3.x/algorithm/PP-StructureV3/PP-StructureV3.en.md): Delivers high-precision parsing of multi-layout, multi-scene PDFs, outperforming many open- and closed-source solutions on public benchmarks.
- [ ] [Intelligent Document-Understanding Solution](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/PP-ChatOCRv4.html) [PP-ChatOCRv4](./docs/version3.x/algorithm/PP-ChatOCRv4/PP-ChatOCRv4.en.md): Natively powered by the WenXin large model 4.5T, achieving 15 percentage points higher accuracy than its predecessor.
- [ ] Add implementation of [cutting-edge algorithms](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_overview_en.md): Text Detection [DRRG](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_overview_en.md), Text Recognition [RFL](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_rfl_en.md)
- [ ] Text Recognition: [ABINet](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_abinet_en.md), [VisionLAN](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_visionlan_en.md), [SPIN](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_spin_en.md), [RobustScanner](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_robustscanner_en.md)
- [ ] Table Recognition: [TableMaster](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_table_master_en.md)
- [ ] [PP-Structurev2](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure), with functions and performance fully upgraded, adapted to Chinese scenes, and new support for [Layout Recovery](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/recovery) and **a one-line command to convert PDF to Word**
- [ ] [Layout Analysis](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/layout) optimization: model storage reduced by 95%, speed increased by 11 times, and an average CPU time cost of only 41 ms
- [ ] [Table Recognition](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/table) optimization: 3 optimization strategies are designed, and model accuracy improves by 6% at comparable time cost
- [ ] [Key Information Extraction](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/kie) optimization: a visual-independent model structure is designed, the accuracy of semantic entity recognition increases by 2.8%, and the accuracy of relation extraction increases by 9.1%
- [ ] text recognition algorithm ([SEED](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_seed_en.md))
- [ ] [key information extraction](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/ppstructure/docs/kie.md) algorithm (SDMGR)
- [ ] 3 [DocVQA](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.4/ppstructure/vqa) algorithms (LayoutLM, LayoutLMv2, LayoutXLM)
- [ ] a new structured document analysis toolkit, [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README.md), supporting layout analysis and table recognition (one-key export of chart images to Excel files).

## PP-OCR Pipeline