# OAR (ONNXRuntime And Rust) OCR

![Crates.io Version](https://img.shields.io/crates/v/oar-ocr)
![Crates.io Downloads (recent)](https://img.shields.io/crates/dr/oar-ocr)
[![dependency status](https://deps.rs/repo/github/GreatV/oar-ocr/status.svg)](https://deps.rs/repo/github/GreatV/oar-ocr)
![GitHub License](https://img.shields.io/github/license/GreatV/oar-ocr)

A comprehensive OCR and document understanding library built in Rust with ONNX Runtime.

## Quick Start

### Installation

```bash
cargo add oar-ocr
```

With GPU support:

```bash
cargo add oar-ocr --features cuda
```

### Basic Usage

```rust
use oar_ocr::prelude::*;
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the OCR pipeline
    let ocr = OAROCRBuilder::new(
        "pp-ocrv5_mobile_det.onnx",
        "pp-ocrv5_mobile_rec.onnx",
        "ppocrv5_dict.txt",
    )
    .build()?;

    // Load an image
    let image = load_image(Path::new("document.jpg"))?;

    // Run prediction
    let results = ocr.predict(vec![image])?;

    // Process results
    for text_region in &results[0].text_regions {
        if let Some((text, confidence)) = text_region.text_with_confidence() {
            println!("Text: {} ({:.2})", text, confidence);
        }
    }

    Ok(())
}
```

### Document Structure Analysis

```rust
use oar_ocr::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the structure analysis pipeline
    let structure = OARStructureBuilder::new("pp-doclayout_plus-l.onnx")
        .with_table_classification("pp-lcnet_x1_0_table_cls.onnx")
        .with_table_structure_recognition("slanet_plus.onnx", "wireless")
        .table_structure_dict_path("table_structure_dict_ch.txt")
        .with_ocr(
            "pp-ocrv5_mobile_det.onnx",
            "pp-ocrv5_mobile_rec.onnx",
            "ppocrv5_dict.txt",
        )
        .build()?;

    // Analyze the document
    let result = structure.predict("document.jpg")?;

    // Output Markdown
    println!("{}", result.to_markdown());

    Ok(())
}
```

## Vision-Language Models (VLM)

For advanced document understanding with Vision-Language Models (such as PaddleOCR-VL and UniRec), see the [`oar-ocr-vl`](oar-ocr-vl/README.md) crate.

## Documentation

- [**Usage Guide**](docs/usage.md) - Detailed API usage, builder patterns, and GPU configuration
- [**Pre-trained Models**](docs/models.md) - Model download links and recommended configurations

## Examples

The `examples/` directory contains complete examples for various tasks:

```bash
# General OCR
cargo run --example ocr -- --help

# Document Structure Analysis
cargo run --example structure -- --help

# Layout Detection
cargo run --example layout_detection -- --help

# Table Structure Recognition
cargo run --example table_structure_recognition -- --help
```

## Acknowledgments

This project builds upon the excellent work of several open-source projects:

- **[ort](https://github.com/pykeio/ort)**: Rust bindings for ONNX Runtime by pykeio. This crate provides the Rust interface to ONNX Runtime that powers the inference engine in this OCR library.
- **[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)**: Baidu's multilingual OCR toolkit based on PaddlePaddle. This project uses PaddleOCR's pre-trained models, which provide excellent accuracy and performance for text detection and recognition across many languages.
- **[OpenOCR](https://github.com/Topdu/OpenOCR)**: An open-source toolkit for general OCR research and applications by the FVL Laboratory at Fudan University. We use the UniRec model for unified text, formula, and table recognition.
- **[Candle](https://github.com/huggingface/candle)**: A minimalist ML framework for Rust by Hugging Face. We use Candle to implement Vision-Language model inference.
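
## Batch Processing

Since `predict` takes a `Vec` of images, running OCR over several pages only requires loading them up front and matching results back to inputs. The sketch below is a minimal extension of the Quick Start example, not an official API reference: the file names `page1.jpg` through `page3.jpg` are placeholders, and it assumes (not confirmed by this README) that `predict` returns one result per input image in input order.

```rust
use oar_ocr::prelude::*;
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build the pipeline once and reuse it for every page.
    let ocr = OAROCRBuilder::new(
        "pp-ocrv5_mobile_det.onnx",
        "pp-ocrv5_mobile_rec.onnx",
        "ppocrv5_dict.txt",
    )
    .build()?;

    // Placeholder paths; load_image is the same helper used in Quick Start.
    let paths = ["page1.jpg", "page2.jpg", "page3.jpg"];
    let images = paths
        .iter()
        .map(|p| load_image(Path::new(p)))
        .collect::<Result<Vec<_>, _>>()?;

    // One predict call handles the whole batch.
    let results = ocr.predict(images)?;

    // Assumes results are returned in input order.
    for (path, result) in paths.iter().zip(&results) {
        println!("== {path} ==");
        for region in &result.text_regions {
            if let Some((text, confidence)) = region.text_with_confidence() {
                println!("{text} ({confidence:.2})");
            }
        }
    }

    Ok(())
}
```

Building the pipeline once amortizes model loading, which typically dominates the cost of short runs.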