mindocr/data/README_CN.md · MindSpore Lab/mindocr

数据模块指南

代码结构

├── README.md
├── __init__.py
├── base_dataset.py  				# base dataset class with __getitem__
├── builder.py					# API for create dataset and loader
├── det_dataset.py				# general text detection dataset class
├── rec_dataset.py				# general rec detection dataset class
├── rec_lmdb_dataset.py				# LMDB dataset class
└── transforms
    ├── det_transforms.py			# processing and augmentation ops (callabel classes) especially for detection tasks
    ├── general_transforms.py			# general processing and augmentation ops (callabel classes)
    ├── modelzoo_transforms.py			# transformations adopted from modelzoo
    ├── rec_transforms.py			# processing and augmentation ops (callabel classes) especially for recognition tasks
    └── transforms_factory.py			# API for create and run transforms

如何添加自己的dataset类

继承BaseDataset类
在BaseDataset中重写以下文件和标注解析函数。

def load_data_list(self, label_file: Union[str, List[str]], sample_ratio: Union[float, List] = 1.0, shuffle: bool = False, **kwargs) -> List[dict]

def _parse_annotation(self, data_line: str) -> Union[dict, List[dict]]

如何添加自己的数据转换

请参考定制化数据转换开发指导

MindSpore Lab/mindocr

数据模块指南

代码结构

如何添加自己的dataset类

如何添加自己的数据转换

简介

发行版

贡献者

语言

近期动态

MindSpore Lab/mindocr .gitee-modal { width: 500px !important; }

数据模块指南

代码结构

如何添加自己的dataset类

如何添加自己的数据转换

简介

发行版

开源评估指数源自 OSS-Compass 评估体系，评估体系围绕以下三个维度对项目展开评估：

贡献者

语言

近期动态

搜索帮助

MindSpore Lab/mindocr