# Document-Layout-Analysis

**Repository Path**: zetingh/Document-Layout-Analysis

## Basic Information

- **Project Name**: Document-Layout-Analysis
- **Description**: Tools for extracting figures, tables, text, and other elements from a PDF document.
- **Default Branch**: dev

## Statistics

- **Created**: 2021-03-16
- **Last Updated**: 2021-10-28

## README

Tools for extracting figures, tables, text, and other elements from a PDF document.

## Installation

Install the Python dependencies (run this from the root of the cloned repository, see "Install Document-Layout-Analysis" below):

```
$ pip install -r requirements.txt
```

### Install detectron2

Requirements:

- CUDA=10.1
- PyTorch>=1.7.0

Instructions for installing CUDA 10.1 can be found here: https://developer.nvidia.com/cuda-10.1-download-archive-base

Instructions for installing PyTorch can be found here: https://pytorch.org/

After installing the packages above, follow the instructions below to install detectron2. The checkout has to run inside the cloned `detectron2` directory, hence `git -C`:

```
$ git clone https://github.com/facebookresearch/detectron2.git
$ git -C detectron2 checkout 8e3effc
$ python -m pip install -e detectron2
```

### Install Document-Layout-Analysis

Follow the instructions below:

```
$ git clone -b dev https://github.com/Wild-Rift/Document-Layout-Analysis.git
$ cd Document-Layout-Analysis
```

## Train

### Dataset

We use the [IBM PubLayNet](https://developer.ibm.com/technologies/artificial-intelligence/data/publaynet/) dataset for training and testing. It contains 358,353 images: 335,703 training images, 11,245 validation images, and 11,405 test images. The category-id label mapping of this dataset is:

| Category id | Label |
| :---: | :--- |
| 1 | Text |
| 2 | Title |
| 3 | List |
| 4 | Table |
| 5 | Figure |

After downloading and extracting the dataset, put it in the ```datasets``` directory. The directories should be arranged like this:

    root
    ├── mmdet
    ├── tools
    ├── configs
    ├── output
    │   ├──...
    │
    ├── datasets
    │   ├── publaynet
    │   │   ├── test/
    │   │   ├── train/
    │   │   ├── val/
    │   │   ├── train.json
    │   │   ├── val.json

### Training

Document-Layout-Analysis supports training two models: Faster R-CNN and Mask R-CNN. Set `CONFIG_FILE` to the configuration of the model you want to train:

```
$ CONFIG_FILE='configs/faster_rcnn_R_101_FPN_3x.yaml'  # to use the Faster R-CNN model
$ CONFIG_FILE='configs/mask_rcnn_R_101_FPN_3x.yaml'    # to use the Mask R-CNN model
```

To inspect the model structures, see the files in the ```configs``` directory.

To train on 8 GPUs, run:

```
$ python train.py --num-gpus 8 --config-file "$CONFIG_FILE"
```

To train on 1 GPU, you may need to [change some parameters](https://arxiv.org/abs/1706.02677), namely the batch size and the base learning rate:

```
$ python train.py --num-gpus 1 \
    --config-file "$CONFIG_FILE" \
    SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025
```

Model checkpoints are stored in the ```output``` directory after each epoch.
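
The single-GPU command above lowers both the batch size and the learning rate. The linked paper describes a linear scaling rule: when the batch size is multiplied by k, the learning rate is multiplied by k as well. The sketch below shows how such values could be derived for other batch sizes. The reference setting of 16 images per batch with a base learning rate of 0.02 is an assumption (it matches the defaults of detectron2's baseline R-CNN FPN configs, but the files in ```configs``` may differ), and `scale_learning_rate` is a hypothetical helper, not part of this repository.

```python
def scale_learning_rate(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Apply the linear scaling rule: keep lr / batch_size constant."""
    if base_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch / base_batch


# Assumed reference setting for the 8-GPU run (not confirmed by this README).
REFERENCE_BATCH = 16
REFERENCE_LR = 0.02

if __name__ == "__main__":
    # Print the config overrides for a few candidate batch sizes.
    for batch in (16, 8, 2):
        lr = scale_learning_rate(REFERENCE_LR, REFERENCE_BATCH, batch)
        print(f"SOLVER.IMS_PER_BATCH {batch} SOLVER.BASE_LR {lr:g}")
```

Under these assumptions, a batch size of 2 gives a learning rate of 0.0025, which matches the overrides in the single-GPU command. If the training schedule in the config is expressed in iterations, a smaller batch also means fewer images seen per iteration, so the number of iterations may need to be raised accordingly.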