# CBCT_seg **Repository Path**: peakb_admin/CBCT_seg ## Basic Information - **Project Name**: CBCT_seg - **Description**: https://github.com/RomainDeleat-besson/CBCT_seg - **Primary Language**: Unknown - **License**: Unlicense - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2024-06-17 - **Last Updated**: 2024-06-17 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # CBCT_seg Authors: Deleat-Besson Romain (UoM), Le Celia (UoM) Contributor: Bert Loris (NYU) Scripts for MandSeg and RootCanalSeg projects ## Prerequisites python 3.7.9 with the librairies: **Main librairies:** > tensorflow==2.4.1 \ > tensorflow-gpu==2.4.0 \ > Pillow==7.2.0 \ > numpy==1.19.5 \ > itk==5.2.0 **Detailed librairies** > requirements.txt ## What is it? CBCT_seg is a tool for CBCT segmentation based on a machine learning approach. The Convolutional Neural Network (CNN) used is a 2D U-Net. It takes several CBCT scans as inputs, with the extensions: .nii | nii.gz, .gipl | .gipl.gz, .nrrd ## Running the code MandSeg and RCSeg algorithms have different parameters therefore 4 bash scripts were created to run the training and prediction algorithms. **Prediction** To run the prediction algorithm, run the folowing command line: ``` bash src/sh/main_prediction_MandSeg.sh --help ``` ``` bash src/sh/main_prediction_RCSeg.sh --help ``` ``` the input parameters are: --dir_src Path to the Folder that contains the source code --file_input Scan to segment --dir_preproc Folder to save the preprocessed images --dir_predicted Folder to save the predicted images --dir_output Folder to save the postprocessed images The optionnal parameters are: --width Width of the images --height Height of the images --tool_name Tool name [MandSeg | RCSeg] --threshold Threshold to use to binarize scans in postprocess. (-1 for otsu | [0;255] for a specific value) -h|--help Print this Help ``` **Training** To run the training algorithm, run the folowing command line: ``` bash src/sh/main_training_MandSeg.sh --help ``` ``` bash src/sh/main_training_RCSeg.sh --help ``` ``` the input parameters are: --dir_project Folder containing the project. --dir_src Folder containing the scripts. --dir_data Folder containing all the data of the project. --dir_database Folder containing the database for the training/testing. --dir_cv Folder to save the cross-validation folds. --dir_test_preproc Folder to save the preprocessed testing images --dir_train_preproc Folder to save the preprocessed training images --dir_model Folder to save the models --dir_log Folder to save the logs of the model --dir_test_predict Folder to save the predicted testing images --dir_train_predict Folder to save the predicted training images --dir_test_postproc Folder to save the postprocessed testing images --dir_train_postproc Folder to save the postprocessed training images the optionnal parameters are: --cv_folds Number of folds for the cross validation. --testing_percentage Percentage of images to keep for testing --min_percentage Min percentage to threshold images for preprocessing --max_percentage Max percentage to threshold images for preprocessing --model_name Name of the model --epochs Number of epochs for training the models --save_frequence Frequence of saving the models --width Width of the images --height Height of the images --learning_rate Learning rate --batch_size Batch size --NumberFilters Number of filters --dropout Dropout --ratio Ration of slices outside of the region of interest to remove (value between [0;1]) --num_epoch Number of the epoch of the model to select for the prediction --tool_name Name of the tool used --threshold Threshold to use to binarize scans in postprocess. (-1 for otsu | [0;255] for a specific value) --out_metrics_val File to save the evaluation metrics of the models on the validation set --out_metrics_testing File to save the evaluation metrics of the models on the testing set -h|--help Print this Help. ``` ### Docker You can get the CBCT_seg docker image by running the folowing command line: ``` docker pull dcbia/cbct-seg:latest ``` *MandSeg:* To run MandSeg inside the docker container, run the following command line: ``` docker run --rm -v */my/input/file*:/app/scans/$(basename */my/input/file*) -v */my/output/folder*:/app/out cbct-seg:latest main_prediction_MandSeg.sh --dir_src /app/src --file_input /app/scans/$(basename */my/input/file*) --dir_output /app/out --path_model /app/model/MandSeg_Final_35.hdf5 ``` *RCSeg:* To run RCSeg inside the docker container, run the following command line: ``` docker run --rm -v */my/input/file*:/app/scans/$(basename */my/input/file*) -v */my/output/folder*:/app/out cbct-seg:latest main_prediction_RCSeg.sh --dir_src /app/src --file_input /app/scans/$(basename */my/input/file*) --dir_output /app/out --path_model /app/model/RCSeg_Final_50.hdf5 ``` ### Creation of the workspace ``` python3 src/py/generate_workspace.py --help ``` input: folder containing all the scans and segmentations output: folder containing the scans and segmentations divided into training/testing(/validation) Takes a folder, searches into all the subfolders and seperates the scans and the segmentations. Moves thoses files into the output folder. A training folder and a testing folder are created. The training folder is divided into the specified number of folds, or, if the number of folds is 0, a validation folder can be created. The percentage or number of files for testing and validation is selected randomly inside each folder, according to the propotion of files in each of them, to prevent class imbalance. ``` usage: generate_workspace.py [-h] --dir DIR --out OUT [--cv_folds CV_FOLDS] [--testing_number TESTING_NUMBER | --testing_percentage TESTING_PERCENTAGE] [--validation_number VALIDATION_NUMBER | --validation_percentage VALIDATION_PERCENTAGE] Creation of the workspace (training, validation, testing | cross validation) optional arguments: -h, --help show this help message and exit Input files: --dir DIR Input directory with 3D images (default: None) Output parameters: --out OUT Output directory (default: None) --cv_folds CV_FOLDS Number of folds to create (default: 8) --testing_number TESTING_NUMBER Number of scans to keep for testing (default: None) --testing_percentage TESTING_PERCENTAGE Percentage of scans to keep for testing (default: 20) --validation_number VALIDATION_NUMBER Number of scans to keep for validation (default: None) --validation_percentage VALIDATION_PERCENTAGE Percentage of scans to keep for validation (default: 10) ``` ### Pre-Processing ``` python3 src/py/preprocess.py --help ``` input: 3D CBCT scans (.nii | nii.gz, .gipl | .gipl.gz, .nrrd) output: 2D .png slices from the scans Takes a single image or a directory, performs contrast adjustment and deconstructs the 3D scan into 2D slices of the specified size. ``` usage: preprocess.py [-h] (--image IMAGE | --dir DIR) [--desired_width DESIRED_WIDTH] [--desired_height DESIRED_HEIGHT] [--min_percentage MIN_PERCENTAGE] [--max_percentage MAX_PERCENTAGE] --out OUT Pre-processing optional arguments: -h, --help show this help message and exit Input files: --image IMAGE Input 3D image (default: None) --dir DIR Input directory with 3D images (default: None) Resizing parameters: --desired_width DESIRED_WIDTH desired width of the images (default: 512) --desired_height DESIRED_HEIGHT desired width of the images (default: 512) Contrast parameters: --min_percentage MIN_PERCENTAGE min percentage to adjust contrast of the images (default: 45) --max_percentage MAX_PERCENTAGE max percentage to adjust contrast of the images (default: 90) Output parameters: --out OUT Output directory (default: None) ``` ``` python3 src/py/labels_preprocess.py --help ``` input: 3D CBCT labels (.nii | nii.gz, .gipl | .gipl.gz, .nrrd) output: 2D .png slices from the labels Takes a single label or a directory and deconstructs the 3D scan into 2D slices of the specified size. ``` usage: labels_preprocess.py [-h] (--image IMAGE | --dir DIR) [--desired_width DESIRED_WIDTH] [--desired_height DESIRED_HEIGHT] --out OUT Label pre-processing optional arguments: -h, --help show this help message and exit Input files: --image IMAGE Input 3D label (default: None) --dir DIR Input directory with 3D labels (default: None) Resizing parameters: --desired_width DESIRED_WIDTH width of the images (default: 512) --desired_height DESIRED_HEIGHT height of the images (default: 512) Output parameters: --out OUT Output directory of the label slices (default: None) ``` ### Training #### Data augmentation The data augmentation used in the training applies random rotation, shift, shear and zoom. ``` python3 src/py/heat_map.py --help ``` input: Database that contains the labels output: Save the the heat map Visualisation of the data augmentation that is applied on your dataset ``` usage: heat_map.py [-h] --dir_database DIR_DATABASE [--width WIDTH] [--height HEIGHT] --out OUT Visualization of the data augmentation optional arguments: -h, --help show this help message and exit Input files: --dir_database DIR_DATABASE Input dir of the dataset (default: None) label parameters: --width WIDTH width of the images (default: 512) --height HEIGHT height of the images (default: 512) Output parameters: --out OUT Output file (default: None) ``` #### Training model The neural network choosen is a U-Net architecture. The loss function is the *BinaryCrossentropy*. The metrics monitered during the training were the *Recall* and *Precision* metrics. ``` python3 src/py/training_seg.py --help ``` input: The scans and labels for the training output: The trained models, saved every *save_frequence* epochs The algorithm takes as an input the training fold. It will use one of the cross validation folder as a validation set and the rest as a training set. The *save_frequence* parameter allow you to save the model at specific epochs. ``` usage: training_seg.py [-h] --dir_train DIR_TRAIN --save_model SAVE_MODEL --log_dir LOG_DIR (--val_folds VAL_FOLDS [VAL_FOLDS ...] | --val_dir VAL_DIR) [--model_name MODEL_NAME] [--epochs EPOCHS] [--ratio RATIO] [--save_frequence SAVE_FREQUENCE] [--learning_rate_schedular LEARNING_RATE_SCHEDULAR] [--width WIDTH] [--height HEIGHT] [--batch_size BATCH_SIZE] [--learning_rate LEARNING_RATE] [--number_filters NUMBER_FILTERS] [--dropout DROPOUT] Training a neural network optional arguments: -h, --help show this help message and exit --val_folds VAL_FOLDS [VAL_FOLDS ...] Fold of the cross-validation to keep for validation (default: None) --val_dir VAL_DIR Directory for the validation dataset (default: None) Input files: --dir_train DIR_TRAIN Input training folder (default: None) --save_model SAVE_MODEL Directory to save the model (default: None) --log_dir LOG_DIR Directory for the logs of the model (default: None) training parameters: --model_name MODEL_NAME Name of the model (default: CBCT_seg_model) --epochs EPOCHS Number of epochs (default: 100) --ratio RATIO Ratio of slices outside of the region of interest to remove (between 0 and 1) (default: 0) --save_frequence SAVE_FREQUENCE Epoch frequence to save the model (default: 5) --learning_rate_schedular LEARNING_RATE_SCHEDULAR Set the LRS (default: None) --width WIDTH width of the images (default: 512) --height HEIGHT height of the images (default: 512) --batch_size BATCH_SIZE Batch size value (default: 32) --learning_rate LEARNING_RATE Learning rate (default: 0.0001) --number_filters NUMBER_FILTERS Number of filters (default: 32) --dropout DROPOUT Dropout (default: 0.1) ``` ### Prediction ``` python3 src/py/predict_seg.py --help ``` input: 2D slices form CBCT scans output: 2D slices predicted by the model Takes 2D slices as an input from a CBCT scan and outputs the segmentation predicted for each slices. The output is not binary, it is a probability of each pixel to belong to the segmentation (value between [0;255]). ``` usage: predict_seg.py [-h] --dir_predict DIR_PREDICT [--width WIDTH] [--height HEIGHT] --load_model LOAD_MODEL --out OUT Prediction optional arguments: -h, --help show this help message and exit Input files: --dir_predict DIR_PREDICT Input dir to be predicted (default: None) Predict parameters: --width WIDTH width of the images (default: 512) --height HEIGHT height of the images (default: 512) --load_model LOAD_MODEL Path of the trained model (default: None) Output parameters: --out OUT Output directory (default: None) ``` ### Post-Processing ``` python3 src/py/postprocess.py --help ``` input: directory with .png slices output: directory with reconstructed 3D image Takes an input directory containing .png slices and reconstructs the 3D image. Post-processing is applied to improve the segmentation, depending on the tool_name (RCSeg or MandSeg). The output is saved with the same extension than the original scan. ``` usage: postprocess.py [-h] --dir DIR --original_dir ORIGINAL_DIR [--tool TOOL] [--threshold THRESHOLD] --out OUT [--out_raw OUT_RAW] Post-processing optional arguments: -h, --help show this help message and exit Input files: --dir DIR Input directory with 2D images (default: None) --original_dir ORIGINAL_DIR Input directory with original 3D images (default: None) Parameters: --tool TOOL Name of the tool used (default: MandSeg) --threshold THRESHOLD if -1, the thresold apply is otsu, else it is the value entered (between [0;255]) (default: -1) Output parameters: --out OUT Output directory (default: None) --out_raw OUT_RAW Output directory for raw files (default: None) ``` ### Evaluation of the trained models ``` python3 src/py/metrics.py --help ``` input: either a 3D file and its ground truth, or a directory of 3D files and their ground truths. output: excel file containing the mean evaluation metrics (AUPRC, AUPRC - Baseline, F1-score, F2-score, Accuracy, Sensitivity, Precision) of the input files. Compares the prediction made by the algorithm to the ground truth to evaluate the performances of the trained model. If the output excel file already exist, it adds the results into a new line. ``` usage: metrics.py [-h] (--pred_img PRED_IMG | --pred_dir PRED_DIR) [--pred_raw_img PRED_RAW_IMG | --pred_raw_dir PRED_RAW_DIR] (--groundtruth_img GROUNDTRUTH_IMG | --groundtruth_dir GROUNDTRUTH_DIR) --out OUT [--tool TOOL] [--model_name MODEL_NAME] [--epochs EPOCHS] [--batch_size BATCH_SIZE] [--learning_rate LEARNING_RATE] [--number_filters NUMBER_FILTERS] [--cv_fold CV_FOLD] Evaluation metrics optional arguments: -h, --help show this help message and exit Input files: --pred_img PRED_IMG Input predicted reconstructed 3D image (default: None) --pred_dir PRED_DIR Input directory with predicted reconstructed 3D images (default: None) --pred_raw_img PRED_RAW_IMG Input raw predicted reconstructed 3D image (default: None) --pred_raw_dir PRED_RAW_DIR Input directory with raw predicted reconstructed 3D images (default: None) --groundtruth_img GROUNDTRUTH_IMG Input original 3D images (ground truth) (default: None) --groundtruth_dir GROUNDTRUTH_DIR Input directory with original 3D images (ground truth) (default: None) Output parameters: --out OUT Output filename (default: None) Training parameters: --tool TOOL Name of the tool used (default: MandSeg) --model_name MODEL_NAME name of the model (default: CBCT_seg_model) --epochs EPOCHS name of the model (default: 20) --batch_size BATCH_SIZE batch_size value (default: 16) --learning_rate LEARNING_RATE Learning rate (default: 1e-05) --number_filters NUMBER_FILTERS Number of filters (default: 16) --cv_fold CV_FOLD number of the cross-validation fold (default: 1) ```