# lidc_nodule_detection **Repository Path**: JCXY/lidc_nodule_detection ## Basic Information - **Project Name**: lidc_nodule_detection - **Description**: No description available - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2018-02-28 - **Last Updated**: 2020-12-19 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # Introduction This is a simple framework for training neural networks to detect nodules in CT images. Training requires a json file (e.g. [here](https://github.com/zhwhong/lidc_nodule_detection/blob/master/CNN_LSTM/hypes/lstm_rezoom_lung.json)) containing a list of CT images and the bounding boxes in each image. The model combines both CNN model and LSTM unit. The algorithm here is mainly refered to paper [***End-to-end people detection in crowded scenes***](https://arxiv.org/abs/1506.04878). ![](images/detect.png) The deep learning framewoek is based on [TensorFlow](https://github.com/tensorflow/tensorflow)(version 1.0.0) and some coding ideas are forked from the [TensorBox](https://github.com/TensorBox/TensorBox) project. Here I show heartfelt gratefulness. About nodule classfication method based on CNN transfer learning, you can refer to [this paper](https://arxiv.org/abs/1602.03409). # Dicom and XML parsing Parsing the lidc XML annotation and dicom files,see [pylung](https://github.com/zhwhong/lidc_nodule_detection/blob/master/pylung) and [this blog](http://zhwhong.ml/2017/03/27/LIDC-Dicom-data-and-XML-annotation-parse/). ```python >>> import dicom >>> f = dicom.read_file('000001.dcm') >>> print f (0008, 0005) Specific Character Set CS: 'ISO_IR 100' (0008, 0008) Image Type CS: ['ORIGINAL', 'PRIMARY', 'AXIAL'] (0008, 0016) SOP Class UID UI: CT Image Storage (0008, 0018) SOP Instance UID UI: 1.3.6.1.4.1.14519.5.2.1.6279.6001.143451261327128179989900675595 (0008, 0020) Study Date DA: '20000101' (0008, 0021) Series Date DA: '20000101' (0008, 0022) Acquisition Date DA: '20000101' (0008, 0023) Content Date DA: '20000101' (0008, 0024) Overlay Date DA: '20000101' (0008, 0025) Curve Date DA: '20000101' (0008, 002a) Acquisition DateTime DT: '20000101' (0008, 0030) Study Time TM: '' (0008, 0032) Acquisition Time TM: '' (0008, 0033) Content Time TM: '' (0008, 0050) Accession Number SH: '2819497684894126' (0008, 0060) Modality CS: 'CT' (0008, 0070) Manufacturer LO: 'GE MEDICAL SYSTEMS' (0008, 0090) Referring Physician Name PN: '' (0008, 1090) Manufacturer Model Name LO: 'LightSpeed Plus' (0008, 1155) Referenced SOP Instance UID UI: 1.3.6.1.4.1.14519.5.2.1.6279.6001.675906998158803995297223798692 (0010, 0010) Patient Name PN: '' (0010, 0020) Patient ID LO: 'LIDC-IDRI-0001' (0010, 0030) Patient Birth Date DA: '' (0010, 0040) Patient Sex CS: '' (0010, 1010) Patient Age AS: '' (0010, 21d0) Last Menstrual Date DA: '20000101' (0012, 0062) Patient Identity Removed CS: 'YES' (0012, 0063) De-identification Method LO: 'DCM:113100/113105/113107/113108/113109/113111' (0013, 0010) Private Creator LO: 'CTP' (0013, 1010) Private tag data LO: 'LIDC-IDRI' (0013, 1013) Private tag data LO: '62796001' (0018, 0010) Contrast/Bolus Agent LO: 'IV' (0018, 0015) Body Part Examined CS: 'CHEST' (0018, 0022) Scan Options CS: 'HELICAL MODE' (0018, 0050) Slice Thickness DS: '2.500000' (0018, 0060) KVP DS: '120' (0018, 0090) Data Collection Diameter DS: '500.000000' (0018, 1020) Software Version(s) LO: 'LightSpeedApps2.4.2_H2.4M5' (0018, 1100) Reconstruction Diameter DS: '360.000000' (0018, 1110) Distance Source to Detector DS: '949.075012' (0018, 1111) Distance Source to Patient DS: '541.000000' (0018, 1120) Gantry/Detector Tilt DS: '0.000000' (0018, 1130) Table Height DS: '144.399994' (0018, 1140) Rotation Direction CS: 'CW' (0018, 1150) Exposure Time IS: '570' (0018, 1151) X-Ray Tube Current IS: '400' (0018, 1152) Exposure IS: '4684' (0018, 1160) Filter Type SH: 'BODY FILTER' (0018, 1170) Generator Power IS: '48000' (0018, 1190) Focal Spot(s) DS: '1.200000' (0018, 1210) Convolution Kernel SH: 'STANDARD' (0018, 5100) Patient Position CS: 'FFS' (0020, 000d) Study Instance UID UI: 1.3.6.1.4.1.14519.5.2.1.6279.6001.298806137288633453246975630178 (0020, 000e) Series Instance UID UI: 1.3.6.1.4.1.14519.5.2.1.6279.6001.179049373636438705059720603192 (0020, 0010) Study ID SH: '' (0020, 0011) Series Number IS: '3000566' (0020, 0013) Instance Number IS: '80' (0020, 0032) Image Position (Patient) DS: ['-166.000000', '-171.699997', '-207.500000'] (0020, 0037) Image Orientation (Patient) DS: ['1.000000', '0.000000', '0.000000', '0.000000', '1.000000', '0.000000'] (0020, 0052) Frame of Reference UID UI: 1.3.6.1.4.1.14519.5.2.1.6279.6001.229925374658226729607867499499 (0020, 1040) Position Reference Indicator LO: 'SN' (0020, 1041) Slice Location DS: '-207.500000' (0028, 0002) Samples per Pixel US: 1 (0028, 0004) Photometric Interpretation CS: 'MONOCHROME2' (0028, 0010) Rows US: 512 (0028, 0011) Columns US: 512 (0028, 0030) Pixel Spacing DS: ['0.703125', '0.703125'] (0028, 0100) Bits Allocated US: 16 (0028, 0101) Bits Stored US: 16 (0028, 0102) High Bit US: 15 (0028, 0103) Pixel Representation US: 1 (0028, 0120) Pixel Padding Value US: 63536 (0028, 0303) Longitudinal Temporal Information M CS: 'MODIFIED' (0028, 1050) Window Center DS: '-600' (0028, 1051) Window Width DS: '1600' (0028, 1052) Rescale Intercept DS: '-1024' (0028, 1053) Rescale Slope DS: '1' (0038, 0020) Admitting Date DA: '20000101' (0040, 0002) Scheduled Procedure Step Start Date DA: '20000101' (0040, 0004) Scheduled Procedure Step End Date DA: '20000101' (0040, 0244) Performed Procedure Step Start Date DA: '20000101' (0040, 2016) Placer Order Number / Imaging Servi LO: '' (0040, 2017) Filler Order Number / Imaging Servi LO: '' (0040, a075) Verifying Observer Name PN: 'Removed by CTP' (0040, a123) Person Name PN: 'Removed by CTP' (0040, a124) UID UI: 1.3.6.1.4.1.14519.5.2.1.6279.6001.335419887712224178340067932923 (0070, 0084) Content Creator's Name PN: '' (0088, 0140) Storage Media File-set UID UI: 1.3.6.1.4.1.14519.5.2.1.6279.6001.211790042620307056609660772296 (7fe0, 0010) Pixel Data OW: Array of 524288 bytes ``` ![](images/parse_xml.png) # Training First, [install TensorFlow from source or pip](https://www.tensorflow.org/versions/r0.11/get_started/os_setup#pip-installation) (NB: source installs currently break threading on 0.11). Then run the training script `CNN_LSTM/run.sh`.Note that running on your own dataset should require modifying the `CNN_LSTM/hypes/\*.json` file. ![confidence loss](images/loss1.png) ![regression loss](images/loss2.png) ![accuracy](images/accuracy.png)

# Evaluation There are two options for evaluation, an [ipython notebook](https://github.com/zhwhong/lidc_nodule_detection/blob/master/CNN_LSTM/evaluate.ipynb) and a [python script](https://github.com/zhwhong/lidc_nodule_detection/blob/master/CNN_LSTM/evaluate.py). - some test results ![](images/result_example.png) - precision-recall curve

# Other - Anatomy map

- Parenchyma extraction ![](images/parenchyma.png) - Pulmonary nodules form ![](images/nodule.png) # Reference - [Dataset] [The Lung Image Database Consortium image collection(LIDC-IDRI)](https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI) - [Paper] [***End-to-end people detection in crowded scenes***](https://arxiv.org/abs/1506.04878) - [Paper] [***Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning***](https://arxiv.org/abs/1602.03409) - [Paper] [***An overview of classification algorithms for imbalanced datasets***](http://www.ijetae.com/files/Volume2Issue4/IJETAE_0412_07.pdf) - [Blog] [LIDC-IDRI肺结节公开数据集Dicom和XML标注详解](http://zhwhong.ml/2017/03/27/LIDC-Dicom-data-and-XML-annotation-parse/) - [Blog] [机器学习之分类性能度量指标 : ROC曲线、AUC值、正确率、召回率](http://zhwhong.ml/2017/04/14/ROC-AUC-Precision-Recall-analysis/) - [简书] [LIDC-IDRI肺结节Dicom数据集解析与总结](http://www.jianshu.com/p/9c1facf70b01) - [简书] [LIDC-IDRI肺结节公开数据集Dicom和XML标注详解](http://www.jianshu.com/p/c4e9e18195eb) - [简书] [医疗CT影像肺结节检测参考项目(附论文)](http://www.jianshu.com/p/14df9c48453a) - [简书] [如何应用Python处理医学影像学中的DICOM信息](http://www.jianshu.com/p/df64088e9b6b) - [简书] [CT图像肺结节识别算法调研 — CNN篇](http://www.jianshu.com/p/e7dbad9e48ff) - [简书] [吕乐:面向医学图像计算的深度学习与卷积神经网络(转)](http://www.jianshu.com/p/d29223ee2cb2)