# PSPNet-Keras-tensorflow

**Repository Path**: xxuffei/PSPNet-Keras-tensorflow

## Basic Information

- **Project Name**: PSPNet-Keras-tensorflow
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-10-02
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Keras implementation of [PSPNet (Caffe)](https://github.com/hszhao/PSPNet)

An implementation of the Pyramid Scene Parsing Network architecture in Keras. For best compatibility, please use Python 3.5.

### Setup

1. Install dependencies:
   * TensorFlow (or tensorflow-gpu)
   * Keras
   * numpy
   * scipy
   * pycaffe (from PSPNet; optional, only needed for converting the weights yourself)

   ```bash
   pip install -r requirements.txt --upgrade
   ```

2. Converted trained weights are needed to run the network. The weights (in `.h5`/`.json` format) have to be downloaded and placed into the `weights/keras` directory (a loading sketch appears at the end of this README). Already converted weights can be downloaded here:

   * [pspnet50_ade20k.h5](https://www.dropbox.com/s/0uxn14y26jcui4v/pspnet50_ade20k.h5?dl=1) / [pspnet50_ade20k.json](https://www.dropbox.com/s/v41lvku2lx7lh6m/pspnet50_ade20k.json?dl=1)
   * [pspnet101_cityscapes.h5](https://www.dropbox.com/s/c17g94n946tpalb/pspnet101_cityscapes.h5?dl=1) / [pspnet101_cityscapes.json](https://www.dropbox.com/s/fswowe8e3o14tdm/pspnet101_cityscapes.json?dl=1)
   * [pspnet101_voc2012.h5](https://www.dropbox.com/s/uvqj2cjo4b9c5wg/pspnet101_voc2012.h5?dl=1) / [pspnet101_voc2012.json](https://www.dropbox.com/s/rr5taqu19f5fuzy/pspnet101_voc2012.json?dl=1)

## Convert weights by yourself (optional)

(Note: this is **not** required if you use the `.h5`/`.json` weights above.)

Running this requires the compiled original PSPNet Caffe code and pycaffe.

```bash
python weight_converter.py
```

## Usage

```bash
python pspnet.py -m <model> -i <input_image> -o <output_path>
python pspnet.py -m pspnet101_cityscapes -i example_images/cityscapes.png -o example_results/cityscapes.jpg
python pspnet.py -m pspnet101_voc2012 -i example_images/pascal_voc.jpg -o example_results/pascal_voc.jpg
```

List of arguments:

```bash
-m  --model        - which model to use: 'pspnet50_ade20k', 'pspnet101_cityscapes', 'pspnet101_voc2012'
    --id           - (int) GPU device id. Default 0
-s  --sliding      - use a sliding window
-f  --flip         - additionally predict on the flipped image
-ms --multi_scale  - predict on multi-scale images
```

## Keras results

![Original](example_images/ade20k.jpg)
![New](example_results/ade20k_seg.jpg)
![New](example_results/ade20k_seg_blended.jpg)
![New](example_results/ade20k_probs.jpg)

![Original](example_images/cityscapes.png)
![New](example_results/cityscapes_seg.jpg)
![New](example_results/cityscapes_seg_blended.jpg)
![New](example_results/cityscapes_probs.jpg)

![Original](example_images/pascal_voc.jpg)
![New](example_results/pascal_voc_seg.jpg)
![New](example_results/pascal_voc_seg_blended.jpg)
![New](example_results/pascal_voc_probs.jpg)

## Implementation details

* The interpolation layer is implemented as the custom layer "Interp".
* A forward pass takes roughly 1 second per image.
* GPU memory usage can be capped with (see the session sketch after this list):

  ```python
  config = tf.ConfigProto()
  # Limit this process to ~30% of the available GPU memory.
  config.gpu_options.per_process_gpu_memory_fraction = 0.3
  sess = tf.Session(config=config)
  ```

* `scipy.ndimage.zoom` can take a long time.
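For completeness, here is how the memory option above can be registered as the session Keras actually uses. This is a minimal sketch for the TensorFlow 1.x / standalone Keras API that this project targets; `set_session` comes from Keras' TensorFlow backend.

```python
# Sketch: wire the memory-fraction config from the list above into the
# session used by Keras (TensorFlow 1.x / standalone Keras API).
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session

config = tf.ConfigProto()
# Cap this process at roughly 30% of GPU memory instead of grabbing it all.
config.gpu_options.per_process_gpu_memory_fraction = 0.3
set_session(tf.Session(config=config))
```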
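Finally, a minimal sketch for loading the converted `.json`/`.h5` weights from the Setup section directly in Python, useful as a sanity check after downloading. The module the custom `Interp` layer is imported from is an assumption (adjust it to wherever this repository defines the layer); the 473x473 input size matches the crop size of the original PSPNet ADE20K model.

```python
# Minimal sketch: load a converted model and run one forward pass.
# Assumes the downloaded weights sit in weights/keras as described in Setup.
import numpy as np
from keras.models import model_from_json

# Assumption: the custom "Interp" layer is importable from this module;
# adjust the import to wherever the repository actually defines it.
from layers_builder import Interp

with open('weights/keras/pspnet50_ade20k.json') as f:
    model = model_from_json(f.read(), custom_objects={'Interp': Interp})
model.load_weights('weights/keras/pspnet50_ade20k.h5')

# The ADE20K model expects a fixed 473x473 crop; a dummy batch is enough
# to verify that the graph builds and the forward pass runs.
dummy = np.zeros((1, 473, 473, 3), dtype=np.float32)
probs = model.predict(dummy)
print(probs.shape)  # per-pixel class probabilities (150 classes for ADE20K)
```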