# distiller

- **Description**: Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
- **License**: Apache-2.0
- **Default Branch**: master

## README

> :warning: **DISCONTINUATION OF PROJECT** - *This project will no longer be maintained by Intel. This project has been identified as having known security escapes. Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.* **Intel no longer accepts patches to this project.**

Clone the Distiller code repository from GitHub:
```
$ git clone https://github.com/IntelLabs/distiller.git
```
The documentation that follows assumes you have cloned the repository into a directory called ```distiller```.
We recommend using a [Python virtual environment](https://docs.python.org/3/library/venv.html#venv-def), but that, of course, is up to you.
There is nothing special about using Distiller in a virtual environment, but we provide instructions for completeness.
Before creating the virtual environment, make sure you are located in the ```distiller``` directory. After creating the environment, you should see a directory called ```distiller/env```.
#### Using virtualenv
If you don't have virtualenv installed, you can find the installation instructions [here](https://packaging.python.org/guides/installing-using-pip-and-virtualenv/).
To create the environment, execute:
```
$ python3 -m virtualenv env
```
This creates a subdirectory named ```env``` where the Python virtual environment is stored. You still need to activate the environment (see below) before the current shell uses it as the default Python environment.
#### Using venv
If you prefer to use ```venv```, then begin by installing it:
```
$ sudo apt-get install python3-venv
```
Then create the environment:
```
$ python3 -m venv env
```
As with virtualenv, this creates a directory called ```distiller/env```.
#### Activate the environment
The environment activation and deactivation commands for ```venv``` and ```virtualenv``` are the same.
**Note:** Make sure to activate the environment before proceeding with the installation of the dependency packages:
```
$ source env/bin/activate
```
#### Install the package
Finally, install the Distiller package and its dependencies using ```pip3```:
```
$ cd distiller
$ pip3 install -e .
```
This installs Distiller in "development mode", meaning any changes made in the code are reflected in the environment without re-running the install command (so there is no need to re-install after pulling changes from the Git repository).

Notes:
- Distiller has only been tested on Ubuntu 16.04 LTS, and with Python 3.5.
- If you are not using a GPU, you might need to make small adjustments to the code.
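As a quick sanity check (a minimal sketch, assuming the virtual environment is active and the editable install succeeded), you can confirm that Python resolves the package from your working copy:
```python
# Minimal sanity check: confirm the editable install points at the cloned source tree.
import distiller

# With `pip3 install -e .`, the printed path should be inside your local `distiller` clone.
print(distiller.__file__)
```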
#### Training-only example
The following invokes training-only (no compression) of a network named 'simplenet' on the CIFAR10 dataset. It is roughly based on TorchVision's sample ImageNet training application, so it should look familiar if you've used that application. In this example we don't invoke any compression mechanisms: we just train, because training is also an essential part of fine-tuning after pruning.
Note that the first time you execute this command, the CIFAR10 dataset will be downloaded to your machine, which may take a bit of time - please let the download process proceed to completion.
The path to the CIFAR10 dataset is arbitrary, but in our examples we place the datasets in the same directory level as distiller (i.e. ```../../../data.cifar10```).
First, change to the sample directory, then invoke the application:
```
$ cd distiller/examples/classifier_compression
$ python3 compress_classifier.py --arch simplenet_cifar ../../../data.cifar10 -p 30 -j=1 --lr=0.01
```
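For reference, the first-run download is the standard torchvision CIFAR10 dataset. The following sketch (assuming torchvision is installed in your environment) pre-fetches it to the same relative path used in the command above; the sample application performs an equivalent download on its own:
```python
# Illustrative sketch: pre-download CIFAR10 to the path used in the example above.
import torchvision

torchvision.datasets.CIFAR10(root='../../../data.cifar10', train=True, download=True)
torchvision.datasets.CIFAR10(root='../../../data.cifar10', train=False, download=True)
```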
You can use a TensorBoard backend to view the training progress (in the diagram below we show a couple of training sessions with different LR values). For compression sessions, we've added tracing of activation and parameter sparsity levels, and regularization loss.

#### Getting parameter statistics of a sparsified model
We've included in the git repository a few checkpoints of a ResNet20 model that we've trained with 32-bit floats. Let's load the checkpoint of a model that we've trained with channel-wise Group Lasso regularization.
With the following command-line arguments, the sample application loads the model (```--resume```) and prints statistics about the model weights (```--summary=sparsity```). This is useful if you want to load a previously pruned model, to examine the weights sparsity statistics, for example. Note that when you *resume* a stored checkpoint, you still need to tell the application which network architecture the checkpoint uses (```-a=resnet20_cifar```):
```
$ python3 compress_classifier.py --resume=../ssl/checkpoints/checkpoint_trained_ch_regularized_dense.pth.tar -a=resnet20_cifar ../../../data.cifar10 --summary=sparsity
```
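If you just want a rough, standalone look at weight sparsity without the sample application, a minimal sketch like the following works on a PyTorch checkpoint (assuming the file unpickles without custom classes and stores its weights under a 'state_dict' key, as the Distiller example checkpoints do; adjust the key if yours differs):
```python
# Illustrative sketch: report the fraction of zero-valued elements per weight tensor.
import torch

checkpoint = torch.load('../ssl/checkpoints/checkpoint_trained_ch_regularized_dense.pth.tar',
                        map_location='cpu')
state_dict = checkpoint.get('state_dict', checkpoint)

for name, tensor in state_dict.items():
    if tensor.dim() > 1:  # skip biases, BN parameters and counters
        sparsity = 100.0 * float((tensor == 0).sum()) / tensor.numel()
        print('{}: {:.2f}% zeros'.format(name, sparsity))
```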



#### Post-training quantization example
This example performs 8-bit quantization of ResNet20 for CIFAR10. We've included in the git repository the checkpoint of a ResNet20 model that we've trained with 32-bit floats, so we'll take this model and quantize it:
```
$ python3 compress_classifier.py -a resnet20_cifar ../../../data.cifar10 --resume ../ssl/checkpoints/checkpoint_trained_dense.pth.tar --quantize-eval --evaluate
```
The command-line above will save a checkpoint named `quantized_checkpoint.pth.tar` containing the quantized model parameters. See more examples [here](https://github.com/IntelLabs/distiller/blob/master/examples/quantization/post_train_quant/command_line.md).
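To give an idea of what 8-bit post-training quantization of a weight tensor involves, here is a minimal, generic sketch of symmetric, per-tensor quantize/dequantize. It is for illustration only and is not Distiller's exact quantization scheme, which offers more options (asymmetric ranges, per-channel scales, activation clipping, etc.):
```python
# Illustrative sketch of symmetric 8-bit quantize/dequantize of a single tensor.
import torch

def quant_dequant_int8(w):
    scale = w.abs().max() / 127.0                      # map the largest magnitude to 127
    if scale == 0:
        return w.clone()
    q = torch.clamp((w / scale).round(), -127, 127)    # integer levels in [-127, 127]
    return q * scale                                   # simulated-quantized weights

w = torch.randn(64, 3, 3, 3)                           # e.g., a conv weight tensor
w_q = quant_dequant_int8(w)
print('max abs error:', (w - w_q).abs().max().item())
```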

#### Community projects using Distiller
- [DeGirum Pruned Models](https://github.com/DeGirum/pruned-models) - a repository containing pruned models and related information.
- [TorchFI](https://github.com/bfgoldstein/torchfi) - a fault injection framework built on top of PyTorch for research purposes.
- [hsi-toolbox](https://github.com/daniel-rychlewski/hsi-toolbox) - hyperspectral CNN compression and band selection.
#### Research papers citing Distiller
- Brunno F. Goldstein, Sudarshan Srinivasan, Dipankar Das, Kunal Banerjee, Leandro Santiago, Victor C. Ferreira, Alexandre S. Nery, Sandip Kundu, Felipe M. G. Franca.
*[Reliability Evaluation of Compressed Deep Learning Models](https://ieeexplore.ieee.org/document/9069026)*,
In IEEE 11th Latin American Symposium on Circuits & Systems (LASCAS), San Jose, Costa Rica, 2020, pp. 1-5.
- Pascal Bacchus, Robert Stewart, Ekaterina Komendantskaya.
*[Accuracy, Training Time and Hardware Efficiency Trade-Offs for Quantized Neural Networks on FPGAs](https://link.springer.com/chapter/10.1007/978-3-030-44534-8_10)*,
In Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2020. Lecture Notes in Computer Science, vol 12083. Springer, Cham
- Indranil Chakraborty, Mustafa Fayez Ali, Dong Eun Kim, Aayush Ankit, Kaushik Roy.
*[GENIEx: A Generalized Approach to Emulating Non-Ideality in Memristive Xbars using Neural Networks](https://arxiv.org/abs/2003.06902)*,
arXiv:2003.06902, 2020.
- Ahmed T. Elthakeb, Prannoy Pilligundla, Fatemehsadat Mireshghallah, Tarek Elgindi, Charles-Alban Deledalle, Hadi Esmaeilzadeh.
*[Gradient-Based Deep Quantization of Neural Networks through Sinusoidal
Adaptive Regularization](https://arxiv.org/abs/2003.00146)*,
arXiv:2003.00146, 2020.
- Ziqing Yang, Yiming Cui, Zhipeng Chen, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu.
*[TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing](https://arxiv.org/abs/2002.12620)*,
arXiv:2002.12620, 2020.
- Alexander Kozlov, Ivan Lazarevich, Vasily Shamporov, Nikolay Lyalyushkin, Yury Gorbachev.
*[Neural Network Compression Framework for fast model inference](https://arxiv.org/abs/2002.08679)*,
arXiv:2002.08679, 2020.
- Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yuri Nahshan, Alex Bronstein, Uri Weiser.
*[Robust Quantization: One Model to Rule Them All](https://arxiv.org/abs/2002.07686)*,
arXiv:2002.07686, 2020.
- Muhammad Abdullah Hanif, Muhammad Shafique.
*[SalvageDNN: salvaging deep neural network accelerators with permanent faults through saliency-driven fault-aware mapping](https://royalsocietypublishing.org/doi/10.1098/rsta.2019.0164)*,
In Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, Volume 378, Issue 2164, 2019.
https://doi.org/10.1098/rsta.2019.0164
- Meiqi Wang, Jianqiao Mo, Jun Lin, Zhongfeng Wang, Li Du.
*[DynExit: A Dynamic Early-Exit Strategy for Deep Residual Networks](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9020551)*,
In IEEE International Workshop on Signal Processing Systems (SiPS), 2019.
- Vinu Joseph, Saurav Muralidharan, Animesh Garg, Michael Garland, Ganesh Gopalakrishnan.
*[A Programmable Approach to Model Compression](https://arxiv.org/abs/1911.02497),*
arXiv:1911.02497, 2019
[code](https://github.com/NVlabs/condensa)
- Hui Guan, Lin Ning, Zhen Lin, Xipeng Shen, Huiyang Zhou, Seung-Hwan Lim.
*[In-Place Zero-Space Memory Protection for CNN](https://arxiv.org/abs/1910.14479)*,
In Conference on Neural Information Processing Systems (NeurIPS), 2019.
arXiv:1910.14479, 2019
[code](https://github.com/guanh01/wot)
- Hossein Baktash, Emanuele Natale, Laurent Viennot.
*[A Comparative Study of Neural Network Compression](https://arxiv.org/abs/1910.11144)*,
arXiv:1910.11144, 2019.
- Maxim Zemlyanikin, Alexander Smorkalov, Tatiana Khanova, Anna Petrovicheva, Grigory Serebryakov.
*[512KiB RAM Is Enough! Live Camera Face Recognition DNN on MCU](http://openaccess.thecvf.com/content_ICCVW_2019/html/LPCV/Zemlyanikin_512KiB_RAM_Is_Enough_Live_Camera_Face_Recognition_DNN_on_ICCVW_2019_paper.html)*,
In IEEE International Conference on Computer Vision (ICCV), 2019.
- Ziheng Wang, Jeremy Wohlwend, Tao Lei.
*[Structured Pruning of Large Language Models](https://arxiv.org/abs/1910.04732)*,
arXiv:1910.04732, 2019.
- Soroush Ghodrati, Hardik Sharma, Sean Kinzer, Amir Yazdanbakhsh, Kambiz Samadi, Nam Sung Kim, Doug Burger, Hadi Esmaeilzadeh.
*[Mixed-Signal Charge-Domain Acceleration of Deep Neural networks through Interleaved Bit-Partitioned Arithmetic](https://arxiv.org/abs/1906.11915)*,
arXiv:1906.11915, 2019.
- Gil Shomron, Tal Horowitz, Uri Weiser.
*[SMT-SA: Simultaneous Multithreading in Systolic Arrays](https://ieeexplore.ieee.org/document/8742541)*,
In IEEE Computer Architecture Letters (CAL), 2019.
- Shangqian Gao, Cheng Deng, and Heng Huang.
*[Cross Domain Model Compression by Structurally Weight Sharing](http://openaccess.thecvf.com/content_CVPR_2019/html/Gao_Cross_Domain_Model_Compression_by_Structurally_Weight_Sharing_CVPR_2019_paper.html),*
In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 8973-8982.
- Moin Nadeem, Wei Fang, Brian Xu, Mitra Mohtarami, James Glass.
*[FAKTA: An Automatic End-to-End Fact Checking System](https://arxiv.org/abs/1906.04164),*
In North American Chapter of the Association for Computational Linguistics (NAACL), 2019.
- Ahmed T. Elthakeb, Prannoy Pilligundla, Hadi Esmaeilzadeh.
*[SinReQ: Generalized Sinusoidal Regularization for Low-Bitwidth Deep Quantized Training](https://arxiv.org/abs/1905.01416),*
arXiv:1905.01416, 2019.
[code](https://github.com/sinreq/sinreq_code)
- Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev.
*[Trainable Thresholds for Neural Network Quantization](https://rd.springer.com/chapter/10.1007/978-3-030-20518-8_26),*
In: Rojas I., Joya G., Catala A. (eds) Advances in Computational Intelligence Lecture Notes in Computer Science, vol 11507. Springer, Cham. International Work-Conference on Artificial Neural Networks (IWANN 2019).
- Ahmed T. Elthakeb, Prannoy Pilligundla, Hadi Esmaeilzadeh.
*[Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks](https://arxiv.org/abs/1906.06033),*
arXiv:1906.06033, 2019
- Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang.
*[Improving Neural Network Quantization without Retraining using Outlier Channel Splitting](https://arxiv.org/abs/1901.09504),*
arXiv:1901.09504, 2019
[Code](https://github.com/cornell-zhang/dnn-quant-ocs)
- Angad S. Rekhi, Brian Zimmer, Nikola Nedovic, Ningxi Liu, Rangharajan Venkatesan, Miaorong Wang, Brucek Khailany, William J. Dally, C. Thomas Gray.
*[Analog/Mixed-Signal Hardware Error Modeling for Deep Learning Inference](https://research.nvidia.com/sites/default/files/pubs/2019-06_Analog/Mixed-Signal-Hardware-Error/40_2_Rekhi_AMS_ML.pdf)*,
Nvidia Research, 2019.
- Norio Nakata.
*[Recent Technical Development of Artificial Intelligence for Diagnostic Medical Imaging](https://rd.springer.com/article/10.1007/s11604-018-0804-6)*,
In Japanese Journal of Radiology, February 2019, Volume 37, Issue 2, pp 103–108.
- Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev.
*[Fast Adjustable Threshold For Uniform Neural Network Quantization](https://arxiv.org/abs/1812.07872)*,
arXiv:1812.07872, 2018