# knowledge-distillation

**Repository Path**: haojiepan/knowledge-distillation

## Basic Information

- **Project Name**: knowledge-distillation
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2020-05-06
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Knowledge Distillation

PyTorch implementations of algorithms for knowledge distillation.

## Setup

### Build

```bash
$ docker build -t kd -f Dockerfile .
```

### Run

```bash
$ docker run -v local_data_path:/data -v project_path:/app -p 0.0.0.0:8084:8084 -it kd
```

## Experiments

1. [Task-specific distillation from BERT to BiLSTM](https://github.com/pvgladkov/knowledge-distillation/blob/master/experiments/sst2). Data: SST-2 binary classification. A minimal sketch of the distillation objective appears at the end of this README.

## Papers

1. Cristian Bucila, Rich Caruana, Alexandru Niculescu-Mizil. "**Model Compression**" (2006). [pdf](https://www.cs.cornell.edu/~caruana/compression.kdd06.pdf)
2. Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf. "**DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter**" (2019). https://arxiv.org/abs/1910.01108
3. Raphael Tang, Yao Lu, Linqing Liu, Lili Mou, Olga Vechtomova, Jimmy Lin. "**Distilling Task-Specific Knowledge from BERT into Simple Neural Networks**" (2019). https://arxiv.org/abs/1903.12136
4. Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut. "**ALBERT: A Lite BERT for Self-supervised Learning of Language Representations**" (2019). https://arxiv.org/abs/1909.11942
5. Rafael Müller, Simon Kornblith, Geoffrey Hinton. "**Subclass Distillation**" (2020). https://arxiv.org/abs/2002.03936
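## Distillation objective (sketch)

The SST-2 experiment above distills a fine-tuned BERT teacher into a BiLSTM student. The snippet below is a minimal illustration, not the repository's actual training code: it follows the logit-matching objective described in Tang et al. (2019), combining an MSE term against pre-computed teacher logits with cross-entropy on the hard labels. The `BiLSTMStudent` class, the `alpha` weight, and all dimensions are assumed placeholders chosen for readability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BiLSTMStudent(nn.Module):
    """Toy BiLSTM classifier standing in for the distilled student."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=150, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)
        _, (hidden, _) = self.lstm(embedded)
        # Concatenate the final forward and backward hidden states.
        pooled = torch.cat([hidden[0], hidden[1]], dim=-1)
        return self.classifier(pooled)


def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5):
    """Logit matching in the spirit of Tang et al. (2019): MSE against the
    teacher's logits blended with cross-entropy on the ground-truth labels."""
    soft_term = F.mse_loss(student_logits, teacher_logits)
    hard_term = F.cross_entropy(student_logits, labels)
    return alpha * soft_term + (1.0 - alpha) * hard_term


if __name__ == "__main__":
    student = BiLSTMStudent(vocab_size=10_000)
    token_ids = torch.randint(1, 10_000, (8, 32))   # batch of 8 sequences, length 32
    teacher_logits = torch.randn(8, 2)               # placeholder for pre-computed BERT logits
    labels = torch.randint(0, 2, (8,))
    loss = distillation_loss(student(token_ids), teacher_logits, labels)
    loss.backward()
    print(float(loss))
```

In practice the teacher logits would be produced once by running the fine-tuned BERT model over the training set (optionally augmented, as in the paper) and cached, so that student training never needs the teacher in memory.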