# TinyBERT_General_4L_312D

**Repository Path**: hf-models/TinyBERT_General_4L_312D

## Basic Information

- **Project Name**: TinyBERT_General_4L_312D
- **Description**: TinyBERT_General_4L_312D is a small, general-purpose TinyBERT model suitable for a variety of text-processing tasks.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2023-10-25
- **Last Updated**: 2024-09-10

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

TinyBERT: Distilling BERT for Natural Language Understanding
========

TinyBERT is 7.5x smaller and 9.4x faster at inference than BERT-base, while achieving competitive performance on natural language understanding tasks. It applies a novel Transformer distillation at both the pre-training and task-specific learning stages. In general distillation, we use the original BERT-base without fine-tuning as the teacher and a large-scale text corpus as the learning data. By performing Transformer distillation on general-domain text, we obtain a general TinyBERT that provides a good initialization for task-specific distillation. We provide this general TinyBERT here for your tasks at hand; a usage sketch follows the citation below.

For more details about the techniques of TinyBERT, refer to our paper:

[TinyBERT: Distilling BERT for Natural Language Understanding](https://arxiv.org/abs/1909.10351)

Citation
========

If you find TinyBERT useful in your research, please cite the following paper:

```
@article{jiao2019tinybert,
  title={Tinybert: Distilling bert for natural language understanding},
  author={Jiao, Xiaoqi and Yin, Yichun and Shang, Lifeng and Jiang, Xin and Chen, Xiao and Li, Linlin and Wang, Fang and Liu, Qun},
  journal={arXiv preprint arXiv:1909.10351},
  year={2019}
}
```
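Usage sketch
========

As a minimal, illustrative sketch (not part of the original release), the general TinyBERT checkpoint can be loaded with the Hugging Face `transformers` library. The model identifier `huawei-noah/TinyBERT_General_4L_312D` is an assumption; substitute the path to your local clone of this repository if you are not pulling from the Hugging Face Hub.

```
# Minimal sketch: load the general TinyBERT (4 layers, hidden size 312) as an
# encoder or as an initialization for task-specific distillation.
# NOTE: the model identifier below is an assumption, not confirmed by this README.
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "huawei-noah/TinyBERT_General_4L_312D"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

inputs = tokenizer("TinyBERT is a compact student of BERT-base.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Each token is encoded as a 312-dimensional hidden state by the 4-layer encoder.
print(outputs.last_hidden_state.shape)  # e.g. torch.Size([1, 12, 312])
```

The resulting encoder can then be fine-tuned directly on a downstream task or used as the student initialization for the task-specific distillation stage described above.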