jeff guo committed on 2022-09-30 15:06 · Initial Commit

BERT Pretraining

Model description

BERT, or Bidirectional Encoder Representations from Transformers, improves on standard Transformers by removing the unidirectionality constraint through a masked language model (MLM) pre-training objective. The MLM randomly masks some of the input tokens, and the objective is to predict the original vocabulary id of each masked word based only on its context. Unlike left-to-right language-model pre-training, the MLM objective lets the representation fuse the left and the right context, which allows pre-training a deep bidirectional Transformer. In addition to the masked language model, BERT uses a next-sentence-prediction task that jointly pre-trains text-pair representations.
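As a minimal illustration of the MLM objective described above (a sketch only, not this repository's implementation; the function name and the default 15% masking rate are assumptions, the rate following the original BERT paper):

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Randomly replace tokens with [MASK]; return the masked sequence
    and the (position -> original token) labels the model must predict.
    Sketch only; real implementations also sometimes keep or swap the
    original token instead of masking it."""
    rng = rng or random.Random(0)
    masked, labels = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok          # target: the original token
            masked.append(MASK)      # input: masked placeholder
        else:
            masked.append(tok)
    return masked, labels
```

The model then predicts each label from the full (bidirectional) context of the masked sequence, rather than from the left context alone.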

Step 1: Installing

bash init.sh

Step 2: Preparing dataset

Reference: training_results_v1.0

Structure

└── bert/dataset
    ├── 2048_shards_uncompressed
    ├── bert_config.json
    ├── eval_set_uncompressed
    └── model.ckpt-28252.apex.pt
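A small sanity check for this layout before launching training (the entry names come from the tree above; the helper name is an assumption, and the dataset root is whatever path you pass):

```python
import os

# Entries expected under the dataset root, per the tree above.
EXPECTED = [
    "2048_shards_uncompressed",
    "bert_config.json",
    "eval_set_uncompressed",
    "model.ckpt-28252.apex.pt",
]

def missing_dataset_entries(root):
    """Return the expected entries that are absent under root."""
    return [e for e in EXPECTED
            if not os.path.exists(os.path.join(root, e))]
```

Running it against your dataset root and getting an empty list back means the layout matches what the training scripts expect.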

Step 3: Training

Warning: The number of cards is determined by torch.cuda.device_count(), so set CUDA_VISIBLE_DEVICES to control how many cards are used.
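The counting behaviour referred to above can be mimicked without a GPU: torch.cuda.device_count() honours CUDA_VISIBLE_DEVICES, so the worker count follows the visible-device list. A sketch of that logic (torch is deliberately not imported; the function name is an assumption):

```python
import os

def visible_device_count(machine_gpu_count):
    """Number of CUDA devices PyTorch would report: the full machine
    count unless CUDA_VISIBLE_DEVICES narrows it. Sketch only; the
    authoritative value comes from torch.cuda.device_count()."""
    visible = os.environ.get("CUDA_VISIBLE_DEVICES")
    if visible is None:
        return machine_gpu_count       # unset: all devices visible
    visible = visible.strip()
    if not visible:
        return 0                       # set but empty: none visible
    return len(visible.split(","))
```

For example, exporting CUDA_VISIBLE_DEVICES=0,1,2,3 on an 8-GPU machine before running the launch script below makes the training see (and use) 4 cards.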

Multiple GPUs on one machine (AMP)

DATA=/path/to/bert/dataset bash train_bert_pretraining_amp_dist.sh

Parameters

--gradient_accumulation_steps
--max_steps
--train_batch_size
--eval_batch_size 
--learning_rate
--target_mlm_accuracy
--dist_backend
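How these flags interact, as a hedged sketch: assuming --train_batch_size is the per-device batch (the usual PyTorch-distributed convention; this README does not state it explicitly), the number of samples contributing to each optimizer update is:

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_gpus):
    """Global samples per optimizer step, assuming train_batch_size is
    per device and gradients are accumulated over
    gradient_accumulation_steps micro-steps before each update."""
    return train_batch_size * gradient_accumulation_steps * num_gpus
```

For instance, with a per-device batch of 32, no accumulation, and 8 GPUs (matching the bs:32, 8x configuration in the results below), each optimizer step sees 256 samples.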

Results on BI-V100

| GPUs | FP16 | FPS | E2E | MLM Accuracy |
|------|------|-----|--------|------|
| 1x8 | True | 227 | 13568s | 0.72 |

| Convergence criteria | Configuration (x denotes number of GPUs) | Performance | Accuracy | Power (W) | Scalability | Memory utilization (G) | Stability |
|------|------|------|------|------|------|------|------|
| 0.72 | SDK V2.2, bs:32, 8x, AMP | 214 | 0.72 | 152*8 | 0.96 | 20.3*8 | 1 |

Reference

https://github.com/mlcommons/training_results_v1.0/tree/master/NVIDIA/benchmarks/bert/implementations/pytorch
