BERT Pretraining

Model description

BERT, or Bidirectional Encoder Representations from Transformers, improves upon standard Transformers by removing the unidirectionality constraint by using a masked language model (MLM) pre-training objective. The masked language model randomly masks some of the tokens from the input, and the objective is to predict the original vocabulary id of the masked word based only on its context. Unlike left-to-right language model pre-training, the MLM objective enables the representation to fuse the left and the right context, which allows us to pre-train a deep bidirectional Transformer. In addition to the masked language model, BERT uses a next sentence prediction task that jointly pre-trains text-pair representations.
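Below is a minimal, illustrative sketch of the MLM masking step described above, not code from this repository. It assumes a simple whitespace tokenizer and the standard BERT masking rates (15% of tokens selected; of those, 80% replaced with [MASK], 10% replaced with a random token, 10% left unchanged).

```python
# Illustrative MLM masking sketch (standard BERT rates, hypothetical helper).
import random

def mask_tokens(tokens, vocab, mask_prob=0.15):
    """Return (masked_tokens, labels). labels keeps the original token at
    masked positions (the prediction targets) and None elsewhere."""
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if random.random() < mask_prob:
            labels[i] = tok                       # target to predict
            r = random.random()
            if r < 0.8:
                masked[i] = "[MASK]"              # 80%: replace with [MASK]
            elif r < 0.9:
                masked[i] = random.choice(vocab)  # 10%: random token
            # remaining 10%: keep the original token
    return masked, labels

tokens = "the quick brown fox jumps over the lazy dog".split()
print(mask_tokens(tokens, sorted(set(tokens))))
```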

Prepare

Install packages

bash init_tf.sh

Download datasets

This Google Drive location contains the following files.
Download the tf1_ckpt folder, vocab.txt, and bert_config.json into one directory named bert_pretrain_ckpt_tf.

bert_pretrain_ckpt_tf: contains checkpoint files
    model.ckpt-28252.data-00000-of-00001
    model.ckpt-28252.index
    model.ckpt-28252.meta
    vocab.txt
    bert_config.json

Download and preprocess datasets

Create a directory named bert_pretrain_tf_records and store the preprocessing results in it. Tip: you can clone this repo in another location to run the preprocessing; only the bert_pretrain_tf_records results are needed here. A layout check sketch follows below.
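The sketch below is an optional sanity check, assuming the two directories above sit next to the training scripts. The file and directory names are taken from this README; adjust the paths if you keep the data elsewhere.

```python
# Sanity check for the expected local layout before launching training.
import os

required = [
    "bert_pretrain_ckpt_tf/model.ckpt-28252.data-00000-of-00001",
    "bert_pretrain_ckpt_tf/model.ckpt-28252.index",
    "bert_pretrain_ckpt_tf/model.ckpt-28252.meta",
    "bert_pretrain_ckpt_tf/vocab.txt",
    "bert_pretrain_ckpt_tf/bert_config.json",
    "bert_pretrain_tf_records",  # preprocessed tfrecord output directory
]

missing = [p for p in required if not os.path.exists(p)]
if missing:
    print("Missing:", *missing, sep="\n  ")
else:
    print("Checkpoint files and tfrecord directory are in place.")
```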

Training

Training on single card

bash run_1card_FPS.sh

Training on multiple cards

bash run_multi_card_FPS.sh 

Result

|            | acc      | fps      |
|------------|----------|----------|
| multi_card | 0.424126 | 0.267241 |