BERT Pretraining

Model description

BERT, or Bidirectional Encoder Representations from Transformers, improves upon standard Transformers by removing the unidirectionality constraint by using a masked language model (MLM) pre-training objective. The masked language model randomly masks some of the tokens from the input, and the objective is to predict the original vocabulary id of the masked word based only on its context. Unlike left-to-right language model pre-training, the MLM objective enables the representation to fuse the left and the right context, which allows us to pre-train a deep bidirectional Transformer. In addition to the masked language model, BERT uses a next sentence prediction task that jointly pre-trains text-pair representations.
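Below is a minimal, illustrative sketch of the MLM masking step described above, not code from this repository. It assumes a simple whitespace tokenizer and the standard BERT masking rates (15% of tokens selected; of those, 80% replaced with [MASK], 10% replaced with a random token, 10% left unchanged).

```python
# Illustrative MLM masking sketch (standard BERT rates, hypothetical helper).
import random

def mask_tokens(tokens, vocab, mask_prob=0.15):
    """Return (masked_tokens, labels). labels keeps the original token at
    masked positions (the prediction targets) and None elsewhere."""
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if random.random() < mask_prob:
            labels[i] = tok                       # target to predict
            r = random.random()
            if r < 0.8:
                masked[i] = "[MASK]"              # 80%: replace with [MASK]
            elif r < 0.9:
                masked[i] = random.choice(vocab)  # 10%: random token
            # remaining 10%: keep the original token
    return masked, labels

tokens = "the quick brown fox jumps over the lazy dog".split()
print(mask_tokens(tokens, sorted(set(tokens))))
```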

Prepare

Install packages

bash init_tf.sh

Download datasets

This Google Drive location contains the following files.
Download the tf1_ckpt folder, vocab.txt, and bert_config.json into one directory named bert_pretrain_ckpt_tf.

bert_pretrain_ckpt_tf: contains checkpoint files
    model.ckpt-28252.data-00000-of-00001
    model.ckpt-28252.index
    model.ckpt-28252.meta
    vocab.txt
    bert_config.json

Download and preprocess datasets

Create a directory named bert_pretrain_tf_records and store the preprocessing results in it. Tip: you can clone this repo in another location to run the preprocessing; only the bert_pretrain_tf_records results are needed here. A layout check sketch follows below.
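The sketch below is an optional sanity check, assuming the two directories above sit next to the training scripts. The file and directory names are taken from this README; adjust the paths if you keep the data elsewhere.

```python
# Sanity check for the expected local layout before launching training.
import os

required = [
    "bert_pretrain_ckpt_tf/model.ckpt-28252.data-00000-of-00001",
    "bert_pretrain_ckpt_tf/model.ckpt-28252.index",
    "bert_pretrain_ckpt_tf/model.ckpt-28252.meta",
    "bert_pretrain_ckpt_tf/vocab.txt",
    "bert_pretrain_ckpt_tf/bert_config.json",
    "bert_pretrain_tf_records",  # preprocessed tfrecord output directory
]

missing = [p for p in required if not os.path.exists(p)]
if missing:
    print("Missing:", *missing, sep="\n  ")
else:
    print("Checkpoint files and tfrecord directory are in place.")
```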

Training

Training on single card

bash run_1card_FPS.sh

Training on multiple cards

bash run_multi_card_FPS.sh 

Result

|            | acc      | fps      |
|------------|----------|----------|
| multi_card | 0.424126 | 0.267241 |