宦晓玲 committed on 2023-07-21 16:52: modify the md links 1.8

Benchmarks


This document describes the MindSpore benchmarks. For details about the MindSpore networks, see ModelZoo.

Training Performance

ResNet

| Network | Network Type | Dataset | MindSpore Version | Resource | Precision | Batch Size | Throughput | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 v1.5 | CNN | ImageNet2012 | 0.5.0-beta | Ascend: 1 * Ascend 910<br>CPU: 24 Cores | Mixed | 256 | 2115 images/sec | - |
| | | | | Ascend: 8 * Ascend 910<br>CPU: 192 Cores | Mixed | 256 | 16600 images/sec | 0.98 |
| | | | | Ascend: 16 * Ascend 910<br>CPU: 384 Cores | Mixed | 256 | 32768 images/sec | 0.96 |

  1. The figures above were measured on ModelArts, the HUAWEI CLOUD AI development platform. They are the average throughput of the Ascend 910 AI processor over the entire training run.
  2. For the corresponding results on other open source frameworks, see ResNet-50 v1.5 for TensorFlow.
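
This document does not define the Speedup column. The listed values are consistent with scaling efficiency, that is, the measured multi-device throughput divided by the single-device throughput times the number of devices. The sketch below works under that assumption; the function name `scaling_efficiency` is hypothetical and only the throughput figures come from the ResNet-50 table.

```python
def scaling_efficiency(throughput_n, n_devices, throughput_1):
    """Measured throughput divided by ideal linear-scaling throughput.

    Assumption: this is how the Speedup column is derived; the source
    document does not state the formula.
    """
    return throughput_n / (n_devices * throughput_1)


# ResNet-50 v1.5 throughput in images/sec, keyed by number of Ascend 910 devices.
resnet50 = {1: 2115, 8: 16600, 16: 32768}

baseline = resnet50[1]
for n_devices, throughput in resnet50.items():
    if n_devices == 1:
        continue  # the single-device row is the baseline and has no Speedup value
    eff = scaling_efficiency(throughput, n_devices, baseline)
    print(f"{n_devices} devices: {eff:.3f}")
```

Under this reading the 8-device and 16-device runs come out at roughly 0.981 and 0.968, close to the listed 0.98 and 0.96. The small gap on the last digit suggests the table values may be truncated rather than rounded.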

BERT

| Network | Network Type | Dataset | MindSpore Version | Resource | Precision | Batch Size | Throughput | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BERT-Large | Attention | zhwiki | 0.5.0-beta | Ascend: 1 * Ascend 910<br>CPU: 24 Cores | Mixed | 96 | 269 sentences/sec | - |
| | | | | Ascend: 8 * Ascend 910<br>CPU: 192 Cores | Mixed | 96 | 2069 sentences/sec | 0.96 |

  1. The figures above were measured on ModelArts, the HUAWEI CLOUD AI development platform. The network has 24 hidden layers, the sequence length is 128 tokens, and the vocabulary contains 21128 tokens.
  2. For the corresponding results on other open source frameworks, see BERT For TensorFlow.

Wide & Deep (data parallel)

| Network | Network Type | Dataset | MindSpore Version | Resource | Precision | Batch Size | Throughput | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wide & Deep | Recommend | Criteo | 0.6.0-beta | Ascend: 1 * Ascend 910<br>CPU: 24 Cores | Mixed | 16000 | 796892 samples/sec | - |
| | | | | Ascend: 8 * Ascend 910<br>CPU: 192 Cores | Mixed | 16000*8 | 4872849 samples/sec | 0.76 |

  1. The figures above were measured on Atlas 800, with the model trained in data-parallel mode.
  2. For the corresponding results on other open source frameworks, see Wide & Deep For TensorFlow.
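
The batch size `16000*8` can be read as a per-device batch of 16000 on each of 8 devices, giving a global batch of 128000 samples per step. That reading is an assumption, since the document does not spell it out. On that basis, the sketch below estimates the average time per training step from the reported throughput; `step_time_ms` is a hypothetical helper, not a MindSpore API.

```python
def step_time_ms(global_batch, throughput):
    """Average time per training step in milliseconds.

    global_batch: samples consumed per step across all devices
    throughput:   reported samples/sec for the whole job
    """
    return 1000.0 * global_batch / throughput


# (devices, per-device batch, samples/sec) for Wide & Deep in data-parallel mode.
# Assumption: "16000*8" means a per-device batch of 16000 on 8 devices.
runs = [(1, 16000, 796892), (8, 16000, 4872849)]

for n_devices, per_device_batch, throughput in runs:
    global_batch = per_device_batch * n_devices
    ms = step_time_ms(global_batch, throughput)
    print(f"{n_devices} device(s): global batch {global_batch}, {ms:.2f} ms/step")
```

With these inputs the step time grows from about 20 ms on one device to about 26 ms on eight, and the ratio of the two (about 0.76) equals the listed Speedup value.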

Wide & Deep (Host-Device model parallel)

| Network | Network Type | Dataset | MindSpore Version | Resource | Precision | Batch Size | Throughput | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wide & Deep | Recommend | Criteo | 0.6.0-beta | Ascend: 1 * Ascend 910<br>CPU: 24 Cores | Mixed | 1000 | 68715 samples/sec | - |
| | | | | Ascend: 8 * Ascend 910<br>CPU: 192 Cores | Mixed | 8000*8 | 283830 samples/sec | 0.51 |
| | | | | Ascend: 16 * Ascend 910<br>CPU: 384 Cores | Mixed | 8000*16 | 377848 samples/sec | 0.34 |
| | | | | Ascend: 32 * Ascend 910<br>CPU: 768 Cores | Mixed | 8000*32 | 433423 samples/sec | 0.20 |

  1. The figures above were measured on Atlas 800, with the model trained in model-parallel mode.
  2. For the corresponding results on other open source frameworks, see Wide & Deep For TensorFlow.