# Benchmarks

[![View Source On Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_source_en.png)](https://gitee.com/mindspore/docs/blob/r1.8/docs/mindspore/source_en/note/benchmark.md)

This document reports MindSpore training performance benchmarks on Ascend 910 processors.
For details about the networks themselves, see [ModelZoo](https://gitee.com/mindspore/models/blob/r1.8/README.md#).

## Training Performance

### ResNet

| Network | Network Type | Dataset | MindSpore Version | Resource | Precision | Batch Size | Throughput | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 v1.5 | CNN | ImageNet2012 | 0.5.0-beta | Ascend: 1 * Ascend 910 <br> CPU: 24 Cores | Mixed | 256 | 2115 images/sec | - |
|  |  |  |  | Ascend: 8 * Ascend 910 <br> CPU: 192 Cores | Mixed | 256 | 16600 images/sec | 0.98 |
|  |  |  |  | Ascend: 16 * Ascend 910 <br> CPU: 384 Cores | Mixed | 256 | 32768 images/sec | 0.96 |

1. The figures above were measured on ModelArts, the HUAWEI CLOUD AI development platform. They are averages over the entire training run on the Ascend 910 AI processor.
2. For the same network in another open source framework, see [ResNet-50 v1.5 for TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Classification/ConvNets/resnet50v1.5).
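
The Speedup column is not defined in this document. The listed values are consistent with measured throughput divided by ideal linear scaling of the single-device throughput, and the sketch below checks that reading against the ResNet-50 numbers. Treat the formula and the helper name `scaling_efficiency` as assumptions inferred from the table, not as the documented method.

```python
def scaling_efficiency(throughput, devices, baseline_throughput):
    """Measured throughput divided by ideal linear scaling from one device.

    Assumption: this is how the Speedup column was derived; the source
    document does not state the formula.
    """
    return throughput / (devices * baseline_throughput)


# ResNet-50 v1.5 throughput from the table above (images/sec).
baseline = 2115                   # 1 * Ascend 910
measured = {8: 16600, 16: 32768}  # device count -> throughput

for devices, throughput in measured.items():
    eff = scaling_efficiency(throughput, devices, baseline)
    print(f"{devices} devices: {eff:.3f}")
```

Under this assumption the 8-device and 16-device runs come out at roughly 0.981 and 0.968, in line with the 0.98 and 0.96 listed in the table.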

### BERT

| Network | Network Type | Dataset | MindSpore Version | Resource | Precision | Batch Size | Throughput | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BERT-Large | Attention | zhwiki | 0.5.0-beta | Ascend: 1 * Ascend 910 <br> CPU: 24 Cores | Mixed | 96 | 269 sentences/sec | - |
|  |  |  |  | Ascend: 8 * Ascend 910 <br> CPU: 192 Cores | Mixed | 96 | 2069 sentences/sec | 0.96 |

1. The figures above were measured on ModelArts, the HUAWEI CLOUD AI development platform. The network has 24 hidden layers, the sequence length is 128 tokens, and the vocabulary has 21128 tokens.
2. For the same network in another open source framework, see [BERT For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT).

### Wide & Deep (data parallel)

| Network | Network Type | Dataset | MindSpore Version | Resource | Precision | Batch Size | Throughput | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wide & Deep | Recommendation | Criteo | 0.6.0-beta | Ascend: 1 * Ascend 910 <br> CPU: 24 Cores | Mixed | 16000 | 796892 samples/sec | - |
|  |  |  |  | Ascend: 8 * Ascend 910 <br> CPU: 192 Cores | Mixed | 16000 * 8 | 4872849 samples/sec | 0.76 |

1. The figures above were measured on Atlas 800, with the model trained in data parallel mode.
2. For the same network in another open source framework, see [Wide & Deep For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Recommendation/WideAndDeep).
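
To relate the batch size and throughput columns, the sketch below converts samples per second into training steps per second. It assumes that `16000 * 8` means a per-device batch of 16000 on each of 8 devices and that the throughput is the aggregate over all devices. Neither is stated in this document, and `steps_per_second` is a hypothetical helper introduced only for illustration.

```python
def steps_per_second(samples_per_sec, per_device_batch, devices):
    """Training steps per second implied by an aggregate throughput.

    Assumption: the global batch is per_device_batch * devices and
    samples_per_sec is summed over all devices.
    """
    global_batch = per_device_batch * devices
    return samples_per_sec / global_batch


# Wide & Deep (data parallel) figures from the table above.
configs = [
    (1, 16000, 796892),   # devices, per-device batch, samples/sec
    (8, 16000, 4872849),
]

for devices, batch, throughput in configs:
    rate = steps_per_second(throughput, batch, devices)
    print(f"{devices} device(s): {rate:.1f} steps/sec, {1000 / rate:.1f} ms/step")
```

Under these assumptions the step rate falls from about 49.8 to about 38.1 steps per second when going from 1 to 8 devices, which matches the 0.76 in the Speedup column.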

### Wide & Deep (Host-Device model parallel)

| Network | Network Type | Dataset | MindSpore Version | Resource | Precision | Batch Size | Throughput | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wide & Deep | Recommendation | Criteo | 0.6.0-beta | Ascend: 1 * Ascend 910 <br> CPU: 24 Cores | Mixed | 1000 | 68715 samples/sec | - |
|  |  |  |  | Ascend: 8 * Ascend 910 <br> CPU: 192 Cores | Mixed | 8000 * 8 | 283830 samples/sec | 0.51 |
|  |  |  |  | Ascend: 16 * Ascend 910 <br> CPU: 384 Cores | Mixed | 8000 * 16 | 377848 samples/sec | 0.34 |
|  |  |  |  | Ascend: 32 * Ascend 910 <br> CPU: 768 Cores | Mixed | 8000 * 32 | 433423 samples/sec | 0.20 |

1. The figures above were measured on Atlas 800, with the model trained in Host-Device model parallel mode.
2. For the same network in another open source framework, see [Wide & Deep For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Recommendation/WideAndDeep).
