This article describes how to use MindSpore Profiler for performance debugging on GPU.
For example, if the performance data is saved under /home/user/code/data/, the summary-base-dir should be /home/user/code. After MindInsight is started, open the visualization page using its IP address and port number. The default address is http://127.0.0.1:8080.
By default, common users do not have permission to access the NVIDIA GPU performance counters on the target device.
If common users need to use the profiler performance statistics capability in the training script, configure the permission by referring to the following description:
https://developer.nvidia.com/nvidia-development-tools-solutions-err-nvgpuctrperm-cupti
To enable performance profiling of a neural network, add the MindSpore Profiler APIs to the training script.
The MindSpore Profiler object needs to be initialized after set_context is called.
In multi-card training scenarios, the Profiler object needs to be initialized after set_auto_parallel_context is called.
On GPU, only the output_path parameter currently takes effect.
At the end of training, call Profiler.analyse to finish profiling and generate the performance analysis results.
The sample code is the same as that for Ascend: https://www.mindspore.cn/mindinsight/docs/en/r1.5/performance_profiling_ascend.html#preparing-the-training-script
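The order of calls described above can be sketched as follows. This is a minimal illustration, not the official sample: the helper name profile_training and the train_fn argument are hypothetical, and the MindSpore imports are placed inside the function so that the module can still be loaded on a machine without MindSpore installed.

```python
def profile_training(train_fn, output_path="./profiler_data"):
    """Run train_fn under the MindSpore Profiler on GPU (illustrative helper)."""
    # Lazy imports: MindSpore is only required when the function is called.
    from mindspore import context
    from mindspore.profiler import Profiler

    # The context must be set before the Profiler object is created.
    context.set_context(mode=context.GRAPH_MODE, device_target="GPU")

    # On GPU, output_path is the only Profiler parameter that takes effect.
    profiler = Profiler(output_path=output_path)
    try:
        # train_fn is a placeholder for the user's own training loop,
        # for example a function that calls model.train(...).
        train_fn()
    finally:
        # Finish profiling and generate the performance analysis results.
        profiler.analyse()
```

In a multi-card script, the set_auto_parallel_context call would also have to come before the Profiler is created.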
In GPU scenarios, users can write a custom callback to collect performance data for selected steps only. This mode is not supported for the data preparation stage or in data sinking mode.
The following is an example:
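from mindspore.profiler import Profiler
from mindspore.train.callback import Callback


class StopAtStep(Callback):
    """Profile only the steps from start_step to stop_step."""

    def __init__(self, start_step, stop_step):
        super(StopAtStep, self).__init__()
        self.start_step = start_step
        self.stop_step = stop_step
        self.profiler = None
        self.already_analysed = False

    def step_begin(self, run_context):
        cb_params = run_context.original_args()
        step_num = cb_params.cur_step_num
        # Start collecting when the first step of interest begins.
        if step_num == self.start_step:
            self.profiler = Profiler()

    def step_end(self, run_context):
        cb_params = run_context.original_args()
        step_num = cb_params.cur_step_num
        # Stop collecting and generate the results after the last step of interest.
        if step_num == self.stop_step and self.profiler is not None and not self.already_analysed:
            self.profiler.analyse()
            self.already_analysed = True

    def end(self, run_context):
        # If training ended before stop_step, analyse whatever was collected.
        if self.profiler is not None and not self.already_analysed:
            self.profiler.analyse()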
The code above is only an example. Users should implement their own callback to suit their training script.
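To check which steps such a callback would cover before running a real training job, the same start and stop logic can be exercised without MindSpore. The sketch below is an illustration under stated assumptions: FakeProfiler, StepWindow and simulate are hypothetical stand-ins, and steps are assumed to be numbered from 1.

```python
class FakeProfiler:
    """Stand-in for the real Profiler; it only records that analyse() ran."""

    def __init__(self):
        self.analysed = False

    def analyse(self):
        self.analysed = True


class StepWindow:
    """Framework-free copy of the callback's start/stop logic."""

    def __init__(self, start_step, stop_step):
        self.start_step = start_step
        self.stop_step = stop_step
        self.profiler = None
        self.already_analysed = False
        self.events = []  # (event name, step) pairs, kept for inspection

    def step_begin(self, step_num):
        if step_num == self.start_step:
            self.profiler = FakeProfiler()
            self.events.append(("start", step_num))

    def step_end(self, step_num):
        if (step_num == self.stop_step and self.profiler is not None
                and not self.already_analysed):
            self.profiler.analyse()
            self.already_analysed = True
            self.events.append(("analyse", step_num))

    def end(self):
        # Training finished before stop_step: analyse what was collected.
        if self.profiler is not None and not self.already_analysed:
            self.profiler.analyse()
            self.already_analysed = True
            self.events.append(("analyse", "end"))


def simulate(total_steps, start_step, stop_step):
    """Drive the window through total_steps steps, numbered from 1."""
    window = StepWindow(start_step, stop_step)
    for step in range(1, total_steps + 1):
        window.step_begin(step)
        window.step_end(step)
    window.end()
    return window.events


print(simulate(10, 2, 5))  # [('start', 2), ('analyse', 5)]
```

If training stops before stop_step is reached, the analysis happens in end instead; if start_step is never reached, nothing is collected and nothing is analysed.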
For the MindInsight launch command, see MindInsight Commands.
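As a rough illustration of how the launch arguments relate to the profiler output directory, the sketch below builds the argument list for mindinsight start. It assumes, following the directory example earlier in this article, that summary-base-dir is the parent of the directory holding the performance data; the helper name mindinsight_start_args is hypothetical, and the command is only printed, not executed.

```python
from pathlib import PurePosixPath


def mindinsight_start_args(profiler_output_path, port=8080):
    """Build the argument list for launching MindInsight (illustrative helper)."""
    # Assumption: summary-base-dir is the parent directory of the
    # profiler output path, as in the /home/user/code/data/ example.
    summary_base_dir = PurePosixPath(profiler_output_path).parent
    return [
        "mindinsight", "start",
        "--summary-base-dir", str(summary_base_dir),
        "--port", str(port),
    ]


args = mindinsight_start_args("/home/user/code/data/")
print(" ".join(args))
# mindinsight start --summary-base-dir /home/user/code --port 8080
```

The list form can be passed to subprocess.run on a machine where MindInsight is installed.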
To open Training Performance, select a specific training job from the training list and click its performance profiling link. Training Performance currently supports only Operator Performance Analysis, Timeline Analysis, Step Trace Analysis and Data Preparation Analysis; other modules will be released later.
Figure 1: Overall Performance
Figure 1 displays the overall performance of the training, including summary data for Step Trace, Operator Performance, Data Preparation Performance and Timeline.
Users can click the detail link to see the details of each component.
The operator performance analysis component displays the execution time of operators when running MindSpore (including GPU operators, CUDA kernels and HOSTCPU operators).
Figure 2: Statistics for Operator Types
Figure 2 displays the statistics for the operator types, including:
The bottom half of Figure 2 displays the statistics table for the operators' details, including:
Figure 3: Statistics for Kernel Activities
Figure 3 displays the statistics for the Kernel, including:
The usage is almost the same as that in Ascend. The difference is that the GPU Timeline displays the operation information and CUDA activity.
The usage is described as follows:
The usage is almost the same as that in Ascend. (Note that step trace does not support heterogeneous training scenarios.)
The usage is described as follows:
The usage is almost the same as that in Ascend.
The usage is described as follows:
Resource utilization includes CPU usage analysis.
Figure 4: Overview of resource utilization
The overview of resource utilization includes CPU utilization analysis. You can view the details by clicking the View Details button in the upper right corner.
The usage is almost the same as that in Ascend.
The usage is described as follows: