MindSpore/mindspore

[MS][User Interface-MarginRankingLoss] Test cases fail with AssertionError on Ascend

DONE
Bug-Report
Created on 2022-12-24 17:01

Describe the current behavior (Mandatory)

On the Ascend platform, the vast majority of the test cases show accuracy problems when the grad_cmp function is called.

Environment (Mandatory)

  • Hardware Environment (Ascend/GPU/CPU):

Please delete the backend not involved:
/device ascend

  • Software Environment (Mandatory):
    -- MindSpore version (e.g., 1.7.0.Bxxx) :
    -- Python version (e.g., Python 3.7.5) :
    -- OS platform and distribution (e.g., Linux Ubuntu 16.04):
    -- GCC/Compiler version (if compiled from source):

  • Execute Mode (Mandatory) (PyNative/Graph):

Please delete the mode not involved:
/mode pynative
/mode graph

Related testcase (Mandatory)

test_nn_marginrankingloss_reduction_default
test_nn_marginrankingloss_reduction_sum
test_nn_marginrankingloss_reduction_none
test_nn_marginrankingloss_input_dtype_float16
test_nn_marginrankingloss_input_dtype_float32
test_nn_marginrankingloss_input_dtype_float64
test_nn_marginrankingloss_input_attribute_margin
test_nn_marginrankingloss_1d_float32
test_nn_marginrankingloss_2d_float32
test_nn_marginrankingloss_3d_float16
test_nn_marginrankingloss_4d_float16
test_nn_marginrankingloss_5d_float32
test_nn_marginrankingloss_6d_float16
test_nn_marginrankingloss_7d_float16

Steps to reproduce the issue (Mandatory)

  1. test_nn_marginrankingloss_reduction_default

    def test_nn_marginrankingloss_reduction_default():
        logits1 = Tensor(np.random.randn(2, 2).astype(np.float32))
        logits2 = Tensor(np.random.randn(2, 2).astype(np.float32))
        labels = Tensor((np.random.randint(2, size=(2, 2)) * 2 - 1).astype(np.float32))
        input_list = [logits1, logits2, labels]
        fact = MarginRankingLossMock(inputs=input_list)
        fact.forward_cmp()
        fact.grad_cmp()

test_nn_marginrankingloss.py:30:


../share/ops/nn/marginrankingloss_ops.py:68: in grad_cmp
allclose_nparray(data_expected, data_me, self.loss, self.loss)
../share/utils.py:31: in allclose_nparray
_count_unequal_element(data_expected, data_me, rtol, atol)


data_expected = array([[0. , 0.07458219],
[0.07458219, 0. ]], dtype=float32)
data_me = array([[0., 0.],
[0., 0.]], dtype=float32), rtol = 0.001, atol = 0.001

def _count_unequal_element(data_expected, data_me, rtol, atol):
    assert data_expected.shape == data_me.shape
    total_count = len(data_expected.flatten())
    error = np.abs(data_expected - data_me)
    greater = np.greater(error, atol + np.abs(data_me) * rtol)
    loss_count = np.count_nonzero(greater)
    assert (loss_count / total_count) < rtol, \
        "\ndata_expected_std:{0}\ndata_me_error:{1}\nloss:{2}". \
          format(data_expected[greater], data_me[greater], error[greater])

E AssertionError:
E data_expected_std:[0.07458219 0.07458219]
E data_me_error:[0. 0.]
E loss:[0.07458219 0.07458219]

../share/utils.py:24: AssertionError

  2. test_nn_marginrankingloss_2d_float32

    def test_nn_marginrankingloss_2d_float32():
        logits1 = Tensor(np.random.randn(2, 2).astype(np.float32))
        logits2 = Tensor(np.random.randn(2, 2).astype(np.float32))
        labels = Tensor((np.random.randint(2, size=(2, 2)) * 2 - 1).astype(np.float32))
        input_list = [logits1, logits2, labels]
        fact = MarginRankingLossMock(inputs=input_list)
        fact.forward_cmp()
        fact.grad_cmp()

test_nn_marginrankingloss.py:208:


../share/ops/nn/marginrankingloss_ops.py:68: in grad_cmp
allclose_nparray(data_expected, data_me, self.loss, self.loss)
../share/utils.py:31: in allclose_nparray
_count_unequal_element(data_expected, data_me, rtol, atol)


data_expected = array([[ 0.04290303, -0.04290303],
[ 0.04290303, 0.04290303]], dtype=float32)
data_me = array([[0., 0.],
[0., 0.]], dtype=float32), rtol = 0.001, atol = 0.001

def _count_unequal_element(data_expected, data_me, rtol, atol):
    assert data_expected.shape == data_me.shape
    total_count = len(data_expected.flatten())
    error = np.abs(data_expected - data_me)
    greater = np.greater(error, atol + np.abs(data_me) * rtol)
    loss_count = np.count_nonzero(greater)
    assert (loss_count / total_count) < rtol, \
        "\ndata_expected_std:{0}\ndata_me_error:{1}\nloss:{2}". \
          format(data_expected[greater], data_me[greater], error[greater])

E AssertionError:
E data_expected_std:[ 0.04290303 -0.04290303 0.04290303 0.04290303]
E data_me_error:[0. 0. 0. 0.]
E loss:[0.04290303 0.04290303 0.04290303 0.04290303]

../share/utils.py:24: AssertionError
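To make the failure condition concrete, the tolerance check from the traceback can be replayed on the arrays reported for the first test case. This is a minimal NumPy sketch: `mismatch_fraction` is a hypothetical helper name that mirrors the logic of `_count_unequal_element` shown above, and the input values are copied from the log rather than recomputed on Ascend.

```python
import numpy as np


def mismatch_fraction(data_expected, data_me, rtol, atol):
    """Fraction of elements whose error exceeds atol + |data_me| * rtol.

    Hypothetical helper mirroring _count_unequal_element from the traceback.
    """
    error = np.abs(data_expected - data_me)
    greater = np.greater(error, atol + np.abs(data_me) * rtol)
    return np.count_nonzero(greater) / data_expected.size


# Values copied from the first failing case in the log.
data_expected = np.array([[0.0, 0.07458219],
                          [0.07458219, 0.0]], dtype=np.float32)
data_me = np.zeros((2, 2), dtype=np.float32)

fraction = mismatch_fraction(data_expected, data_me, rtol=0.001, atol=0.001)
print(fraction)  # 0.5
```

Two of the four elements differ by about 0.0746, far above the 0.001 tolerance, so the mismatch fraction is 0.5. The assertion requires this fraction to be below `rtol` (0.001), which is why the test fails.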

Describe the expected behavior (Mandatory)

All test cases pass.

Related log / screenshot (Mandatory)

Special notes for this issue (Optional)

Comments (5)

沈竞兴 created the Bug-Report
沈竞兴 added the kind/bug label
沈竞兴 added the sig/ops label
沈竞兴 added the v2.0.0.rc1 label

Please assign a maintainer to check this issue.
@沈竞兴

Please add labels (comp or sig). You can also visit https://gitee.com/mindspore/community/blob/master/sigs/dx/docs/labels.md to find more.
To get your code reviewed as soon as possible, please add a component (comp) or special interest group (sig) label to your Pull Request; a labeled PR can be routed directly to the responsible reviewer.
For example, if you are submitting code for the data component, you can comment:
//comp/data
You can also invite the data SIG to review the code by writing:
//sig/data
You can additionally mark the type of this PR, for example as a bugfix or a feature request:
//kind/bug or //kind/feature
Congratulations, you now know how to add labels with commands. Go ahead and add them in the comments below!

沈竞兴 edited the description
gaoshuanglong changed the assignee from gaoshuanglong to 冯一航
gaoshuanglong set the milestone to B-SIG-OPS
冯一航 changed the milestone from B-SIG-OPS to B-SIG-FrontEnd

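Initial investigation found that the P.Maximum operator computes the backward pass incorrectly in static graph mode when its first argument is a number. This makes the backward pass of MarginRankingLoss wrong, which is why the test cases fail. At the interface level, the error is worked around by swapping the order of the arguments. The operator's backward-pass anomaly still needs to be tracked.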

Appearance & Root Cause

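In static graph mode, the P.Maximum operator computes the backward pass incorrectly when its first argument is a number.

Fix Solution

The error is worked around by swapping the order of the arguments at the interface level.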
PR:
!47572:modify marginrankingloss errors
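
As an illustration of why swapping the arguments is a safe workaround, the sketch below uses NumPy only, not MindSpore, and does not reproduce the code in the PR. It assumes the standard margin ranking loss definition, mean(max(-y * (x1 - x2) + margin, 0)). The function names and input values are hypothetical. It checks that the element-wise maximum gives the same result in either argument order, and that the analytic gradient with respect to the first input matches a finite-difference estimate.

```python
import numpy as np


def margin_ranking_loss(x1, x2, y, margin=0.0):
    """Mean-reduced loss, assuming the standard definition."""
    z = -y * (x1 - x2) + margin
    # Array operand first, scalar second (the swapped order).
    return np.mean(np.maximum(z, 0.0))


def grad_wrt_x1(x1, x2, y, margin=0.0):
    """Analytic gradient: -y / N where the hinge is active, else 0."""
    z = -y * (x1 - x2) + margin
    return np.where(z > 0, -y, 0.0) / z.size


def numeric_grad_wrt_x1(x1, x2, y, margin=0.0, eps=1e-6):
    """Central finite differences, one element at a time."""
    grad = np.zeros_like(x1)
    for idx in np.ndindex(x1.shape):
        step = np.zeros_like(x1)
        step[idx] = eps
        upper = margin_ranking_loss(x1 + step, x2, y, margin)
        lower = margin_ranking_loss(x1 - step, x2, y, margin)
        grad[idx] = (upper - lower) / (2 * eps)
    return grad


# Hypothetical fixed inputs, chosen so no element sits on the hinge.
x1 = np.array([[0.5, -1.0], [2.0, 0.1]])
x2 = np.array([[0.2, 0.3], [-0.5, 0.4]])
y = np.array([[1.0, -1.0], [-1.0, 1.0]])

z = -y * (x1 - x2)
same_forward = np.array_equal(np.maximum(0.0, z), np.maximum(z, 0.0))
loss = margin_ranking_loss(x1, x2, y)
analytic = grad_wrt_x1(x1, x2, y)
numeric = numeric_grad_wrt_x1(x1, x2, y)
```

With these inputs the hinge is active only in the second row, so the analytic gradient is zero in the first row and ±0.25 in the second. A correct backward pass produces non-zero entries wherever the hinge is active, regardless of which operand is the scalar.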

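冯一航 added 冯一航 as a collaborator
冯一航 changed the assignee from 冯一航 to gaoshuanglong
冯一航 removed gaoshuanglong as a collaborator
冯一航 changed the milestone from B-SIG-FrontEnd to B-SIG-OPS
冯一航 changed the task status from TODO to VALIDATION
冯一航 added the rct/bugfix label
冯一航 added the rca/others label
冯一航 added the ctl/solutiontest label
gaoshuanglong changed the assignee from gaoshuanglong to 沈竞兴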

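The accuracy problem has been fixed. The float64 type is not supported on the Ascend platform, and a separate issue has been filed for that.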

沈竞兴 changed the task status from VALIDATION to DONE
