name | about | labels |
---|---|---|
Bug Report | Use this template for reporting a bug | kind/bug |
While porting the big_bird model, tensor advanced indexing produced a result different from PyTorch's. A minimal example demonstrating the problem is given below.
Hardware Environment (Ascend/GPU/CPU) / 硬件环境: CPU
Software Environment / 软件环境 (Mandatory):
- MindSpore version (e.g., 2.2.11):
- Python version (e.g., Python 3.9.16):
- OS platform and distribution (e.g., Linux Ubuntu 18.04):
- GCC/Compiler version (if compiled from source): 7.5.0
Execution Mode (PyNative/Graph) / 执行模式 (Mandatory):
Command executed in the project: `pytest -v -s tests/ut/transformers/models/big_bird`

The minimal example below can be run directly.
```python
import torch

i2 = torch.zeros((14, 3), dtype=int)
p1 = p2 = 0
right_slice = torch.ones((8, 24))
attn_probs_view = torch.ones((7, 4, 16, 8, 16, 8))
attn_probs_view[p1, p2, 1, :, i2[0]] = right_slice.view(8, 3, 8)
attn_probs_view[p1, p2, 1, :, i2[0]].shape  # torch.Size([8, 3, 8])
```
```python
import mindspore
from mindspore import ops

i2 = ops.zeros((14, 3), dtype=mindspore.int32)
p1 = p2 = 0
right_slice = ops.ones((8, 24))
attn_probs_view = ops.ones((7, 4, 16, 8, 16, 8))
attn_probs_view[p1, p2, 1, :, i2[0]] = right_slice.view(8, 3, 8)
```
With identical code, PyTorch's `attn_probs_view[p1, p2, 1, :, i2[0]].shape` is `(8, 3, 8)`, while MindSpore's is `(3, 8, 8)`. Assigning the `(8, 3, 8)` right-hand side therefore fails in:

```
mindspore/core/ops/broadcast_to.cc:97 BroadcastToInferShape
```
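For reference, NumPy applies the same rule as MindSpore here (as the comparison later in this thread also shows): when the advanced indices — the leading integers and the index array — are separated by a slice, the broadcast advanced-index dimensions are moved to the front of the result. A minimal NumPy sketch of the shape in question:

```python
import numpy as np

i2 = np.zeros((14, 3), dtype=np.int32)
attn_probs_view = np.ones((7, 4, 16, 8, 16, 8))

# The integer indices 0, 0, 1 and the index array i2[0] (shape (3,)) are
# separated by the slice at axis 3, so NumPy places the broadcast
# advanced-index dimension (3,) first, followed by the basic dims (8, 8).
print(attn_probs_view[0, 0, 1, :, i2[0]].shape)  # (3, 8, 8)
```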
Please assign a maintainer to check this issue.
@fangwenyi @chengxiaoli @Shawny
Changing the indexing to `attn_probs_view[p1, p2, 1][:, i2[0]]` works around the difference.
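A sketch of this workaround in NumPy, whose indexing behavior matches MindSpore's in this case: splitting the indexing into two steps leaves a single, contiguous advanced index, so its dimensions stay at their original axis and the result matches PyTorch's layout.

```python
import numpy as np

i2 = np.zeros((14, 3), dtype=np.int32)
p1 = p2 = 0
attn_probs_view = np.ones((7, 4, 16, 8, 16, 8))

# Combined indexing: advanced dims are separated by the slice
# and move to the front -> (3, 8, 8)
print(attn_probs_view[p1, p2, 1, :, i2[0]].shape)  # (3, 8, 8)

# Two-step workaround: only one advanced index remains, nothing
# separates it, so its dims stay at axis 1 -> (8, 3, 8)
print(attn_probs_view[p1, p2, 1][:, i2[0]].shape)  # (8, 3, 8)
```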
Hello, we have received the issue and assigned someone to analyze it.
After slicing, the shape does not match the expected one:
```python
import mindspore
from mindspore import ops

i2 = ops.zeros((14, 3), dtype=mindspore.int32)
p1 = p2 = 0
right_slice = ops.ones((8, 24))
attn_probs_view = ops.ones((7, 4, 16, 8, 16, 8))
attn_probs_view[p1, p2, 1, :, i2[0]].shape
# (3, 8, 8)
```
```python
import torch

i2 = torch.zeros((14, 3), dtype=int)
p1 = p2 = 0
right_slice = torch.ones((8, 24))
attn_probs_view = torch.ones((7, 4, 16, 8, 16, 8))
attn_probs_view[p1, p2, 1, :, i2[0]].shape
# torch.Size([8, 3, 8])
```
```python
import numpy as np
import torch
import mindspore as ms
from mindspore import ops

i2 = torch.tensor([0, 0, 0], dtype=int)
attn_probs_view = torch.ones((7, 4, 16, 8, 16, 8))
print(attn_probs_view[0, 0, 1, :, i2].shape)
# torch.Size([8, 3, 8])

i2 = ms.Tensor([0, 0, 0], dtype=ms.int32)
attn_probs_view = ops.ones((7, 4, 16, 8, 16, 8))
print(attn_probs_view[0, 0, 1, :, i2].shape)
# (3, 8, 8)

i2 = np.array([0, 0, 0], dtype=np.int32)
attn_probs_view = np.ones((7, 4, 16, 8, 16, 8))
print(attn_probs_view[0, 0, 1, :, i2].shape)
# (3, 8, 8)
```
```python
import numpy as np
import torch
import mindspore as ms
from mindspore import ops

print("numpy:")
data = np.arange(24).reshape(2, 3, 4)
print(data[0, :, [0, 0]])
print(data[0, :, [0, 0]].shape)
# [[0 4 8]
#  [0 4 8]]
# (2, 3)
print(data[0][:, [0, 0]])
print(data[0][:, [0, 0]].shape)
# [[0 0]
#  [4 4]
#  [8 8]]
# (3, 2)

print("\npytorch:")
data = torch.tensor(np.arange(24).reshape(2, 3, 4))
print(data[0, :, [0, 0]])
print(data[0, :, [0, 0]].shape)
# tensor([[0, 0],
#         [4, 4],
#         [8, 8]])
# torch.Size([3, 2])
print(data[0][:, [0, 0]])
print(data[0][:, [0, 0]].shape)
# tensor([[0, 0],
#         [4, 4],
#         [8, 8]])
# torch.Size([3, 2])

print("\nmindspore:")
data = ops.Tensor(np.arange(24).reshape(2, 3, 4))
print(data[0, :, [0, 0]])
print(data[0, :, [0, 0]].shape)
# [[0 4 8]
#  [0 4 8]]
# (2, 3)
print(data[0][:, [0, 0]])
print(data[0][:, [0, 0]].shape)
# [[0 0]
#  [4 4]
#  [8 8]]
# (3, 2)
```
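One way to read the outputs above: NumPy (and MindSpore) treat the leading integer indices as 0-d advanced indices, so the slice separates two groups of advanced indices and the broadcast advanced dimensions move to the front of the result; PyTorch appears to treat the leading integers as basic indexing, leaving a single contiguous advanced index whose dimensions stay in place. The two layouts hold the same elements, just transposed, which a short NumPy sketch can check:

```python
import numpy as np

data = np.arange(24).reshape(2, 3, 4)

combined = data[0, :, [0, 0]]   # advanced dims moved to the front -> (2, 3)
stepwise = data[0][:, [0, 0]]   # single advanced index stays in place -> (3, 2)

# Same elements, transposed layout.
print(np.array_equal(combined, stepwise.T))  # True
```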
Hello, since there has been no reply on this issue, we will close it later. If you still have questions, please provide the specific details and change the issue status to WIP, and we will follow up. Thank you.