NPU cannot index a tensor with a bool tensor (NotImplementedError)

`linux-aarch64 cann 8.0.RC2 torch 2.1.0 torch-npu 2.1.0post8`

On CPU and GPU, a tensor can be indexed with a bool tensor, e.g. `values[masks == 1]` to get a new tensor, but this does not work on NPU.
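For reference, boolean-mask indexing in PyTorch amounts to resolving the mask to integer positions (a `nonzero` step) and then gathering those positions. The sketch below spells that out for a 1-D mask and resolves the mask on the CPU. `index_with_cpu_mask` is a hypothetical helper name, and whether this avoids the failure on NPU is an untested assumption; the code here only runs on CPU tensors.

```python
import torch


def index_with_cpu_mask(values: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Select values[mask] along dim 0 for a 1-D bool mask (hypothetical helper)."""
    # Resolve the boolean mask to integer positions on the CPU.
    idx = torch.nonzero(mask.cpu(), as_tuple=False).squeeze(1)
    # Move only the integer indices to the tensor's device and gather there.
    return values.index_select(0, idx.to(values.device))


x = torch.tensor([-0.9549, -0.0406, -0.9176, 0.7385])
ib = torch.tensor([True, True, False, False])
selected = index_with_cpu_mask(x, ib)
```

On CPU, `selected` matches `x[ib]`. The mask makes a round trip to the host, so this is at best a stopgap and not a replacement for native support.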

**The relevant operation has probably not been implemented or overloaded for NPU. Could the maintainers please implement it?**

CPU version

```
(ll3da) [ma-user work]$python
Python 3.8.20 | packaged by conda-forge | (default, Sep 30 2024, 17:47:05) 
[GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch, torch_npu
x = /home/ma-user/work/miniforge3/envs/ll3da/lib/python3.8/site-packages/torch_npu/utils/collect_env.py:59: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/latest owner does not match the current owner.
  warnings.warn(f"Warning: The {path} owner does not match the current owner.")
/home/ma-user/work/miniforge3/envs/ll3da/lib/python3.8/site-packages/torch_npu/utils/collect_env.py:59: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/8.0.RC2/aarch64-linux/ascend_toolkit_install.info owner does not match the current owner.
  warnings.warn(f"Warning: The {path} owner does not match the current owner.")
/home/ma-user/work/miniforge3/envs/ll3da/lib/python3.8/site-packages/torch_npu/__init__.py:234: UserWarning: On the interactive interface, the value of TASK_QUEUE_ENABLE is set to 0 by default.                      Do not set it to 1 to prevent some unknown errors
  warnings.warn("On the interactive interface, the value of TASK_QUEUE_ENABLE is set to 0 by default. \
>>> x = torch.randn(4)
>>> x.shape
torch.Size([4])
>>> x
tensor([-0.9549, -0.0406, -0.9176,  0.7385])
>>> ib = torch.tensor([True, True, False, False])
>>> ib
tensor([ True,  True, False, False])
>>> type(ib)
<class 'torch.Tensor'>
>>> x[ib]
tensor([-0.9549, -0.0406])
```

Running the same code on NPU fails with NotImplementedError

```
>>> x, ib = x.npu(), ib.npu()
>>> x,ib
[W compiler_depend.ts:137] Warning: Warning: Device do not support double dtype now, dtype cast repalce with float. (function operator())
(tensor([0.7828, 1.8992, 0.3072, 0.6014], device='npu:0'), tensor([ True,  True, False, False], device='npu:0'))
>>> x[ib]
/
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: InnerRun:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:216 OPS function error: NonZero, error code is 500002
[ERROR] 2024-12-02-20:37:21 (PID:2438447, Device:0, RankID:-1) ERR01100 OPS call acl api failed
[Error]: A GE error occurs in the system.
        Rectify the fault based on the error information in the ascend log.
E40021: 2024-12-02-20:37:21.407.830 Failed to compile Op [NonZero13_Output0CastAfter]. (oppath: [Compile /usr/local/Ascend/ascend-toolkit/8.0.RC2/opp/built-in/op_impl/ai_core/tbe/impl/dynamic/cast.py failed with errormsg/stack: File "/home/ma-user/work/miniforge3/envs/ll3da/lib/python3.8/multiprocessing/queues.py", line 120, in qsize
    return self._maxsize - self._sem._semlock._get_value()
NotImplementedError
], optype: [Cast])
        Solution: See the host log for details, and then check the Python stack where the error log is reported.
        TraceBack (most recent call last):
        Compile op[NonZero13_Output0CastAfter] failed, oppath[/usr/local/Ascend/ascend-toolkit/8.0.RC2/opp/built-in/op_impl/ai_core/tbe/impl/dynamic/cast.py], optype[Cast], taskID[23]. Please check op's compilation error message.[FUNC:ReportBuildErrMessage][FILE:fusion_manager.cc][LINE:748]
        [SubGraphOpt][Compile][ProcFailedCompTask] Thread[281472536941024] recompile single op[NonZero13_Output0CastAfter] failed[FUNC:ProcessAllFailedCompileTasks][FILE:tbe_op_store_adapter.cc][LINE:961]
        [SubGraphOpt][Compile][ParalCompOp] Thread[281472536941024] process fail task failed[FUNC:ParallelCompileOp][FILE:tbe_op_store_adapter.cc][LINE:1009]
        [SubGraphOpt][Compile][CompOpOnly] CompileOp failed.[FUNC:CompileOpOnly][FILE:op_compiler.cc][LINE:1112]
        [GraphOpt][FusedGraph][RunCompile] Failed to compile graph with compiler Normal mode Op Compiler[FUNC:SubGraphCompile][FILE:fe_graph_optimizer.cc][LINE:1420]
        Call OptimizeFusedGraph failed, ret:-1, engine_name:AIcoreEngine, graph_name:partition0_rank1_new_sub_graph1[FUNC:OptimizeSubGraph][FILE:graph_optimize.cc][LINE:119]
        subgraph 0 optimize failed[FUNC:OptimizeSubGraphWithMultiThreads][FILE:graph_manager.cc][LINE:1012]
        build graph failed, graph id:12, ret:-1[FUNC:BuildModelWithGraphId][FILE:ge_generator.cc][LINE:1608]
        [Build][SingleOpModel]call ge interface generator.BuildSingleOpModel failed. ge result = 4294967295[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
        [Build][Op]Fail to build op model[FUNC:ReportInnerError][FILE:log_inner.cpp][LINE:145]
        build op model failed, result = 500002[FUNC:ReportInnerError][FILE:log_inner.cpp][LINE:145]
```
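Note that the `NotImplementedError` in the traceback above is raised inside Python's `multiprocessing.Queue.qsize()` while the `Cast` op is being compiled, not by the indexing call itself. The Python documentation states that `qsize()` may raise `NotImplementedError` on platforms where `sem_getvalue()` is not implemented. A minimal sketch to check whether the current environment behaves that way follows. `qsize_supported` is a hypothetical helper name, and whether this is the root cause of the failure here is an assumption.

```python
import multiprocessing


def qsize_supported() -> bool:
    """Return True if multiprocessing.Queue.qsize() works in this environment."""
    try:
        q = multiprocessing.Queue()
    except (OSError, ImportError):
        # Semaphores are not available at all on this platform.
        return False
    try:
        q.qsize()
        return True
    except NotImplementedError:
        # Same error type as the one reported in the op compilation log.
        return False
    finally:
        # Release the queue's pipe handles.
        q.close()
        q.join_thread()


supported = qsize_supported()
```

If `supported` is `False` in the same conda environment, the failure may come from the Python runtime used by the op compiler and not from a missing NPU kernel.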
