PyTorch API Support List 1.5.0

Tensors

No. API Name Supported
1 torch.is_tensor
2 torch.is_storage
3 torch.is_complex Yes; the check itself is supported, but the current hardware does not support complex numbers
4 torch.is_floating_point
5 torch.set_default_dtype
6 torch.get_default_dtype
7 torch.set_default_tensor_type
8 torch.numel
9 torch.set_printoptions
10 torch.set_flush_denormal
11 torch.tensor
12 torch.sparse_coo_tensor
13 torch.as_tensor
14 torch.as_strided
15 torch.from_numpy
16 torch.zeros
17 torch.zeros_like
18 torch.ones
19 torch.ones_like
20 torch.arange
21 torch.range
22 torch.linspace
23 torch.logspace
24 torch.eye
25 torch.empty
26 torch.empty_like
27 torch.empty_strided
28 torch.full
29 torch.full_like
30 torch.quantize_per_tensor
31 torch.quantize_per_channel
32 torch.cat
33 torch.chunk
34 torch.gather
35 torch.index_select
36 torch.masked_select
37 torch.narrow
38 torch.nonzero
39 torch.reshape
40 torch.split
41 torch.squeeze
42 torch.stack
43 torch.t
44 torch.take
45 torch.transpose
46 torch.unbind
47 torch.unsqueeze
48 torch.where

Generators

No. API Name Supported
1 torch._C.Generator
2 torch._C.Generator.device
3 torch._C.Generator.get_state
4 torch._C.Generator.initial_seed
5 torch._C.Generator.manual_seed
6 torch._C.Generator.seed
7 torch._C.Generator.set_state

Random sampling

No. API Name Supported
1 torch.seed
2 torch.manual_seed
3 torch.initial_seed
4 torch.get_rng_state
5 torch.set_rng_state
6 torch.default_generator
7 torch.bernoulli
8 torch.multinomial
9 torch.normal
10 torch.poisson
11 torch.rand
12 torch.rand_like
13 torch.randint
14 torch.randint_like
15 torch.randn
16 torch.randn_like
17 torch.randperm
18 torch.Tensor.bernoulli_()
19 torch.Tensor.cauchy_()
20 torch.Tensor.exponential_()
21 torch.Tensor.geometric_()
22 torch.Tensor.log_normal_()
23 torch.Tensor.normal_()
24 torch.Tensor.random_()
25 torch.Tensor.uniform_()
26 torch.quasirandom.SobolEngine
27 torch.quasirandom.SobolEngine.draw
28 torch.quasirandom.SobolEngine.fast_forward
29 torch.quasirandom.SobolEngine.reset

Serialization

No. API Name Supported
1 torch.save
2 torch.load

Math operations

No. API Name Supported
1 torch.abs
2 torch.acos
3 torch.add
4 torch.addcdiv
5 torch.addcmul
6 torch.angle
7 torch.asin
8 torch.atan
9 torch.atan2
10 torch.bitwise_not
11 torch.bitwise_and
12 torch.bitwise_or
13 torch.bitwise_xor
14 torch.ceil
15 torch.clamp
16 torch.conj
17 torch.cos
18 torch.cosh
19 torch.div
20 torch.digamma
21 torch.erf
22 torch.erfc
23 torch.erfinv
24 torch.exp
25 torch.expm1
26 torch.floor
27 torch.floor_divide
28 torch.fmod
29 torch.frac
30 torch.imag
31 torch.lerp
32 torch.lgamma
33 torch.log
34 torch.log10
35 torch.log1p
36 torch.log2
37 torch.logical_and
38 torch.logical_not
39 torch.logical_or
40 torch.logical_xor
41 torch.mul
42 torch.mvlgamma
43 torch.neg
44 torch.polygamma
45 torch.pow
46 torch.real
47 torch.reciprocal
48 torch.remainder
49 torch.round
50 torch.rsqrt
51 torch.sigmoid
52 torch.sign
53 torch.sin
54 torch.sinh
55 torch.sqrt
56 torch.square
57 torch.tan
58 torch.tanh
59 torch.true_divide
60 torch.trunc
61 torch.argmax
62 torch.argmin
63 torch.dist
64 torch.logsumexp
65 torch.mean
66 torch.median
67 torch.mode
68 torch.norm
69 torch.prod
70 torch.std
71 torch.std_mean
72 torch.sum
73 torch.unique
74 torch.unique_consecutive
75 torch.var
76 torch.var_mean
77 torch.allclose
78 torch.argsort
79 torch.eq
80 torch.equal
81 torch.ge
82 torch.gt
83 torch.isfinite
84 torch.isinf
85 torch.isnan
86 torch.kthvalue
87 torch.le
88 torch.lt
89 torch.max
90 torch.min
91 torch.ne
92 torch.sort
93 torch.topk
94 torch.fft
95 torch.ifft
96 torch.rfft
97 torch.irfft
98 torch.stft
99 torch.bartlett_window
100 torch.blackman_window
101 torch.hamming_window
102 torch.hann_window
103 torch.bincount
104 torch.broadcast_tensors
105 torch.cartesian_prod
106 torch.cdist
107 torch.combinations
108 torch.cross
109 torch.cummax
110 torch.cummin
111 torch.cumprod
112 torch.cumsum
113 torch.diag
114 torch.diag_embed
115 torch.diagflat
116 torch.diagonal
117 torch.einsum
118 torch.flatten
119 torch.flip
120 torch.rot90
121 torch.histc
122 torch.meshgrid
123 torch.renorm
124 torch.repeat_interleave
125 torch.roll
126 torch.tensordot
127 torch.trace
128 torch.tril
129 torch.tril_indices
130 torch.triu
131 torch.triu_indices
132 torch.addbmm
133 torch.addmm
134 torch.addmv
135 torch.addr
136 torch.baddbmm
137 torch.bmm
138 torch.chain_matmul
139 torch.cholesky
140 torch.cholesky_inverse
141 torch.cholesky_solve
142 torch.dot
143 torch.eig
144 torch.geqrf
145 torch.ger
146 torch.inverse
147 torch.det
148 torch.logdet
149 torch.slogdet
150 torch.lstsq
151 torch.lu
152 torch.lu_solve
153 torch.lu_unpack
154 torch.matmul
155 torch.matrix_power
156 torch.matrix_rank
157 torch.mm
158 torch.mv
159 torch.orgqr
160 torch.ormqr
161 torch.pinverse
162 torch.qr
163 torch.solve
164 torch.svd
165 torch.svd_lowrank
166 torch.pca_lowrank
167 torch.symeig
168 torch.lobpcg
169 torch.trapz
170 torch.triangular_solve

Utilities

No. API Name Supported
1 torch.compiled_with_cxx11_abi
2 torch.result_type
3 torch.can_cast
4 torch.promote_types

Other

No. API Name Supported
1 torch.no_grad
2 torch.enable_grad
3 torch.set_grad_enabled
4 torch.get_num_threads
5 torch.set_num_threads
6 torch.get_num_interop_threads
7 torch.set_num_interop_threads

torch.Tensor

No. API Name Supported
1 torch.Tensor
2 torch.Tensor.new_tensor
3 torch.Tensor.new_full
4 torch.Tensor.new_empty
5 torch.Tensor.new_ones
6 torch.Tensor.new_zeros
7 torch.Tensor.is_cuda
8 torch.Tensor.is_quantized
9 torch.Tensor.device
10 torch.Tensor.ndim
11 torch.Tensor.T
12 torch.Tensor.abs
13 torch.Tensor.abs_
14 torch.Tensor.acos
15 torch.Tensor.acos_
16 torch.Tensor.add
17 torch.Tensor.add_
18 torch.Tensor.addbmm
19 torch.Tensor.addbmm_
20 torch.Tensor.addcdiv
21 torch.Tensor.addcdiv_
22 torch.Tensor.addcmul
23 torch.Tensor.addcmul_
24 torch.Tensor.addmm
25 torch.Tensor.addmm_
26 torch.Tensor.addmv
27 torch.Tensor.addmv_
28 torch.Tensor.addr
29 torch.Tensor.addr_
30 torch.Tensor.allclose
31 torch.Tensor.angle
32 torch.Tensor.apply_
33 torch.Tensor.argmax
34 torch.Tensor.argmin
35 torch.Tensor.argsort
36 torch.Tensor.asin
37 torch.Tensor.asin_
38 torch.Tensor.as_strided
39 torch.Tensor.atan
40 torch.Tensor.atan2
41 torch.Tensor.atan2_
42 torch.Tensor.atan_
43 torch.Tensor.baddbmm
44 torch.Tensor.baddbmm_
45 torch.Tensor.bernoulli
46 torch.Tensor.bernoulli_
47 torch.Tensor.bfloat16
48 torch.Tensor.bincount
49 torch.Tensor.bitwise_not
50 torch.Tensor.bitwise_not_
51 torch.Tensor.bitwise_and
52 torch.Tensor.bitwise_and_
53 torch.Tensor.bitwise_or
54 torch.Tensor.bitwise_or_
55 torch.Tensor.bitwise_xor
56 torch.Tensor.bitwise_xor_
57 torch.Tensor.bmm
58 torch.Tensor.bool
59 torch.Tensor.byte
60 torch.Tensor.cauchy_
61 torch.Tensor.ceil
62 torch.Tensor.ceil_
63 torch.Tensor.char
64 torch.Tensor.cholesky
65 torch.Tensor.cholesky_inverse
66 torch.Tensor.cholesky_solve
67 torch.Tensor.chunk
68 torch.Tensor.clamp
69 torch.Tensor.clamp_
70 torch.Tensor.clone
71 torch.Tensor.contiguous
72 torch.Tensor.copy_
73 torch.Tensor.conj
74 torch.Tensor.cos
75 torch.Tensor.cos_
76 torch.Tensor.cosh
77 torch.Tensor.cosh_
78 torch.Tensor.cpu
79 torch.Tensor.cross
80 torch.Tensor.cuda
81 torch.Tensor.cummax
82 torch.Tensor.cummin
83 torch.Tensor.cumprod
84 torch.Tensor.cumsum
85 torch.Tensor.data_ptr
86 torch.Tensor.dequantize
87 torch.Tensor.det
88 torch.Tensor.dense_dim
89 torch.Tensor.diag
90 torch.Tensor.diag_embed
91 torch.Tensor.diagflat
92 torch.Tensor.diagonal
93 torch.Tensor.fill_diagonal_
94 torch.Tensor.digamma
95 torch.Tensor.digamma_
96 torch.Tensor.dim
97 torch.Tensor.dist
98 torch.Tensor.div
99 torch.Tensor.div_
100 torch.Tensor.dot
101 torch.Tensor.double
102 torch.Tensor.eig
103 torch.Tensor.element_size
104 torch.Tensor.eq
105 torch.Tensor.eq_
106 torch.Tensor.equal
107 torch.Tensor.erf
108 torch.Tensor.erf_
109 torch.Tensor.erfc
110 torch.Tensor.erfc_
111 torch.Tensor.erfinv
112 torch.Tensor.erfinv_
113 torch.Tensor.exp
114 torch.Tensor.exp_
115 torch.Tensor.expm1
116 torch.Tensor.expm1_
117 torch.Tensor.expand
118 torch.Tensor.expand_as
119 torch.Tensor.exponential_
120 torch.Tensor.fft
121 torch.Tensor.fill_
122 torch.Tensor.flatten
123 torch.Tensor.flip
124 torch.Tensor.float
125 torch.Tensor.floor
126 torch.Tensor.floor_
127 torch.Tensor.floor_divide
128 torch.Tensor.floor_divide_
129 torch.Tensor.fmod
130 torch.Tensor.fmod_
131 torch.Tensor.frac
132 torch.Tensor.frac_
133 torch.Tensor.gather
134 torch.Tensor.ge
135 torch.Tensor.ge_
136 torch.Tensor.geometric_
137 torch.Tensor.geqrf
138 torch.Tensor.ger
139 torch.Tensor.get_device
140 torch.Tensor.gt
141 torch.Tensor.gt_
142 torch.Tensor.half
143 torch.Tensor.hardshrink
144 torch.Tensor.histc
145 torch.Tensor.ifft
146 torch.Tensor.index_add_
147 torch.Tensor.index_add
148 torch.Tensor.index_copy_
149 torch.Tensor.index_copy
150 torch.Tensor.index_fill_
151 torch.Tensor.index_fill
152 torch.Tensor.index_put_
153 torch.Tensor.index_put
154 torch.Tensor.index_select
155 torch.Tensor.indices
156 torch.Tensor.int
157 torch.Tensor.int_repr
158 torch.Tensor.inverse
159 torch.Tensor.irfft
160 torch.Tensor.is_contiguous
161 torch.Tensor.is_complex
162 torch.Tensor.is_floating_point
163 torch.Tensor.is_pinned
164 torch.Tensor.is_set_to
165 torch.Tensor.is_shared
166 torch.Tensor.is_signed
167 torch.Tensor.is_sparse
168 torch.Tensor.item
169 torch.Tensor.kthvalue
170 torch.Tensor.le
171 torch.Tensor.le_
172 torch.Tensor.lerp
173 torch.Tensor.lerp_
174 torch.Tensor.lgamma
175 torch.Tensor.lgamma_
176 torch.Tensor.log
177 torch.Tensor.log_
178 torch.Tensor.logdet
179 torch.Tensor.log10
180 torch.Tensor.log10_
181 torch.Tensor.log1p
182 torch.Tensor.log1p_
183 torch.Tensor.log2
184 torch.Tensor.log2_
185 torch.Tensor.log_normal_
186 torch.Tensor.logsumexp
187 torch.Tensor.logical_and
188 torch.Tensor.logical_and_
189 torch.Tensor.logical_not
190 torch.Tensor.logical_not_
191 torch.Tensor.logical_or
192 torch.Tensor.logical_or_
193 torch.Tensor.logical_xor
194 torch.Tensor.logical_xor_
195 torch.Tensor.long
196 torch.Tensor.lstsq
197 torch.Tensor.lt
198 torch.Tensor.lt_
199 torch.Tensor.lu
200 torch.Tensor.lu_solve
201 torch.Tensor.map_
202 torch.Tensor.masked_scatter_
203 torch.Tensor.masked_scatter
204 torch.Tensor.masked_fill_
205 torch.Tensor.masked_fill
206 torch.Tensor.masked_select
207 torch.Tensor.matmul
208 torch.Tensor.matrix_power
209 torch.Tensor.max
210 torch.Tensor.mean
211 torch.Tensor.median
212 torch.Tensor.min
213 torch.Tensor.mm
214 torch.Tensor.mode
215 torch.Tensor.mul
216 torch.Tensor.mul_
217 torch.Tensor.multinomial
218 torch.Tensor.mv
219 torch.Tensor.mvlgamma
220 torch.Tensor.mvlgamma_
221 torch.Tensor.narrow
222 torch.Tensor.narrow_copy
223 torch.Tensor.ndimension
224 torch.Tensor.ne
225 torch.Tensor.ne_
226 torch.Tensor.neg
227 torch.Tensor.neg_
228 torch.Tensor.nelement
229 torch.Tensor.nonzero
230 torch.Tensor.norm
231 torch.Tensor.normal_
232 torch.Tensor.numel
233 torch.Tensor.numpy
234 torch.Tensor.orgqr
235 torch.Tensor.ormqr
236 torch.Tensor.permute
237 torch.Tensor.pin_memory
238 torch.Tensor.pinverse
239 torch.Tensor.polygamma
240 torch.Tensor.polygamma_
241 torch.Tensor.pow
242 torch.Tensor.pow_
243 torch.Tensor.prod
244 torch.Tensor.put_
245 torch.Tensor.qr
246 torch.Tensor.qscheme
247 torch.Tensor.q_scale
248 torch.Tensor.q_zero_point
249 torch.Tensor.q_per_channel_scales
250 torch.Tensor.q_per_channel_zero_points
251 torch.Tensor.q_per_channel_axis
252 torch.Tensor.random_
253 torch.Tensor.reciprocal
254 torch.Tensor.reciprocal_
255 torch.Tensor.record_stream
256 torch.Tensor.remainder
257 torch.Tensor.remainder_
258 torch.Tensor.renorm
259 torch.Tensor.renorm_
260 torch.Tensor.repeat
261 torch.Tensor.repeat_interleave
262 torch.Tensor.requires_grad_
263 torch.Tensor.reshape
264 torch.Tensor.reshape_as
265 torch.Tensor.resize_
266 torch.Tensor.resize_as_
267 torch.Tensor.rfft
268 torch.Tensor.roll
269 torch.Tensor.rot90
270 torch.Tensor.round
271 torch.Tensor.round_
272 torch.Tensor.rsqrt
273 torch.Tensor.rsqrt_
274 torch.Tensor.scatter
275 torch.Tensor.scatter_
276 torch.Tensor.scatter_add_
277 torch.Tensor.scatter_add
278 torch.Tensor.select
279 torch.Tensor.set_
280 torch.Tensor.share_memory_
281 torch.Tensor.short
282 torch.Tensor.sigmoid
283 torch.Tensor.sigmoid_
284 torch.Tensor.sign
285 torch.Tensor.sign_
286 torch.Tensor.sin
287 torch.Tensor.sin_
288 torch.Tensor.sinh
289 torch.Tensor.sinh_
290 torch.Tensor.size
291 torch.Tensor.slogdet
292 torch.Tensor.solve
293 torch.Tensor.sort
294 torch.Tensor.split
295 torch.Tensor.sparse_mask
296 torch.Tensor.sparse_dim
297 torch.Tensor.sqrt
298 torch.Tensor.sqrt_
299 torch.Tensor.square
300 torch.Tensor.square_
301 torch.Tensor.squeeze
302 torch.Tensor.squeeze_
303 torch.Tensor.std
304 torch.Tensor.stft
305 torch.Tensor.storage
306 torch.Tensor.storage_offset
307 torch.Tensor.storage_type
308 torch.Tensor.stride
309 torch.Tensor.sub
310 torch.Tensor.sub_
311 torch.Tensor.sum
312 torch.Tensor.sum_to_size
313 torch.Tensor.svd
314 torch.Tensor.symeig
315 torch.Tensor.t
316 torch.Tensor.t_
317 torch.Tensor.to
318 torch.Tensor.to_mkldnn
319 torch.Tensor.take
320 torch.Tensor.tan
321 torch.Tensor.tan_
322 torch.Tensor.tanh
323 torch.Tensor.tanh_
324 torch.Tensor.tolist
325 torch.Tensor.topk
326 torch.Tensor.to_sparse
327 torch.Tensor.trace
328 torch.Tensor.transpose
329 torch.Tensor.transpose_
330 torch.Tensor.triangular_solve
331 torch.Tensor.tril
332 torch.Tensor.tril_
333 torch.Tensor.triu
334 torch.Tensor.triu_
335 torch.Tensor.true_divide
336 torch.Tensor.true_divide_
337 torch.Tensor.trunc
338 torch.Tensor.trunc_
339 torch.Tensor.type
340 torch.Tensor.type_as
341 torch.Tensor.unbind
342 torch.Tensor.unfold
343 torch.Tensor.uniform_
344 torch.Tensor.unique
345 torch.Tensor.unique_consecutive
346 torch.Tensor.unsqueeze
347 torch.Tensor.unsqueeze_
348 torch.Tensor.values
349 torch.Tensor.var
350 torch.Tensor.view
351 torch.Tensor.view_as
352 torch.Tensor.where
353 torch.Tensor.zero_
354 torch.BoolTensor
355 torch.BoolTensor.all
356 torch.BoolTensor.any

Layers (torch.nn)

No. API Name Supported
1 torch.nn.Parameter
2 torch.nn.Module
3 torch.nn.Module.add_module
4 torch.nn.Module.apply
5 torch.nn.Module.bfloat16
6 torch.nn.Module.buffers
7 torch.nn.Module.children
8 torch.nn.Module.cpu
9 torch.nn.Module.cuda
10 torch.nn.Module.double
11 torch.nn.Module.dump_patches
12 torch.nn.Module.eval
13 torch.nn.Module.extra_repr
14 torch.nn.Module.float
15 torch.nn.Module.forward
16 torch.nn.Module.half
17 torch.nn.Module.load_state_dict
18 torch.nn.Module.modules
19 torch.nn.Module.named_buffers
20 torch.nn.Module.named_children
21 torch.nn.Module.named_modules
22 torch.nn.Module.named_parameters
23 torch.nn.Module.parameters
24 torch.nn.Module.register_backward_hook
25 torch.nn.Module.register_buffer
26 torch.nn.Module.register_forward_hook
27 torch.nn.Module.register_forward_pre_hook
28 torch.nn.Module.register_parameter
29 torch.nn.Module.requires_grad_
30 torch.nn.Module.state_dict
31 torch.nn.Module.to
32 torch.nn.Module.train
33 torch.nn.Module.type
34 torch.nn.Module.zero_grad
35 torch.nn.Sequential
36 torch.nn.ModuleList
37 torch.nn.ModuleList.append
38 torch.nn.ModuleList.extend
39 torch.nn.ModuleList.insert
40 torch.nn.ModuleDict
41 torch.nn.ModuleDict.clear
42 torch.nn.ModuleDict.items
43 torch.nn.ModuleDict.keys
44 torch.nn.ModuleDict.pop
45 torch.nn.ModuleDict.update
46 torch.nn.ModuleDict.values
47 torch.nn.ParameterList
48 torch.nn.ParameterList.append
49 torch.nn.ParameterList.extend
50 torch.nn.ParameterDict
51 torch.nn.ParameterDict.clear
52 torch.nn.ParameterDict.items
53 torch.nn.ParameterDict.keys
54 torch.nn.ParameterDict.pop
55 torch.nn.ParameterDict.update
56 torch.nn.ParameterDict.values
57 torch.nn.Conv1d
58 torch.nn.Conv2d
59 torch.nn.Conv3d
60 torch.nn.ConvTranspose1d
61 torch.nn.ConvTranspose2d
62 torch.nn.ConvTranspose3d
63 torch.nn.Unfold
64 torch.nn.Fold
65 torch.nn.MaxPool1d
66 torch.nn.MaxPool2d
67 torch.nn.MaxPool3d
68 torch.nn.MaxUnpool1d
69 torch.nn.MaxUnpool2d
70 torch.nn.MaxUnpool3d
71 torch.nn.AvgPool1d
72 torch.nn.AvgPool2d
73 torch.nn.AvgPool3d
74 torch.nn.FractionalMaxPool2d
75 torch.nn.LPPool1d
76 torch.nn.LPPool2d
77 torch.nn.AdaptiveMaxPool1d
78 torch.nn.AdaptiveMaxPool2d
79 torch.nn.AdaptiveMaxPool3d
80 torch.nn.AdaptiveAvgPool1d
81 torch.nn.AdaptiveAvgPool2d
82 torch.nn.AdaptiveAvgPool3d Yes; only the D=1, H=1, W=1 output case is supported (see the sketch after this table)
83 torch.nn.ReflectionPad1d
84 torch.nn.ReflectionPad2d
85 torch.nn.ReplicationPad1d
86 torch.nn.ReplicationPad2d
87 torch.nn.ReplicationPad3d
88 torch.nn.ZeroPad2d
89 torch.nn.ConstantPad1d
90 torch.nn.ConstantPad2d
91 torch.nn.ConstantPad3d
92 torch.nn.ELU
93 torch.nn.Hardshrink
94 torch.nn.Hardtanh
95 torch.nn.LeakyReLU
96 torch.nn.LogSigmoid
97 torch.nn.MultiheadAttention
98 torch.nn.PReLU
99 torch.nn.ReLU
100 torch.nn.ReLU6
101 torch.nn.RReLU
102 torch.nn.SELU
103 torch.nn.CELU
104 torch.nn.GELU
105 torch.nn.Sigmoid
106 torch.nn.Softplus
107 torch.nn.Softshrink Yes; the SoftShrink case is not yet supported
108 torch.nn.Softsign
109 torch.nn.Tanh
110 torch.nn.Tanhshrink
111 torch.nn.Threshold
112 torch.nn.Softmin
113 torch.nn.Softmax
114 torch.nn.Softmax2d
115 torch.nn.LogSoftmax
116 torch.nn.AdaptiveLogSoftmaxWithLoss
117 torch.nn.AdaptiveLogSoftmaxWithLoss.log_prob
118 torch.nn.AdaptiveLogSoftmaxWithLoss.predict
119 torch.nn.BatchNorm1d
120 torch.nn.BatchNorm2d
121 torch.nn.BatchNorm3d
122 torch.nn.GroupNorm
123 torch.nn.SyncBatchNorm
124 torch.nn.SyncBatchNorm.convert_sync_batchnorm
125 torch.nn.InstanceNorm1d
126 torch.nn.InstanceNorm2d
127 torch.nn.InstanceNorm3d
128 torch.nn.LayerNorm
129 torch.nn.LocalResponseNorm
130 torch.nn.RNNBase
131 torch.nn.RNNBase.flatten_parameters
132 torch.nn.RNN
133 torch.nn.LSTM
134 torch.nn.GRU Yes; the DynamicGRUV2 case is not yet supported
135 torch.nn.RNNCell
136 torch.nn.LSTMCell
137 torch.nn.GRUCell
138 torch.nn.Transformer
139 torch.nn.Transformer.forward
140 torch.nn.Transformer.generate_square_subsequent_mask
141 torch.nn.TransformerEncoder
142 torch.nn.TransformerEncoder.forward
143 torch.nn.TransformerDecoder
144 torch.nn.TransformerDecoder.forward
145 torch.nn.TransformerEncoderLayer
146 torch.nn.TransformerEncoderLayer.forward
147 torch.nn.TransformerDecoderLayer
148 torch.nn.TransformerDecoderLayer.forward
149 torch.nn.Identity
150 torch.nn.Linear
151 torch.nn.Bilinear
152 torch.nn.Dropout
153 torch.nn.Dropout2d
154 torch.nn.Dropout3d
155 torch.nn.AlphaDropout
156 torch.nn.Embedding
157 torch.nn.Embedding.from_pretrained
158 torch.nn.EmbeddingBag
159 torch.nn.EmbeddingBag.from_pretrained
160 torch.nn.CosineSimilarity
161 torch.nn.PairwiseDistance
162 torch.nn.L1Loss
163 torch.nn.MSELoss
164 torch.nn.CrossEntropyLoss
165 torch.nn.CTCLoss
166 torch.nn.NLLLoss
167 torch.nn.PoissonNLLLoss
168 torch.nn.KLDivLoss
169 torch.nn.BCELoss
170 torch.nn.BCEWithLogitsLoss
171 torch.nn.MarginRankingLoss
172 torch.nn.HingeEmbeddingLoss
173 torch.nn.MultiLabelMarginLoss
174 torch.nn.SmoothL1Loss
175 torch.nn.SoftMarginLoss
176 torch.nn.MultiLabelSoftMarginLoss
177 torch.nn.CosineEmbeddingLoss
178 torch.nn.MultiMarginLoss
179 torch.nn.TripletMarginLoss
180 torch.nn.PixelShuffle
181 torch.nn.Upsample
182 torch.nn.UpsamplingNearest2d
183 torch.nn.UpsamplingBilinear2d
184 torch.nn.DataParallel
185 torch.nn.parallel.DistributedDataParallel
186 torch.nn.parallel.DistributedDataParallel.no_sync
187 torch.nn.utils.clip_grad_norm_
188 torch.nn.utils.clip_grad_value_
189 torch.nn.utils.parameters_to_vector
190 torch.nn.utils.vector_to_parameters
191 torch.nn.utils.prune.BasePruningMethod
192 torch.nn.utils.prune.BasePruningMethod.apply
193 torch.nn.utils.prune.BasePruningMethod.apply_mask
194 torch.nn.utils.prune.BasePruningMethod.compute_mask
195 torch.nn.utils.prune.BasePruningMethod.prune
196 torch.nn.utils.prune.BasePruningMethod.remove
197 torch.nn.utils.prune.PruningContainer
198 torch.nn.utils.prune.PruningContainer.add_pruning_method
199 torch.nn.utils.prune.PruningContainer.apply
200 torch.nn.utils.prune.PruningContainer.apply_mask
201 torch.nn.utils.prune.PruningContainer.compute_mask
202 torch.nn.utils.prune.PruningContainer.prune
203 torch.nn.utils.prune.PruningContainer.remove
204 torch.nn.utils.prune.Identity
205 torch.nn.utils.prune.Identity.apply
206 torch.nn.utils.prune.Identity.apply_mask
207 torch.nn.utils.prune.Identity.prune
208 torch.nn.utils.prune.Identity.remove
209 torch.nn.utils.prune.RandomUnstructured
210 torch.nn.utils.prune.RandomUnstructured.apply
211 torch.nn.utils.prune.RandomUnstructured.apply_mask
212 torch.nn.utils.prune.RandomUnstructured.prune
213 torch.nn.utils.prune.RandomUnstructured.remove
214 torch.nn.utils.prune.L1Unstructured
215 torch.nn.utils.prune.L1Unstructured.apply
216 torch.nn.utils.prune.L1Unstructured.apply_mask
217 torch.nn.utils.prune.L1Unstructured.prune
218 torch.nn.utils.prune.L1Unstructured.remove
219 torch.nn.utils.prune.RandomStructured
220 torch.nn.utils.prune.RandomStructured.apply
221 torch.nn.utils.prune.RandomStructured.apply_mask
222 torch.nn.utils.prune.RandomStructured.compute_mask
223 torch.nn.utils.prune.RandomStructured.prune
224 torch.nn.utils.prune.RandomStructured.remove
225 torch.nn.utils.prune.LnStructured
226 torch.nn.utils.prune.LnStructured.apply
227 torch.nn.utils.prune.LnStructured.apply_mask
228 torch.nn.utils.prune.LnStructured.compute_mask
229 torch.nn.utils.prune.LnStructured.prune
230 torch.nn.utils.prune.LnStructured.remove
231 torch.nn.utils.prune.CustomFromMask
232 torch.nn.utils.prune.CustomFromMask.apply
233 torch.nn.utils.prune.CustomFromMask.apply_mask
234 torch.nn.utils.prune.CustomFromMask.prune
235 torch.nn.utils.prune.CustomFromMask.remove
236 torch.nn.utils.prune.identity
237 torch.nn.utils.prune.random_unstructured
238 torch.nn.utils.prune.l1_unstructured
239 torch.nn.utils.prune.random_structured
240 torch.nn.utils.prune.ln_structured
241 torch.nn.utils.prune.global_unstructured
242 torch.nn.utils.prune.custom_from_mask
243 torch.nn.utils.prune.remove
244 torch.nn.utils.prune.is_pruned
245 torch.nn.utils.weight_norm
246 torch.nn.utils.remove_weight_norm
247 torch.nn.utils.spectral_norm
248 torch.nn.utils.remove_spectral_norm
249 torch.nn.utils.rnn.PackedSequence
250 torch.nn.utils.rnn.pack_padded_sequence
251 torch.nn.utils.rnn.pad_packed_sequence
252 torch.nn.utils.rnn.pad_sequence
253 torch.nn.utils.rnn.pack_sequence
254 torch.nn.Flatten
255 torch.quantization.quantize
256 torch.quantization.quantize_dynamic
257 torch.quantization.quantize_qat
258 torch.quantization.prepare
259 torch.quantization.prepare_qat
260 torch.quantization.convert
261 torch.quantization.QConfig
262 torch.quantization.QConfigDynamic
263 torch.quantization.fuse_modules
264 torch.quantization.QuantStub
265 torch.quantization.DeQuantStub
266 torch.quantization.QuantWrapper
267 torch.quantization.add_quant_dequant
268 torch.quantization.add_observer_
269 torch.quantization.swap_module
270 torch.quantization.propagate_qconfig_
271 torch.quantization.default_eval_fn
272 torch.quantization.MinMaxObserver
273 torch.quantization.MovingAverageMinMaxObserver
274 torch.quantization.PerChannelMinMaxObserver
275 torch.quantization.MovingAveragePerChannelMinMaxObserver
276 torch.quantization.HistogramObserver
277 torch.quantization.FakeQuantize
278 torch.quantization.NoopObserver
279 torch.quantization.get_observer_dict
280 torch.quantization.RecordingObserver
281 torch.nn.intrinsic.ConvBn2d
282 torch.nn.intrinsic.ConvBnReLU2d
283 torch.nn.intrinsic.ConvReLU2d
284 torch.nn.intrinsic.ConvReLU3d
285 torch.nn.intrinsic.LinearReLU
286 torch.nn.intrinsic.qat.ConvBn2d
287 torch.nn.intrinsic.qat.ConvBnReLU2d
288 torch.nn.intrinsic.qat.ConvReLU2d
289 torch.nn.intrinsic.qat.LinearReLU
290 torch.nn.intrinsic.quantized.ConvReLU2d
291 torch.nn.intrinsic.quantized.ConvReLU3d
292 torch.nn.intrinsic.quantized.LinearReLU
293 torch.nn.qat.Conv2d
294 torch.nn.qat.Conv2d.from_float
295 torch.nn.qat.Linear
296 torch.nn.qat.Linear.from_float
297 torch.nn.quantized.functional.relu
298 torch.nn.quantized.functional.linear
299 torch.nn.quantized.functional.conv2d
300 torch.nn.quantized.functional.conv3d
301 torch.nn.quantized.functional.max_pool2d
302 torch.nn.quantized.functional.adaptive_avg_pool2d
303 torch.nn.quantized.functional.avg_pool2d
304 torch.nn.quantized.functional.interpolate
305 torch.nn.quantized.functional.upsample
306 torch.nn.quantized.functional.upsample_bilinear
307 torch.nn.quantized.functional.upsample_nearest
308 torch.nn.quantized.ReLU
309 torch.nn.quantized.ReLU6
310 torch.nn.quantized.Conv2d
311 torch.nn.quantized.Conv2d.from_float
312 torch.nn.quantized.Conv3d
313 torch.nn.quantized.Conv3d.from_float
314 torch.nn.quantized.FloatFunctional
315 torch.nn.quantized.QFunctional
316 torch.nn.quantized.Quantize
317 torch.nn.quantized.DeQuantize
318 torch.nn.quantized.Linear
319 torch.nn.quantized.Linear.from_float
320 torch.nn.quantized.dynamic.Linear
321 torch.nn.quantized.dynamic.Linear.from_float
322 torch.nn.quantized.dynamic.LSTM
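
For the torch.nn.AdaptiveAvgPool3d restriction noted in row 82 above (only the D=1, H=1, W=1 output case is supported), a minimal sketch of the supported pattern; the input shape is an illustrative assumption:

    >>> import torch
    >>> pool = torch.nn.AdaptiveAvgPool3d((1, 1, 1))  # only output size (1, 1, 1) is supported
    >>> x = torch.rand(2, 4, 8, 8, 8).npu()           # (N, C, D, H, W)
    >>> pool(x).shape
    torch.Size([2, 4, 1, 1, 1])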

Functions (torch.nn.functional)

No. API Name Supported
1 torch.nn.functional.conv1d
2 torch.nn.functional.conv2d
3 torch.nn.functional.conv3d
4 torch.nn.functional.conv_transpose1d
5 torch.nn.functional.conv_transpose2d
6 torch.nn.functional.conv_transpose3d
7 torch.nn.functional.unfold
8 torch.nn.functional.fold
9 torch.nn.functional.avg_pool1d
10 torch.nn.functional.avg_pool2d
11 torch.nn.functional.avg_pool3d
12 torch.nn.functional.max_pool1d
13 torch.nn.functional.max_pool2d
14 torch.nn.functional.max_pool3d
15 torch.nn.functional.max_unpool1d
16 torch.nn.functional.max_unpool2d
17 torch.nn.functional.max_unpool3d
18 torch.nn.functional.lp_pool1d
19 torch.nn.functional.lp_pool2d
20 torch.nn.functional.adaptive_max_pool1d
21 torch.nn.functional.adaptive_max_pool2d
22 torch.nn.functional.adaptive_max_pool3d
23 torch.nn.functional.adaptive_avg_pool1d
24 torch.nn.functional.adaptive_avg_pool2d
25 torch.nn.functional.adaptive_avg_pool3d Yes; only the D=1, H=1, W=1 output case is supported
26 torch.nn.functional.threshold
27 torch.nn.functional.threshold_
28 torch.nn.functional.relu
29 torch.nn.functional.relu_
30 torch.nn.functional.hardtanh
31 torch.nn.functional.hardtanh_
32 torch.nn.functional.relu6
33 torch.nn.functional.elu
34 torch.nn.functional.elu_
35 torch.nn.functional.selu
36 torch.nn.functional.celu
37 torch.nn.functional.leaky_relu
38 torch.nn.functional.leaky_relu_
39 torch.nn.functional.prelu
40 torch.nn.functional.rrelu
41 torch.nn.functional.rrelu_
42 torch.nn.functional.glu
43 torch.nn.functional.gelu
44 torch.nn.functional.logsigmoid
45 torch.nn.functional.hardshrink
46 torch.nn.functional.tanhshrink
47 torch.nn.functional.softsign
48 torch.nn.functional.softplus
49 torch.nn.functional.softmin
50 torch.nn.functional.softmax
51 torch.nn.functional.softshrink
52 torch.nn.functional.gumbel_softmax
53 torch.nn.functional.log_softmax
54 torch.nn.functional.tanh
55 torch.nn.functional.sigmoid
56 torch.nn.functional.batch_norm
57 torch.nn.functional.instance_norm
58 torch.nn.functional.layer_norm
59 torch.nn.functional.local_response_norm
60 torch.nn.functional.normalize
61 torch.nn.functional.linear
62 torch.nn.functional.bilinear
63 torch.nn.functional.dropout
64 torch.nn.functional.alpha_dropout
65 torch.nn.functional.dropout2d
66 torch.nn.functional.dropout3d
67 torch.nn.functional.embedding
68 torch.nn.functional.embedding_bag
69 torch.nn.functional.one_hot
70 torch.nn.functional.pairwise_distance
71 torch.nn.functional.cosine_similarity
72 torch.nn.functional.pdist
73 torch.nn.functional.binary_cross_entropy
74 torch.nn.functional.binary_cross_entropy_with_logits
75 torch.nn.functional.poisson_nll_loss
76 torch.nn.functional.cosine_embedding_loss
77 torch.nn.functional.cross_entropy
78 torch.nn.functional.ctc_loss Yes (only 2-D input is supported)
79 torch.nn.functional.hinge_embedding_loss
80 torch.nn.functional.kl_div
81 torch.nn.functional.l1_loss
82 torch.nn.functional.mse_loss
83 torch.nn.functional.margin_ranking_loss
84 torch.nn.functional.multilabel_margin_loss
85 torch.nn.functional.multilabel_soft_margin_loss
86 torch.nn.functional.multi_margin_loss
87 torch.nn.functional.nll_loss
88 torch.nn.functional.smooth_l1_loss
89 torch.nn.functional.soft_margin_loss
90 torch.nn.functional.triplet_margin_loss
91 torch.nn.functional.pixel_shuffle
92 torch.nn.functional.pad
93 torch.nn.functional.interpolate
94 torch.nn.functional.upsample
95 torch.nn.functional.upsample_nearest
96 torch.nn.functional.upsample_bilinear
97 torch.nn.functional.grid_sample
98 torch.nn.functional.affine_grid
99 torch.nn.parallel.data_parallel

torch.distributed

No. API Name Supported
1 torch.distributed.init_process_group
2 torch.distributed.Backend
3 torch.distributed.get_backend
4 torch.distributed.get_rank
5 torch.distributed.get_world_size
6 torch.distributed.is_initialized
7 torch.distributed.is_mpi_available
8 torch.distributed.is_nccl_available
9 torch.distributed.new_group
10 torch.distributed.send
11 torch.distributed.recv
12 torch.distributed.isend
13 torch.distributed.irecv
14 is_completed (method of the request object returned by isend/irecv)
15 wait (method of the request object returned by isend/irecv)
16 torch.distributed.broadcast
17 torch.distributed.all_reduce
18 torch.distributed.reduce
19 torch.distributed.all_gather
20 torch.distributed.gather
21 torch.distributed.scatter
22 torch.distributed.barrier
23 torch.distributed.ReduceOp
24 torch.distributed.reduce_op
25 torch.distributed.broadcast_multigpu
26 torch.distributed.all_reduce_multigpu
27 torch.distributed.reduce_multigpu
28 torch.distributed.all_gather_multigpu
29 torch.distributed.launch
30 torch.multiprocessing.spawn

torch.npu

No. API Name Corresponding NPU API Supported
1 torch.cuda.current_blas_handle torch.npu.current_blas_handle
2 torch.cuda.current_device torch.npu.current_device
3 torch.cuda.current_stream torch.npu.current_stream
4 torch.cuda.default_stream torch.npu.default_stream
5 torch.cuda.device torch.npu.device
6 torch.cuda.device_count torch.npu.device_count
7 torch.cuda.device_of torch.npu.device_of
8 torch.cuda.get_device_capability torch.npu.get_device_capability
9 torch.cuda.get_device_name torch.npu.get_device_name
10 torch.cuda.init torch.npu.init
11 torch.cuda.ipc_collect torch.npu.ipc_collect
12 torch.cuda.is_available torch.npu.is_available
13 torch.cuda.is_initialized torch.npu.is_initialized
14 torch.cuda.set_device torch.npu.set_device Partially supported (see the note below the table)
15 torch.cuda.stream torch.npu.stream
16 torch.cuda.synchronize torch.npu.synchronize
17 torch.cuda.get_rng_state torch.npu.get_rng_state
18 torch.cuda.get_rng_state_all torch.npu.get_rng_state_all
19 torch.cuda.set_rng_state torch.npu.set_rng_state
20 torch.cuda.set_rng_state_all torch.npu.set_rng_state_all
21 torch.cuda.manual_seed torch.npu.manual_seed
22 torch.cuda.manual_seed_all torch.npu.manual_seed_all
23 torch.cuda.seed torch.npu.seed
24 torch.cuda.seed_all torch.npu.seed_all
25 torch.cuda.initial_seed torch.npu.initial_seed
26 torch.cuda.comm.broadcast torch.npu.comm.broadcast
27 torch.cuda.comm.broadcast_coalesced torch.npu.comm.broadcast_coalesced
28 torch.cuda.comm.reduce_add torch.npu.comm.reduce_add
29 torch.cuda.comm.scatter torch.npu.comm.scatter
30 torch.cuda.comm.gather torch.npu.comm.gather
31 torch.cuda.Stream torch.npu.Stream
32 torch.cuda.Stream.query torch.npu.Stream.query
33 torch.cuda.Stream.record_event torch.npu.Stream.record_event
34 torch.cuda.Stream.synchronize torch.npu.Stream.synchronize
35 torch.cuda.Stream.wait_event torch.npu.Stream.wait_event
36 torch.cuda.Stream.wait_stream torch.npu.Stream.wait_stream
37 torch.cuda.Event torch.npu.Event
38 torch.cuda.Event.elapsed_time torch.npu.Event.elapsed_time
39 torch.cuda.Event.from_ipc_handle torch.npu.Event.from_ipc_handle
40 torch.cuda.Event.ipc_handle torch.npu.Event.ipc_handle
41 torch.cuda.Event.query torch.npu.Event.query
42 torch.cuda.Event.record torch.npu.Event.record
43 torch.cuda.Event.synchronize torch.npu.Event.synchronize
44 torch.cuda.Event.wait torch.npu.Event.wait
45 torch.cuda.empty_cache torch.npu.empty_cache
46 torch.cuda.memory_stats torch.npu.memory_stats
47 torch.cuda.memory_summary torch.npu.memory_summary
48 torch.cuda.memory_snapshot torch.npu.memory_snapshot
49 torch.cuda.memory_allocated torch.npu.memory_allocated
50 torch.cuda.max_memory_allocated torch.npu.max_memory_allocated
51 torch.cuda.reset_max_memory_allocated torch.npu.reset_max_memory_allocated
52 torch.cuda.memory_reserved torch.npu.memory_reserved
53 torch.cuda.max_memory_reserved torch.npu.max_memory_reserved
54 torch.cuda.memory_cached torch.npu.memory_cached
55 torch.cuda.max_memory_cached torch.npu.max_memory_cached
56 torch.cuda.reset_max_memory_cached torch.npu.reset_max_memory_cached
57 torch.cuda.nvtx.mark torch.npu.nvtx.mark
58 torch.cuda.nvtx.range_push torch.npu.nvtx.range_push
59 torch.cuda.nvtx.range_pop torch.npu.nvtx.range_pop
60 torch.cuda._sleep torch.npu._sleep
61 torch.cuda.Stream.priority_range torch.npu.Stream.priority_range
62 torch.cuda.get_device_properties torch.npu.get_device_properties
63 torch.cuda.amp.GradScaler torch.npu.amp.GradScaler

The torch.npu.set_device() interface only supports specifying the device once, via set_device at the start of the program; specifying the device multiple times, or switching devices with the context manager form torch.npu.device(id), is not supported.
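
A minimal sketch of the supported pattern; the device index is an illustrative assumption:

    >>> import torch
    >>> torch.npu.set_device("npu:0")  # specify the device once, at program start
    >>> x = torch.ones(2, 2).npu()     # tensors created afterwards land on npu:0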

NPU Custom Operators

No. Operator Name
1 npu_convolution_transpose
2 npu_conv_transpose2d
3 npu_convolution_transpose_backward
4 npu_conv_transpose2d_backward
5 npu_conv_transpose3d_backward
6 npu_convolution
7 npu_convolution_backward
8 npu_convolution_double_backward
9 npu_conv2d
10 npu_conv2d.out
11 npu_conv2d_backward
12 npu_conv3d
13 npu_conv3d.out
14 npu_conv3d_backward
15 one_
16 npu_sort_v2.out
17 npu_sort_v2
18 npu_format_cast
19 npu_format_cast_.acl_format
20 npu_format_cast_.src
21 npu_transpose_to_contiguous
22 npu_transpose
23 npu_transpose.out
24 npu_broadcast
25 npu_broadcast.out
26 npu_dtype_cast
27 npu_dtype_cast_.Tensor
28 npu_roi_alignbk
29 empty_with_format
30 empty_with_format.names
31 copy_memory_
32 npu_one_hot
33 npu_stride_add
34 npu_softmax_cross_entropy_with_logits
35 npu_softmax_cross_entropy_with_logits_backward
36 npu_ps_roi_pooling
37 npu_ps_roi_pooling_backward
38 npu_roi_align
39 npu_nms_v4
40 npu_lstm
41 npu_lstm_backward
42 npu_iou
43 npu_ptiou
44 npu_nms_with_mask
45 npu_pad
46 npu_bounding_box_encode
47 npu_bounding_box_decode
48 npu_gru
49 npu_gru_backward
50 npu_set_.source_Storage_storage_offset_format
51 npu_random_choice_with_mask
52 npu_batch_nms
53 npu_slice
54 npu_slice.out
55 npu_dropoutV2
56 npu_dropoutV2_backward
57 _npu_dropout
58 _npu_dropout_inplace
59 npu_dropout_backward
60 npu_indexing
61 npu_indexing.out
62 npu_ifmr
63 npu_max.dim
64 npu_max.names_dim
65 npu_scatter
66 npu_max_backward
67 npu_apply_adam
68 npu_layer_norm_eval
69 npu_alloc_float_status
70 npu_get_float_status
71 npu_clear_float_status
72 npu_confusion_transpose
73 npu_confusion_transpose_backward
74 npu_bmmV2
75 fast_gelu
76 fast_gelu_backward
77 npu_sub_sample
78 npu_deformable_conv2d
79 npu_deformable_conv2dbk
80 npu_mish
81 npu_anchor_response_flags
82 npu_yolo_boxes_encode
83 npu_grid_assign_positive
84 npu_mish_backward
85 npu_normalize_batch
86 npu_masked_fill_range
87 npu_linear
88 npu_linear_backward
89 npu_bert_apply_adam
90 npu_giou
91 npu_giou_backward

Detailed operator interface descriptions:

npu_apply_adam(beta1_power, beta2_power, lr, beta1, beta2, epsilon, grad, use_locking, use_nesterov, out = (var, m, v))

Computes the Adam optimization result.

  • Parameters:

    • beta1_power (Number) - power of beta1.
    • beta2_power (Number) - power of beta2.
    • lr (Number) - learning rate.
    • beta1 (Number) - exponential decay rate for the 1st moment estimates.
    • beta2 (Number) - exponential decay rate for the 2nd moment estimates.
    • epsilon (Number) - term added to the denominator to improve numerical stability.
    • grad (Tensor) - the gradient.
    • use_locking (bool) - If True use locks for update operations.
    • use_nesterov (bool) - If True, uses the Nesterov update.
    • var (Tensor) - variables to be optimized.
    • m (Tensor) - mean value of variables.
    • v (Tensor) - variance of variables.
  • constraints:

    None

  • Examples:

    None
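
    A hedged usage sketch based only on the signature above; tensor shapes and hyperparameter values are illustrative assumptions:

    >>> var = torch.rand(2, 2).npu()    # variables to be optimized
    >>> m = torch.zeros(2, 2).npu()     # 1st moment estimate
    >>> v = torch.zeros(2, 2).npu()     # 2nd moment estimate
    >>> grad = torch.rand(2, 2).npu()   # gradient of var
    >>> var, m, v = torch.npu_apply_adam(0.9, 0.999, 1e-3, 0.9, 0.999, 1e-8,
    ...                                  grad, False, False, out=(var, m, v))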

npu_convolution_transpose(input, weight, bias, padding, output_padding, stride, dilation, groups) -> Tensor

Applies a 2D or 3D transposed convolution operator over an input image composed of several input planes, sometimes also called “deconvolution”.

  • Parameters:

    • input (Tensor) - input tensor of shape(minibatch, in_channels, iH, iW) or (minibatch, in_channels, iT, iH, iW)
    • weight (Tensor) - filters of shape(in_channels, out_channels/groups, kH, kW) or (in_channels, out_channels/groups, kT, kH, kW)
    • bias (Tensor, optional) - optional bias of shape(out_channels)
    • padding (ListInt) - (dilation * (kernel_size - 1) - padding) zero-padding will be added to both sides of each dimension in the input
    • output_padding (ListInt) - additional size added to one side of each dimension in the output shape.
    • stride (ListInt) - the stride of the convolving kernel
    • dilation (ListInt) - the spacing between kernel elements
    • groups (Number) - split input into groups, in_channels should be divisible by the number of groups
  • constraints:

    None

  • Examples:

    None

npu_conv_transpose2d(input, weight, bias, padding, output_padding, stride, dilation, groups) -> Tensor

Applies a 2D transposed convolution operator over an input image composed of several input planes, sometimes also called “deconvolution”.

  • Parameters:

    • input (Tensor) - input tensor of shape(minibatch, in_channels, iH, iW)
    • weight (Tensor) - filters of shape(in_channels, out_channels/groups, kH, kW)
    • bias (Tensor, optional) - optional bias of shape(out_channels)
    • padding (ListInt) - (dilation * (kernel_size - 1) - padding) zero-padding will be added to both sides of each dimension in the input
    • output_padding (ListInt) - additional size added to one side of each dimension in the output shape.
    • stride (ListInt) - the stride of the convolving kernel
    • dilation (ListInt) - the spacing between kernel elements
    • groups (Number) - split input into groups, in_channels should be divisible by the number of groups
  • constraints:

    None

  • Examples:

    None
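
    A hedged usage sketch based only on the signature above; the shapes, stride, and padding values are illustrative assumptions (None is passed for the optional bias):

    >>> input = torch.rand(1, 4, 5, 5).npu()   # (minibatch, in_channels, iH, iW)
    >>> weight = torch.rand(4, 8, 3, 3).npu()  # (in_channels, out_channels/groups, kH, kW)
    >>> out = torch.npu_conv_transpose2d(input, weight, None, [0, 0], [0, 0], [1, 1], [1, 1], 1)
    >>> out.shape
    torch.Size([1, 8, 7, 7])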

npu_convolution(input, weight, bias, stride, padding, dilation, groups) -> Tensor

Applies a 2D or 3D convolution over an input image composed of several input planes.

  • Parameters:

    • input (Tensor) - input tensor of shape(minibatch, in_channels, iH, iW) or (minibatch, in_channels, iT, iH, iW)
    • weight (Tensor) - filters of shape(out_channels, in_channels/groups, kH, kW) or (out_channels, in_channels/groups, kT, kH, kW)
    • bias (Tensor, optional) - optional bias of shape(out_channels)
    • stride (ListInt) - the stride of the convolving kernel
    • padding (ListInt) - implicit paddings on both sides of the input
    • dilation (ListInt) - the spacing between kernel elements
    • groups (ListInt) - split input into groups, in_channels should be divisible by the number of groups
  • constraints:

    None

  • Examples:

    None
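
    A hedged usage sketch of the 3D form, based only on the signature above; all shapes and attribute values are illustrative assumptions:

    >>> input = torch.rand(1, 4, 4, 8, 8).npu()   # (minibatch, in_channels, iT, iH, iW)
    >>> weight = torch.rand(8, 4, 1, 3, 3).npu()  # (out_channels, in_channels/groups, kT, kH, kW)
    >>> out = torch.npu_convolution(input, weight, None, [1, 1, 1], [0, 0, 0], [1, 1, 1], 1)
    >>> out.shape
    torch.Size([1, 8, 4, 6, 6])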

npu_conv2d(input, weight, bias, stride, padding, dilation, groups) -> Tensor

Applies a 2D convolution over an input image composed of several input planes.

  • Parameters:

    • input (Tensor) - input tensor of shape(minibatch, in_channels, iH, iW)
    • weight (Tensor) - filters of shape(out_channels, in_channels/groups, kH, kW)
    • bias (Tensor, optional) - optional bias of shape(out_channels)
    • stride (ListInt) - the stride of the convolving kernel
    • padding (ListInt) - implicit paddings on both sides of the input
    • dilation (ListInt) - the spacing between kernel elements
    • groups (ListInt) - split input into groups, in_channels should be divisible by the number of groups
  • constraints:

    None

  • Examples:

    None
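
    A hedged usage sketch based only on the signature above; all shapes and attribute values are illustrative assumptions:

    >>> input = torch.rand(1, 4, 8, 8).npu()   # (minibatch, in_channels, iH, iW)
    >>> weight = torch.rand(8, 4, 3, 3).npu()  # (out_channels, in_channels/groups, kH, kW)
    >>> bias = torch.rand(8).npu()             # (out_channels)
    >>> out = torch.npu_conv2d(input, weight, bias, [1, 1], [0, 0], [1, 1], 1)
    >>> out.shape
    torch.Size([1, 8, 6, 6])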

npu_conv3d(input, weight, bias, stride, padding, dilation, groups) -> Tensor

Applies a 3D convolution over an input image composed of several input planes.

  • Parameters:

    • input (Tensor) - input tensor of shape(minibatch, in_channels, iT, iH, iW)
    • weight (Tensor) - filters of shape(out_channels, in_channels/groups, kT, kH, kW)
    • bias (Tensor, optional) - optional bias of shape(out_channels)
    • stride (ListInt) - the stride of the convolving kernel
    • padding (ListInt) - implicit paddings on both sides of the input
    • dilation (ListInt) - the spacing between kernel elements
    • groups (ListInt) - split input into groups, in_channels should be divisible by the number of groups
  • constraints:

    None

  • Examples:

    None

one_(self) -> Tensor

Fills self tensor with ones.

  • Parameters:

    • self (Tensor) - the input tensor

  • constraints:

    None

  • Examples:

    >>> x = torch.rand(2, 3).npu()
    >>> x
    tensor([[0.6072, 0.9726, 0.3475],
            [0.3717, 0.6135, 0.6788]], device='npu:0')
    >>> x.one_()
    tensor([[1., 1., 1.],
            [1., 1., 1.]], device='npu:0')

npu_sort_v2(self, dim=-1, descending=False, out=None) -> Tensor

Sorts the elements of the input tensor along a given dimension in ascending order by value without indices. If dim is not given, the last dimension of the input is chosen. If descending is True then the elements are sorted in descending order by value.

  • Parameters:

    • self (Tensor) - the input tensor
    • dim (int, optional) - the dimension to sort along
    • descending (bool, optional) - controls the sorting order (ascending or descending)
    • out (Tensor, optional) - the output that can be optionally given to be used as output buffers
  • constraints:

    At present, only the last dim (-1) is supported.

  • Examples:

    >>> x = torch.randn(3, 4).npu()
    >>> x
    tensor([[-0.0067,  1.7790,  0.5031, -1.7217],
            [ 1.1685, -1.0486, -0.2938,  1.3241],
            [ 0.1880, -2.7447,  1.3976,  0.7380]], device='npu:0')
    >>> sorted_x = torch.npu_sort_v2(x)
    >>> sorted_x
    tensor([[-1.7217, -0.0067,  0.5029,  1.7793],
            [-1.0488, -0.2937,  1.1689,  1.3242],
            [-2.7441,  0.1880,  0.7378,  1.3975]], device='npu:0')

npu_format_cast(self, acl_format) -> Tensor

Changes the format of an NPU tensor.

  • Parameters:

    • self (Tensor) - the input tensor
    • acl_format (int) - the target format to transform
  • constraints:

    None

  • Examples:

    >>> x = torch.rand(2, 3, 4, 5).npu()
    >>> x.storage().npu_format()
    0
    >>> x1 = x.npu_format_cast(29)
    >>> x1.storage().npu_format()
    29

npu_format_cast_

npu_format_cast_.acl_format(self, acl_format) -> Tensor

In-place version of npu_format_cast()

npu_format_cast_.src(self, src) -> Tensor

In-place version that changes the format of self to match the format of src.

  • Parameters:

    • self (Tensor) - the input tensor
    • src (Tensor) - the target format to transform
  • constraints:

    None

  • Examples:

    >>> x = torch.rand(2, 3, 4, 5).npu()
    >>> x.storage().npu_format()
    0
    >>> x.npu_format_cast_(29).storage().npu_format()
    29

npu_transpose(self, perm) -> Tensor

Returns a view of the original tensor with its dimensions permuted, and makes the result contiguous.

  • Parameters:

    • self (Tensor) - the input tensor
    • perm (ListInt) - The desired ordering of dimensions
  • constraints:

    None

  • Examples:

    >>> x = torch.randn(2, 3, 5).npu()
    >>> x.shape
    torch.Size([2, 3, 5])
    >>> x1 = torch.npu_transpose(x, (2, 0, 1))
    >>> x1.shape
    torch.Size([5, 2, 3])
    >>> x2 = x.npu_transpose(2, 0, 1)
    >>> x2.shape
    torch.Size([5, 2, 3])

npu_broadcast(self, size) -> Tensor

Returns a new view of the self tensor with singleton dimensions expanded to a larger size, and makes the result contiguous.

The tensor can also be expanded to a larger number of dimensions; the new dimensions are appended at the front.

  • Parameters:

    • self (Tensor) - the input tensor
    • size (ListInt) - the desired expanded size
  • constraints:

    None

  • Examples:

    >>> x = torch.tensor([[1], [2], [3]]).npu()
    >>> x.shape
    torch.Size([3, 1])
    >>> x.npu_broadcast(3, 4)
    tensor([[1, 1, 1, 1],
            [2, 2, 2, 2],
            [3, 3, 3, 3]], device='npu:0')

npu_dtype_cast(input, dtype) -> Tensor

Performs Tensor dtype conversion.

  • Parameters:

    • input (Tensor) - the input tensor.
    • dtype (torch.dtype) - the desired data type of returned Tensor.
  • constraints:

    None

  • Examples:

    >>> torch.npu_dtype_cast(torch.tensor([0, 0.5, -1.]).npu(), dtype=torch.int)
    tensor([ 0,  0, -1], device='npu:0', dtype=torch.int32)

empty_with_format(size, dtype, layout, device, pin_memory, acl_format) -> Tensor

Returns a tensor filled with uninitialized data. The shape of the tensor is defined by the variable argument size. The format of the tensor is defined by the variable argument acl_format.

  • Parameters:

    • size (int...) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.

    • dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).

    • layout (torch.layout, optional) – the desired layout of returned Tensor. Default: None.

    • device (torch.device, optional) – the desired device of returned tensor. Default: None

    • pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Default: None.

    • acl_format (Number) – the desired memory format of returned Tensor. Default: 2.

  • constraints:

    None

  • Examples:

    >>> torch.empty_with_format((2, 3), dtype=torch.float32, device="npu")
    tensor([[1., 1., 1.],
            [1., 1., 1.]], device='npu:0')

copy_memory_(dst, src, non_blocking=False) -> Tensor

Copies the elements from src into self tensor and returns self.

  • Parameters:

    • dst (Tensor) - the destination tensor to copy into.
    • src (Tensor) - the source tensor to copy from.
    • non_blocking (bool) - if True and this copy is between CPU and NPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.
  • constraints:

    copy_memory_ only supports NPU tensors. The input tensors of copy_memory_ must have the same dtype and the same device index.

  • Examples:

    >>> a=torch.IntTensor([0,  0, -1]).npu()
    >>> b=torch.IntTensor([1, 1, 1]).npu()
    >>> a.copy_memory_(b)
    tensor([1, 1, 1], device='npu:0', dtype=torch.int32)

npu_one_hot(input, num_classes=-1, depth=1, on_value=1, off_value=0) -> Tensor

Returns a one-hot tensor. The locations represented by index in "input" take value "on_value", while all other locations take value "off_value".

  • Parameters:

    • input (Tensor) - class values of any shape.
    • num_classes (Number) - The axis to fill. Defaults to "-1".
    • depth (Number) - The depth of the one hot dimension.
    • on_value (Number) - The value to fill in output when indices[j] = i.
    • off_value (Number) - The value to fill in output when indices[j] != i.
  • constraints:

    None

  • Examples:

    >>> a=torch.IntTensor([5, 3, 2, 1]).npu()
    >>> b=torch.npu_one_hot(a, depth=5)
    >>> b
    tensor([[0., 0., 0., 0., 0.],
            [0., 0., 0., 1., 0.],
            [0., 0., 1., 0., 0.],
            [0., 1., 0., 0., 0.]], device='npu:0')

npu_stride_add(x1, x2, offset1, offset2, c1_len) -> Tensor

Adds the partial values of two tensors in format NC1HWC0.

  • Parameters:

    • x1 (Tensor) - A Tensor in 5HD.
    • x2 (Tensor) - A Tensor of the same type as "x1", and the same shape as "x1", except for the C1 value.
    • offset1 (Number) - A required int. Offset value of C1 in "x1".
    • offset2 (Number) - A required int. Offset value of C1 in "x2".
    • c1_len (Number) - A required int. C1 len of "y". The value must be less than the difference between C1 and offset in "x1" and "x2".
  • constraints:

    None

  • Examples:

    >>> a=torch.tensor([[[[[1.]]]]]).npu()
    >>> b=torch.npu_stride_add(a, a, 0, 0, 1)
    >>> b
    tensor([[[[[2.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]],
            [[[0.]]]]], device='npu:0')

npu_softmax_cross_entropy_with_logits(features, labels) -> Tensor

Computes softmax cross entropy cost.

  • Parameters:

    • features (Tensor) - A Tensor. A "batch_size * num_classes" matrix.
    • labels (Tensor) - A Tensor of the same type as "features". A "batch_size * num_classes" matrix.
  • constraints:

    None

  • Examples:

    None
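
    A hedged usage sketch based only on the signature above; the batch size and class count are illustrative assumptions:

    >>> features = torch.rand(2, 5).npu()  # batch_size x num_classes logits
    >>> labels = torch.rand(2, 5).npu()    # same shape as features (e.g. one-hot or soft labels)
    >>> loss = torch.npu_softmax_cross_entropy_with_logits(features, labels)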

npu_ps_roi_pooling(x, rois, spatial_scale, group_size, output_dim) -> Tensor

Performs Position Sensitive PS ROI Pooling.

  • Parameters:

    • x (Tensor) - An NC1HWC0 tensor, describing the feature map; dimension C1 must be equal to (int(output_dim+15)/C0) * group_size * group_size.
    • rois (Tensor) - A tensor with shape [batch, 5, rois_num], describing the ROIs, each ROI consists of five elements: "batch_id", "x1", "y1", "x2", and "y2", which "batch_id" indicates the index of the input feature map, "x1", "y1", "x2", or "y2" must be greater than or equal to "0.0".
    • spatial_scale (Number) - A required float32, scaling factor for mapping the input coordinates to the ROI coordinates.
    • group_size (Number) - A required int32, specifying the number of groups to encode position-sensitive score maps, must be within the range (0, 128).
    • output_dim (Number) - A required int32, specifying the number of output channels, must be greater than 0.
  • constraints:

    None

  • Examples:

    >>> roi = torch.tensor([[[1], [2], [3], [4], [5]],
                            [[6], [7], [8], [9], [10]]], dtype = torch.float16).npu()
    >>> x = torch.tensor([[[[ 1]], [[ 2]], [[ 3]], [[ 4]],
                          [[ 5]], [[ 6]], [[ 7]], [[ 8]]],
                          [[[ 9]], [[10]], [[11]], [[12]],
                          [[13]], [[14]], [[15]], [[16]]]], dtype = torch.float16).npu()
    >>> out = torch.npu_ps_roi_pooling(x, roi, 0.5, 2, 2)
    >>> out
    tensor([[[[0., 0.],
              [0., 0.]],
            [[0., 0.],
              [0., 0.]]],
            [[[0., 0.],
              [0., 0.]],
            [[0., 0.],
              [0., 0.]]]], device='npu:0', dtype=torch.float16)

npu_roi_align(features, rois, spatial_scale, pooled_height, pooled_width, sample_num, roi_end_mode) -> Tensor

Obtains the ROI feature matrix from the feature map. It is a customized FasterRcnn operator.

  • Parameters:

    • features (Tensor) - A Tensor in 5HD.
    • rois (Tensor) - ROI position. A 2D Tensor with shape (N, 5). "N" indicates the number of ROIs, and the value "5" covers the index of the image where the ROI is located plus the coordinates "x0", "y0", "x1", and "y1".
    • spatial_scale (Number) - A required attribute of type float32, specifying the scaling ratio of "features" to the original image.
    • pooled_height (Number) - A required attribute of type int32, specifying the H dimension.
    • pooled_width (Number) - A required attribute of type int32, specifying the W dimension.
    • sample_num (Number) - An optional attribute of type int32, specifying the horizontal and vertical sampling frequency of each output. If this attribute is set to "0", the sampling frequency is equal to the rounded up value of "rois", which is a floating point number. Defaults to "2".
    • roi_end_mode (Number) - An optional attribute of type int32. Defaults to "1".
  • constraints:

    None

  • Examples:

    >>> x = torch.FloatTensor([[[[1, 2, 3 , 4, 5, 6],
                                [7, 8, 9, 10, 11, 12],
                                [13, 14, 15, 16, 17, 18],
                                [19, 20, 21, 22, 23, 24],
                                [25, 26, 27, 28, 29, 30],
                                [31, 32, 33, 34, 35, 36]]]]).npu()
    >>> rois = torch.tensor([[0, -2.0, -2.0, 22.0, 22.0]]).npu()
    >>> out = torch.npu_roi_align(x, rois, 0.25, 3, 3, 2, 0)
    >>> out
    tensor([[[[ 4.5000,  6.5000,  8.5000],
              [16.5000, 18.5000, 20.5000],
              [28.5000, 30.5000, 32.5000]]]], device='npu:0')

npu_nms_v4(boxes, scores, max_output_size, iou_threshold, scores_threshold, pad_to_max_output_size=False) -> (Tensor, Tensor)

Greedily selects a subset of bounding boxes in descending order of score.

  • Parameters:

    • boxes (Tensor) - A 2-D float tensor of shape [num_boxes, 4].
    • scores (Tensor) - A 1-D float tensor of shape [num_boxes] representing a single score corresponding to each box (each row of boxes).
    • max_output_size (Number) - A scalar representing the maximum number of boxes to be selected by non max suppression.
    • iou_threshold (Tensor) - A 0-D float tensor representing the threshold for deciding whether boxes overlap too much with respect to IOU.
    • scores_threshold (Tensor) - A 0-D float tensor representing the threshold for deciding when to remove boxes based on score.
    • pad_to_max_output_size (bool) - If true, the output selected_indices is padded to be of length max_output_size. Defaults to false.
  • Returns:

    • selected_indices - A 1-D integer tensor of shape [M] representing the selected indices from the boxes tensor, where M <= max_output_size.
    • valid_outputs - A 0-D integer tensor representing the number of valid elements in selected_indices, with the valid elements appearing first.
  • constraints:

    None

  • Examples:

    >>> boxes=torch.randn(100,4).npu()
    >>> scores=torch.randn(100).npu()
    >>> boxes.uniform_(0,100)
    >>> scores.uniform_(0,1)
    >>> max_output_size = 20
    >>> iou_threshold = torch.tensor(0.5).npu()
    >>> scores_threshold = torch.tensor(0.3).npu()
    >>> npu_output = torch.npu_nms_v4(boxes, scores, max_output_size, iou_threshold, scores_threshold)
    >>> npu_output
    (tensor([57, 65, 25, 45, 43, 12, 52, 91, 23, 78, 53, 11, 24, 62, 22, 67,  9, 94,
            54, 92], device='npu:0', dtype=torch.int32), tensor(20, device='npu:0', dtype=torch.int32))

npu_nms_rotated(dets, scores, iou_threshold, scores_threshold=0, max_output_size=-1, mode=0) -> (Tensor, Tensor)

Greedily selects a subset of the rotated bounding boxes in descending order of score.

  • Parameters:

    • dets (Tensor) - A 2-D float tensor of shape [num_boxes, 5].
    • scores (Tensor) - A 1-D float tensor of shape [num_boxes] representing a single score corresponding to each box (each row of boxes).
    • iou_threshold (Number) - A scalar representing the threshold for deciding whether boxes overlap too much with respect to IOU.
    • scores_threshold (Number) - A scalar representing the threshold for deciding when to remove boxes based on score. Defaults to "0".
    • max_output_size (Number) - A scalar integer tensor representing the maximum number of boxes to be selected by non max suppression. Defaults to "-1", that is, no constraint is imposed.
    • mode (Number) - Specifies the layout of dets. If mode is set to 0, the input values of dets are x, y, w, h, and angle; if mode is set to 1, the input values of dets are x1, y1, x2, y2, and angle. Defaults to "0".
  • Returns:

    • selected_index - A 1-D integer tensor of shape [M] representing the selected indices from the dets tensor, where M <= max_output_size.
    • selected_num - A 0-D integer tensor representing the number of valid elements in selected_indices.
  • constraints:

    None

  • Examples:

    >>> dets=torch.randn(100,5).npu()
    >>> scores=torch.randn(100).npu()
    >>> dets.uniform_(0,100)
    >>> scores.uniform_(0,1)
    >>> output1, output2 = torch.npu_nms_rotated(dets, scores, 0.2, 0, -1, 1)
    >>> output1
    tensor([76, 48, 15, 65, 91, 82, 21, 96, 62, 90, 13, 59,  0, 18, 47, 23,  8, 56,
            55, 63, 72, 39, 97, 81, 16, 38, 17, 25, 74, 33, 79, 44, 36, 88, 83, 37,
            64, 45, 54, 41, 22, 28, 98, 40, 30, 20,  1, 86, 69, 57, 43,  9, 42, 27,
            71, 46, 19, 26, 78, 66,  3, 52], device='npu:0', dtype=torch.int32)
    >>> output2
    tensor([62], device='npu:0', dtype=torch.int32)

npu_lstm(x, weight, bias, seq_len, h, c, has_biases, num_layers, dropout, train, bidirectional, batch_first, flag_seq, direction)

DynamicRNN calculation.

  • Parameters:

    • x (Tensor) - A required 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • weight (Tensor) - A required 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_ZN_LSTM.
    • bias (Tensor) - A required 1D Tensor. Must be one of the following types: float16, float32. The format must be ND.
    • seq_len (Tensor) - An optional Tensor. Only supports float16 in FRACTAL_NZ and int32 in ND.
    • h (Tensor) - An optional 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • c (Tensor) - An optional 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • has_biases (bool) - If the value is true, bias exists.
    • num_layers (Number) - Number of recurrent layers. Only a single layer is supported currently.
    • dropout (Number) - If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout. Currently unsupported.
    • train (bool) - A bool identifying whether the op is in training mode. Defaults to true.
    • bidirectional (bool) - If True, becomes a bidirectional LSTM. Currently unsupported.
    • batch_first (bool) - If True, the input and output tensors are provided as (batch, seq, feature). Currently unsupported.
    • flag_seq (bool) - If True, the input is a PackedSequence. Currently unsupported.
    • direction (bool) - If True, the direction is "REDIRECTIONAL"; otherwise, it is "UNIDIRECTIONAL".
  • Returns:

    • y - A 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • output_h - A 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • output_c - A 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • i - A 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • j - A 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • f - A 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • o - A 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • tanhct - A 4D Tensor. Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
  • constraints:

    None

  • Examples:

    None

npu_iou(bboxes, gtboxes, mode=0) -> Tensor
npu_ptiou(bboxes, gtboxes, mode=0) -> Tensor

Computes the intersection over union (IoU) or the intersection over foreground (IoF) based on the ground-truth and predicted regions.

  • Parameters:

    • bboxes (Tensor) - the input tensor (predicted boxes).
    • gtboxes (Tensor) - the input tensor (ground-truth boxes).
    • mode (Number) - 0 selects IoU mode, 1 selects IoF mode.
  • constraints:

    None

  • Examples:

    >>> bboxes = torch.tensor([[0, 0, 10, 10],
                               [10, 10, 20, 20],
                               [32, 32, 38, 42]], dtype=torch.float16).to("npu")
    >>> gtboxes = torch.tensor([[0, 0, 10, 20],
                                [0, 10, 10, 10],
                                [10, 10, 20, 20]], dtype=torch.float16).to("npu")
    >>> output_iou = torch.npu_iou(bboxes, gtboxes, 0)
    >>> output_iou
    tensor([[0.4985, 0.0000, 0.0000],
            [0.0000, 0.0000, 0.0000],
            [0.0000, 0.9961, 0.0000]], device='npu:0', dtype=torch.float16)
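
    The layout of the result can be sanity-checked with a plain-Python reference. The sketch below (an illustration inferred from the example above, not part of the NPU API) computes a single IoU for [x1, y1, x2, y2] boxes; in the printed output, rows index gtboxes and columns index bboxes:

    >>> def iou_ref(gt, bb):
    ...     # overlap region of two [x1, y1, x2, y2] boxes
    ...     xx1, yy1 = max(gt[0], bb[0]), max(gt[1], bb[1])
    ...     xx2, yy2 = min(gt[2], bb[2]), min(gt[3], bb[3])
    ...     inter = max(0, xx2 - xx1) * max(0, yy2 - yy1)
    ...     area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    ...     area_bb = (bb[2] - bb[0]) * (bb[3] - bb[1])
    ...     return inter / (area_gt + area_bb - inter)
    >>> iou_ref([0, 0, 10, 20], [0, 0, 10, 10])  # matches output_iou[0][0] up to float16 rounding
    0.5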

npu_pad(input, paddings) -> Tensor

Pads a tensor

  • Parameters:

    • input (Tensor) - the input tensor.
    • paddings (ListInt) - type int32 or int64.
  • constraints:

    None

  • Examples:

    >>> input = torch.tensor([[20, 20, 10, 10]], dtype=torch.float16).to("npu")
    >>> paddings = [1, 1, 1, 1]
    >>> output = torch.npu_pad(input, paddings)
    >>> output
    tensor([[ 0.,  0.,  0.,  0.,  0.,  0.],
            [ 0., 20., 20., 10., 10.,  0.],
            [ 0.,  0.,  0.,  0.,  0.,  0.]], device='npu:0', dtype=torch.float16)
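
    For reference, the same result can be reproduced on the CPU with torch.nn.functional.pad, assuming paddings is read as one (before, after) pair per dimension (a sketch, not part of the NPU API):

    >>> import torch
    >>> import torch.nn.functional as F
    >>> cpu_input = torch.tensor([[20., 20., 10., 10.]], dtype=torch.float16)
    >>> # F.pad pads the last dimension first: (left, right, top, bottom)
    >>> F.pad(cpu_input, (1, 1, 1, 1))
    tensor([[ 0.,  0.,  0.,  0.,  0.,  0.],
            [ 0., 20., 20., 10., 10.,  0.],
            [ 0.,  0.,  0.,  0.,  0.,  0.]], dtype=torch.float16)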

npu_nms_with_mask(input, iou_threshold) -> (Tensor, Tensor, Tensor)

Generates a 0/1 flag for the NMS operator to mark whether each output box is valid.

  • Parameters:

    • input (Tensor) - the input tensor.
    • iou_threshold (Number) - Threshold. If the value exceeds this threshold, the value is 1. Otherwise, the value is 0.
  • Returns:

    • selected_boxes - 2-D tensor with shape [N,5], representing filtered boxes including proposal boxes and corresponding confidence scores.
    • selected_idx - 1-D tensor with shape [N], representing the indices of the input proposal boxes.
    • selected_mask - 1-D tensor with shape [N], a flag indicating whether each output proposal box is valid.
  • constraints:

    The second dimension of the input (box_scores) must be equal to 8.

  • Examples:

    >>> input = torch.tensor([[0.0, 1.0, 2.0, 3.0, 0.6], [6.0, 7.0, 8.0, 9.0, 0.4]], dtype=torch.float16).to("npu")
    >>> iou_threshold = 0.5
    >>> output1, output2, output3, = torch.npu_nms_with_mask(input, iou_threshold)
    >>> output1
    tensor([[0.0000, 1.0000, 2.0000, 3.0000, 0.6001],
            [6.0000, 7.0000, 8.0000, 9.0000, 0.3999]], device='npu:0',
          dtype=torch.float16)
    >>> output2
    tensor([0, 1], device='npu:0', dtype=torch.int32)
    >>> output3
    tensor([1, 1], device='npu:0', dtype=torch.uint8)

npu_bounding_box_encode(anchor_box, ground_truth_box, means0, means1, means2, means3, stds0, stds1, stds2, stds3) -> Tensor

Computes the coordinate variations between bboxes and ground truth boxes. It is a customized FasterRcnn operator.

  • Parameters:

    • anchor_box (Tensor) - the input tensor. Anchor boxes. A 2D Tensor of float32 with shape (N, 4). "N" indicates the number of bounding boxes, and the value "4" refers to "x0", "x1", "y0", and "y1".
    • ground_truth_box (Tensor) - the input tensor. Ground truth boxes. A 2D Tensor of float32 with shape (N, 4). "N" indicates the number of bounding boxes, and the value "4" refers to "x0", "x1", "y0", and "y1".
    • means0 (Number) - A float attribute.
    • means1 (Number) - A float attribute.
    • means2 (Number) - A float attribute.
    • means3 (Number) - A float attribute. (means0, means1, means2, means3) defaults to [0,0,0,0]. "deltas" = "deltas" x "stds" + "means".
    • stds0 (Number) - A float attribute.
    • stds1 (Number) - A float attribute.
    • stds2 (Number) - A float attribute.
    • stds3 (Number) - A float attribute. (stds0, stds1, stds2, stds3) defaults to [1.0, 1.0, 1.0, 1.0]. "deltas" = "deltas" x "stds" + "means".
  • constraints:

    None

  • Examples:

    >>> anchor_box = torch.tensor([[1., 2., 3., 4.], [3.,4., 5., 6.]], dtype = torch.float32).to("npu")
    >>> ground_truth_box = torch.tensor([[5., 6., 7., 8.], [7.,8., 9., 6.]], dtype = torch.float32).to("npu")
    >>> output = torch.npu_bounding_box_encode(anchor_box, ground_truth_box, 0, 0, 0, 0, 0.1, 0.1, 0.2, 0.2)
    >>> output
    tensor([[13.3281, 13.3281,  0.0000,  0.0000],
            [13.3281,  6.6641,  0.0000, -5.4922]], device='npu:0')
    >>>

npu_bounding_box_decode(rois, deltas, means0, means1, means2, means3, stds0, stds1, stds2, stds3, max_shape, wh_ratio_clip) -> Tensor

Generates bounding boxes based on "rois" and "deltas". It is a customized FasterRcnn operator.

  • Parameters:

    • rois (Tensor) - Region of interests (ROIs) generated by the region proposal network (RPN). A 2D Tensor of type float32 or float16 with shape (N, 4). "N" indicates the number of ROIs, and the value "4" refers to "x0", "x1", "y0", and "y1".
    • deltas (Tensor) - Absolute variation between the ROIs generated by the RPN and ground truth boxes. A 2D Tensor of type float32 or float16 with shape (N, 4). "N" indicates the number of errors, and 4 indicates "dx", "dy", "dw", and "dh" .
    • means0 (Number) - A float attribute.
    • means1 (Number) - A float attribute.
    • means2 (Number) - A float attribute.
    • means3 (Number) - A float attribute. (means0, means1, means2, means3) defaults to [0,0,0,0]. "deltas" = "deltas" x "stds" + "means".
    • stds0 (Number) - A float attribute.
    • stds1 (Number) - A float attribute.
    • stds2 (Number) - A float attribute.
    • stds3 (Number) - A float attribute. (stds0, stds1, stds2, stds3) defaults to [1.0, 1.0, 1.0, 1.0]. "deltas" = "deltas" x "stds" + "means".
    • max_shape (ListInt) - Shape [h, w], specifying the size of the image transferred to the network. Used to ensure that the bbox shape after conversion does not exceed "max_shape".
    • wh_ratio_clip (Number) - Defaults to "16/1000". The values of "dw" and "dh" fall within (-wh_ratio_clip, wh_ratio_clip).
  • constraints:

    None

  • Examples:

    >>> rois = torch.tensor([[1., 2., 3., 4.], [3.,4., 5., 6.]], dtype = torch.float32).to("npu")
    >>> deltas = torch.tensor([[5., 6., 7., 8.], [7.,8., 9., 6.]], dtype = torch.float32).to("npu")
    >>> output = torch.npu_bounding_box_decode(rois, deltas, 0, 0, 0, 0, 1, 1, 1, 1, (10, 10), 0.1)
    >>> output
    tensor([[2.5000, 6.5000, 9.0000, 9.0000],
            [9.0000, 9.0000, 9.0000, 9.0000]], device='npu:0')

npu_gru(input, hx, weight_input, weight_hidden, bias_input, bias_hidden, seq_length, has_biases, num_layers, dropout, train, bidirectional, batch_first) -> (Tensor, Tensor, Tensor, Tensor, Tensor, Tensor)

DynamicGRUV2 calculation.

  • Parameters:

    • input (Tensor) - Must be one of the following types: float16. The format must be FRACTAL_NZ.
    • hx (Tensor) - Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • weight_input (Tensor) - Must be one of the following types: float16. The format must be FRACTAL_Z.
    • weight_hidden (Tensor) - Must be one of the following types: float16. The format must be FRACTAL_Z.
    • bias_input (Tensor) - Must be one of the following types: float16, float32. The format must be ND.
    • bias_hidden (Tensor) - Must be one of the following types: float16, float32. The format must be ND.
    • seq_length (Tensor) - Must be one of the following types: int32. The format must be ND.
    • has_biases (bool) - Defaults to true.
    • num_layers (Number) - Number of recurrent layers.
    • dropout (Number) - Dropout probability.
    • train (bool) - A bool identifying whether the op is in training mode. Defaults to true.
    • bidirectional (bool) - Defaults to true.
    • batch_first (bool) - Defaults to true.
  • Returns:

    • y (Tensor) - Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • output_h (Tensor) - Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • update (Tensor) - Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • reset (Tensor) - Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • new (Tensor) - Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
    • hidden_new (Tensor) - Must be one of the following types: float16, float32. The format must be FRACTAL_NZ.
  • constraints:

    None

  • Examples:

    None

npu_random_choice_with_mask(x, count=256, seed=0, seed2=0) -> (Tensor, Tensor)

Shuffles the indices of non-zero elements.

  • Parameters:

    • x (Tensor) - the input tensor.
    • count (Number) - the number of outputs; if 0, all non-zero elements are output.
    • seed (Number) - type int32 or int64.
    • seed2 (Number) - type int32 or int64.
  • Returns:

    • y - 2-D tensor of non-zero element indices.
    • mask - 1-D tensor indicating whether the corresponding index is valid.
  • constraints:

    None

  • Examples:

    >>> x = torch.tensor([1, 0, 1, 0], dtype=torch.bool).to("npu")
    >>> result, mask = torch.npu_random_choice_with_mask(x, 2, 1, 0)
    >>> result
    tensor([[0],
            [2]], device='npu:0', dtype=torch.int32)
    >>> mask
    tensor([True, True], device='npu:0')
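
    The candidate indices are simply the non-zero positions of x; only the sampling order is random on the device. A CPU sketch of the candidates:

    >>> x_cpu = torch.tensor([1, 0, 1, 0], dtype=torch.bool)
    >>> torch.nonzero(x_cpu)
    tensor([[0],
            [2]])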

npu_batch_nms(self, scores, score_threshold, iou_threshold, max_size_per_class, max_total_size, change_coordinate_frame=False, transpose_box=False) -> (Tensor, Tensor, Tensor, Tensor)

Computes NMS for input boxes and scores, supporting multiple batches and classes. It performs clip-to-window, score filtering, top_k, and NMS.

  • Parameters:

    • self (Tensor) - the input tensor.
    • scores (Tensor) - the input tensor.
    • score_threshold (Number) - A required attribute of type float32, specifying the score threshold for filtering boxes.
    • iou_threshold (Number) - A required attribute of type float32, specifying the IoU threshold for NMS.
    • max_size_per_class (Number) - A required attribute of type int, specifying the NMS output num per class.
    • max_total_size (Number) - A required attribute of type int, specifying the NMS output num per batch.
    • change_coordinate_frame (bool) - An optional attribute of type bool, whether to normalize coordinates after clipping.
    • transpose_box (bool) - An optional attribute of type bool, whether a transpose was inserted before this op. Must be "false".
  • Returns:

    • nmsed_boxes (Tensor) - A 3D Tensor of type float16 with shape (batch, max_total_size, 4), specifying the output nms boxes per batch.
    • nmsed_scores (Tensor) - A 2D Tensor of type float16 with shape (batch, max_total_size), specifying the output nms score per batch.
    • nmsed_classes (Tensor) - A 2D Tensor of type float16 with shape (batch, max_total_size), specifying the output nms class per batch.
    • nmsed_num (Tensor) - A 1D Tensor of type int32 with shape (batch), specifying the valid num of nmsed_boxes.
  • constraints:

    None

  • Examples:

    >>> boxes = torch.randn(8, 2, 4, 4, dtype = torch.float32).to("npu")
    >>> scores = torch.randn(3, 2, 4, dtype = torch.float32).to("npu")
    >>> nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_num = torch.npu_batch_nms(boxes, scores, 0.3, 0.5, 3, 4)
    >>> nmsed_boxes
    >>> nmsed_scores
    >>> nmsed_classes
    >>> nmsed_num

npu_slice(self, offsets, size) -> Tensor

Extracts a slice from a tensor

  • Parameters:

    • self (Tensor) - the input tensor.
    • offsets (ListInt) - type int32 or int64.
    • size (ListInt) - type int32 or int64.
  • constraints:

    None

  • Examples:

    >>> input = torch.tensor([[1,2,3,4,5], [6,7,8,9,10]], dtype=torch.float16).to("npu")
    >>> offsets = [0, 0]
    >>> size = [2, 2]
    >>> output = torch.npu_slice(input, offsets, size)
    >>> output
    tensor([[1., 2.],
            [6., 7.]], device='npu:0', dtype=torch.float16)
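
    This is equivalent to plain slicing with the given offsets and sizes (a CPU sketch):

    >>> cpu_input = torch.tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], dtype=torch.float16)
    >>> cpu_input[0:2, 0:2]
    tensor([[1., 2.],
            [6., 7.]], dtype=torch.float16)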

npu_dropoutV2(self, seed, p) -> (Tensor, Tensor, Tensor(a!))

Computes the dropout result with a given seed.

  • Parameters:

    • self (Tensor) - The input Tensor.
    • seed (Tensor) - The input Tensor.
    • p (Float) - Dropout probability.
  • Returns:

    • y - A tensor with the same shape and type as "x".
    • mask - A tensor with the same shape and type as "x".
    • new_seed - A tensor with the same shape and type as "seed".
  • constraints:

    None

  • Examples:

    >>> input = torch.tensor([1.,2.,3.,4.]).npu()
    >>> input
    tensor([1., 2., 3., 4.], device='npu:0')
    >>> seed = torch.rand((32,),dtype=torch.float32).npu()
    >>> seed
    tensor([0.4368, 0.7351, 0.8459, 0.4657, 0.6783, 0.8914, 0.8995, 0.4401, 0.4408,
          0.4453, 0.2404, 0.9680, 0.0999, 0.8665, 0.2993, 0.5787, 0.0251, 0.6783,
          0.7411, 0.0670, 0.9430, 0.9165, 0.3983, 0.5849, 0.7722, 0.4659, 0.0486,
          0.2693, 0.6451, 0.2734, 0.3176, 0.0176], device='npu:0')
    >>> prob = 0.3
    >>> output, mask, out_seed = torch.npu_dropoutV2(input, seed, prob)
    >>> output
    tensor([0.4408, 0.4453, 0.2404, 0.9680], device='npu:0')
    >>> mask
    tensor([0., 0., 0., 0.], device='npu:0')
    >>> out_seed
    tensor([0.4408, 0.4453, 0.2404, 0.9680, 0.0999, 0.8665, 0.2993, 0.5787, 0.0251,
            0.6783, 0.7411, 0.0670, 0.9430, 0.9165, 0.3983, 0.5849, 0.7722, 0.4659,
            0.0486, 0.2693, 0.6451, 0.2734, 0.3176, 0.0176, 0.0000, 0.0000, 0.0000,
            0.0000, 0.0000, 0.0000, 0.0000, 0.0000], device='npu:0')

_npu_dropout(self, p) -> (Tensor, Tensor)

Computes the dropout result without a seed.

  • Parameters: Similar to torch.dropout; the implementation is optimized for the NPU device.

    • self (Tensor) - The input Tensor.
    • p (Float) - Dropout probability.
  • constraints:

    None

  • Examples:

    >>> input = torch.tensor([1.,2.,3.,4.]).npu()
    >>> input
    tensor([1., 2., 3., 4.], device='npu:0')
    >>> prob = 0.3
    >>> output, mask = torch._npu_dropout(input, prob)
    >>> output
    tensor([0.0000, 2.8571, 0.0000, 0.0000], device='npu:0')
    >>> mask
    tensor([ 98, 255, 188, 186, 120, 157, 175, 159,  77, 223, 127,  79, 247, 151,
          253, 255], device='npu:0', dtype=torch.uint8)
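
    As with standard inverted dropout, surviving elements are scaled by 1/(1-p), which is why the kept value 2. appears as 2.8571 in the output above:

    >>> round(2. / (1 - 0.3), 4)
    2.8571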

_npu_dropout_inplace(result, p) -> (Tensor(a!), Tensor)

Computes the dropout result in place.

  • Parameters: Similar to torch.dropout_; the implementation is optimized for the NPU device.

    • result (Tensor) - The Tensor dropout inplace.
    • p (Float) - Dropout probability.
  • constraints:

    None

  • Examples:

    >>> input = torch.tensor([1.,2.,3.,4.]).npu()
    >>> input
    tensor([1., 2., 3., 4.], device='npu:0')
    >>> prob = 0.3
    >>> output, mask = torch._npu_dropout_inplace(input, prob)
    >>> output
    tensor([0.0000, 2.8571, 0.0000, 0.0000], device='npu:0')
    >>> input
    tensor([0.0000, 2.8571, 4.2857, 5.7143], device='npu:0')
    >>> mask
    tensor([ 98, 255, 188, 186, 120, 157, 175, 159,  77, 223, 127,  79, 247, 151,
          253, 255], device='npu:0', dtype=torch.uint8)

npu_indexing(self, begin, end, strides, begin_mask=0, end_mask=0, ellipsis_mask=0, new_axis_mask=0, shrink_axis_mask=0) -> Tensor

Computes the indexing result given the begin, end, and strides arrays.

  • Parameters:

    • self (Tensor) - A Input Tensor.
    • begin (ListInt) - The index of the first value to select.
    • end (ListInt) - The index of the last value to select.
    • strides (ListInt) - The index increment.
    • begin_mask (Number) - A bitmask where a bit "i" being "1" means to ignore the begin value and instead use the largest interval possible.
    • end_mask (Number) - Analogous to "begin_mask".
    • ellipsis_mask (Number) - A bitmask where bit "i" being "1" means the "i"th position is actually an ellipsis.
    • new_axis_mask (Number) - A bitmask where bit "i" being "1" means the "i"th specification creates a new shape 1 dimension.
    • shrink_axis_mask (Number) - A bitmask where bit "i" implies that the "i"th specification should shrink the dimensionality.
  • constraints:

    None

  • Examples:

    >>> input = torch.tensor([[1, 2, 3, 4],[5, 6, 7, 8]], dtype=torch.int32).to("npu")
    >>> input
    tensor([[1, 2, 3, 4],
          [5, 6, 7, 8]], device='npu:0', dtype=torch.int32)
    >>> output = torch.npu_indexing(input, [0, 0], [2, 2], [1, 1])
    >>> output
    tensor([[1, 2],
          [5, 6]], device='npu:0', dtype=torch.int32)
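
    With all masks left at 0, this matches a per-dimension strided slice input[begin:end:stride] (a CPU sketch):

    >>> cpu_input = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=torch.int32)
    >>> cpu_input[0:2:1, 0:2:1]
    tensor([[1, 2],
            [5, 6]], dtype=torch.int32)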

npu_ifmr(Tensor data, Tensor data_min, Tensor data_max, Tensor cumsum, float min_percentile, float max_percentile, float search_start, float search_end, float search_step, bool with_offset) -> (Tensor, Tensor)

Computes the IFMR (Input Feature Map Reconstruction) result, searching for the optimal quantization scale and offset.

  • Parameters:

    • data (Tensor) - A Tensor of feature map.
    • data_min (Tensor) - A Tensor of min value of feature map.
    • data_max (Tensor) - A Tensor of max value of feature map.
    • cumsum (Tensor) - A Tensor of cumsum bin of data.
    • min_percentile (Float) - min init percentile.
    • max_percentile (Float) - max init percentile.
    • search_start (Float) - search start.
    • search_end (Float) - search end.
    • search_step (Float) - step size of searching.
    • with_offset (bool) - whether using offset.
  • Returns:

    • scale - optimal scale.
    • offset - optimal offset.
  • constraints:

    None

  • Examples:

    >>> input = torch.rand((2,2,3,4),dtype=torch.float32).npu()
    >>> input
    tensor([[[[0.4508, 0.6513, 0.4734, 0.1924],
              [0.0402, 0.5502, 0.0694, 0.9032],
              [0.4844, 0.5361, 0.9369, 0.7874]],
    
            [[0.5157, 0.1863, 0.4574, 0.8033],
              [0.5986, 0.8090, 0.7605, 0.8252],
              [0.4264, 0.8952, 0.2279, 0.9746]]],
    
            [[[0.0803, 0.7114, 0.8773, 0.2341],
              [0.6497, 0.0423, 0.8407, 0.9515],
              [0.1821, 0.5931, 0.7160, 0.4968]],
      
            [[0.7977, 0.0899, 0.9572, 0.0146],
              [0.2804, 0.8569, 0.2292, 0.1118],
              [0.5747, 0.4064, 0.8370, 0.1611]]]], device='npu:0')
    >>> min_value = torch.min(input)
    >>> min_value
    tensor(0.0146, device='npu:0')
    >>> max_value = torch.max(input)
    >>> max_value
    tensor(0.9746, device='npu:0')
    >>> hist = torch.histc(input.to('cpu'),
                           bins=128,
                           min=min_value.to('cpu'),
                           max=max_value.to('cpu'))
    >>> hist
    tensor([1., 0., 0., 2., 0., 0., 0., 1., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0.,
            0., 1., 0., 0., 2., 1., 0., 0., 0., 0., 2., 1., 0., 0., 0., 0., 0., 1.,
            0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.,
            1., 0., 0., 0., 1., 1., 0., 1., 1., 0., 1., 0., 1., 0., 0., 1., 0., 1.,
            0., 0., 1., 0., 0., 2., 0., 0., 0., 0., 0., 0., 2., 0., 0., 0., 0., 0.,
            0., 0., 1., 1., 0., 0., 0., 0., 0., 1., 0., 0., 0., 1., 1., 2., 0., 0.,
            1., 1., 1., 0., 1., 0., 0., 1., 0., 1., 1., 0., 0., 0., 1., 0., 1., 1.,
            0., 1.])
    >>> cdf = torch.cumsum(hist,dim=0).int().npu()
    >>> cdf
    tensor([ 1,  1,  1,  3,  3,  3,  3,  4,  5,  5,  6,  6,  7,  7,  7,  7,  7,  7,
             7,  8,  8,  8, 10, 11, 11, 11, 11, 11, 13, 14, 14, 14, 14, 14, 14, 15,
            15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 16, 16,
            17, 17, 17, 17, 18, 19, 19, 20, 21, 21, 22, 22, 23, 23, 23, 24, 24, 25,
            25, 25, 26, 26, 26, 28, 28, 28, 28, 28, 28, 28, 30, 30, 30, 30, 30, 30,
            30, 30, 31, 32, 32, 32, 32, 32, 32, 33, 33, 33, 33, 34, 35, 37, 37, 37,
            38, 39, 40, 40, 41, 41, 41, 42, 42, 43, 44, 44, 44, 44, 45, 45, 46, 47,
            47, 48], device='npu:0', dtype=torch.int32)
    >>> scale, offset = torch.npu_ifmr(input,
                                       min_value,
                                       max_value,
                                       cdf,
                                       min_percentile=0.999999,
                                       max_percentile=0.999999,
                                       search_start=0.7,
                                       search_end=1.3,
                                       search_step=0.01,
                                       with_offset=False)
    >>> scale
    tensor(0.0080, device='npu:0')
    >>> offset
    tensor(0., device='npu:0')

npu_max.dim(self, dim, keepdim=False) -> (Tensor, Tensor)

Computes the max result along a given dimension.

  • Parameters: Similar to torch.max; the implementation is optimized for the NPU device.

    • self (Tensor) – the input tensor.
    • dim (Number) – the dimension to reduce.
    • keepdim (bool) – whether the output tensor has dim retained or not.
  • Returns:

    • values - max values in the input tensor.
    • indices - index of max values in the input tensor.
  • constraints:

    None

  • Examples:

    >>> input = torch.randn(2, 2, 2, 2, dtype = torch.float32).npu()
    >>> input
    tensor([[[[-1.8135,  0.2078],
              [-0.6678,  0.7846]],
    
            [[ 0.6458, -0.0923],
              [-0.2124, -1.9112]]],
    
            [[[-0.5800, -0.4979],
              [ 0.2580,  1.1335]],
      
            [[ 0.6669,  0.1876],
              [ 0.1160, -0.1061]]]], device='npu:0')
    >>> outputs, indices = torch.npu_max(input, 2)
    >>> outputs
    tensor([[[-0.6678,  0.7846],
            [ 0.6458, -0.0923]],
    
            [[ 0.2580,  1.1335],
            [ 0.6669,  0.1876]]], device='npu:0')
    >>> indices
    tensor([[[1, 1],
            [0, 0]],
    
            [[1, 1],
            [0, 0]]], device='npu:0', dtype=torch.int32)
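
    Apart from returning int32 indices, this corresponds to torch.max(input, dim) (a CPU sketch):

    >>> x_cpu = torch.randn(2, 2, 2, 2)
    >>> values, indices = torch.max(x_cpu, dim=2)
    >>> values.shape
    torch.Size([2, 2, 2])
    >>> indices.dtype  # torch.max returns int64 indices; npu_max returns int32
    torch.int64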

npu_min.dim(self, dim, keepdim=False) -> (Tensor, Tensor)

Computes the min result along a given dimension.

  • Parameters: Similar to torch.min; the implementation is optimized for the NPU device.

    • self (Tensor) – the input tensor.
    • dim (Number) – the dimension to reduce.
    • keepdim (bool) – whether the output tensor has dim retained or not.
  • Returns:

    • values - min values in the input tensor.
    • indices - index of min values in the input tensor.
  • constraints:

    None

  • Examples:

    >>> input = torch.randn(2, 2, 2, 2, dtype = torch.float32).npu()
    >>> input
    tensor([[[[-0.9909, -0.2369],
              [-0.9569, -0.6223]],
    
            [[ 0.1157, -0.3147],
              [-0.7761,  0.1344]]],
    
            [[[ 1.6292,  0.5953],
              [ 0.6940, -0.6367]],
      
            [[-1.2335,  0.2131],
              [ 1.0748, -0.7046]]]], device='npu:0')
    >>> outputs, indices = torch.npu_min(input, 2)
    >>> outputs
    tensor([[[-0.9909, -0.6223],
            [-0.7761, -0.3147]],
    
            [[ 0.6940, -0.6367],
            [-1.2335, -0.7046]]], device='npu:0')
    >>> indices
    tensor([[[0, 1],
            [1, 0]],
    
            [[1, 1],
            [0, 1]]], device='npu:0', dtype=torch.int32)

npu_scatter(self, indices, updates, dim) -> Tensor

Computes the scatter result along a given dimension.

  • Parameters: Similar to torch.scatter; the implementation is optimized for the NPU device.

    • self (Tensor) - the input tensor.
    • indices (Tensor) – the indices of elements to scatter, can be either empty or of the same dimensionality as src. When empty, the operation returns self unchanged.
    • updates (Tensor) – the source element(s) to scatter.
    • dim (Number) – the axis along which to index.

  • constraints:

    None

  • Examples:

    >>> input    = torch.tensor([[1.6279, 0.1226], [0.9041, 1.0980]]).npu()
    >>> input
    tensor([[1.6279, 0.1226],
            [0.9041, 1.0980]], device='npu:0')
    >>> indices  = torch.tensor([0, 1],dtype=torch.int32).npu()
    >>> indices
    tensor([0, 1], device='npu:0', dtype=torch.int32)
    >>> updates  = torch.tensor([-1.1993, -1.5247]).npu()
    >>> updates
    tensor([-1.1993, -1.5247], device='npu:0')
    >>> dim = 0
    >>> output = torch.npu_scatter(input, indices, updates, dim)
    >>> output
    tensor([[-1.1993,  0.1226],
            [ 0.9041, -1.5247]], device='npu:0')

npu_layer_norm_eval(input, normalized_shape, weight=None, bias=None, eps=1e-05) -> Tensor

Computes the layer norm result.

  • Parameters: The same as torch.nn.functional.layer_norm; the implementation is optimized for the NPU device.

    • input (Tensor) - The input Tensor.
    • normalized_shape (ListInt) – input shape from an expected input of size.
    • weight (Tensor) - The gamma Tensor.
    • bias (Tensor) - The beta Tensor.
    • eps (Float) – The epsilon value added to the denominator for numerical stability. Default: 1e-5.
  • constraints:

    None

  • Examples:

    >>> input = torch.rand((6, 4), dtype=torch.float32).npu()
    >>> input
    tensor([[0.1863, 0.3755, 0.1115, 0.7308],
            [0.6004, 0.6832, 0.8951, 0.2087],
            [0.8548, 0.0176, 0.8498, 0.3703],
            [0.5609, 0.0114, 0.5021, 0.1242],
            [0.3966, 0.3022, 0.2323, 0.3914],
            [0.1554, 0.0149, 0.1718, 0.4972]], device='npu:0')
    >>> normalized_shape = input.size()[1:]
    >>> normalized_shape
    torch.Size([4])
    >>> weight = torch.Tensor(*normalized_shape).npu()
    >>> weight
    tensor([        nan,  6.1223e-41, -8.3159e-20,  9.1834e-41], device='npu:0')
    >>> bias = torch.Tensor(*normalized_shape).npu()
    >>> bias
    tensor([5.6033e-39, 6.1224e-41, 6.1757e-39, 6.1224e-41], device='npu:0')
    >>> output = torch.npu_layer_norm_eval(input, normalized_shape, weight, bias, 1e-5)
    >>> output
    tensor([[        nan,  6.7474e-41,  8.3182e-20,  2.0687e-40],
            [        nan,  8.2494e-41, -9.9784e-20, -8.2186e-41],
            [        nan, -2.6695e-41, -7.7173e-20,  2.1353e-41],
            [        nan, -1.3497e-41, -7.1281e-20, -6.9827e-42],
            [        nan,  3.5663e-41,  1.2002e-19,  1.4314e-40],
            [        nan, -6.2792e-42,  1.7902e-20,  2.1050e-40]], device='npu:0')

npu_alloc_float_status(self) -> Tensor

Produces a tensor containing eight zeros, used to record the floating-point status.

  • Parameters:

    • self (Tensor) - Any Tensor
  • constraints:

    None

  • Examples:

    >>> input    = torch.randn([1,2,3]).npu()
    >>> output = torch.npu_alloc_float_status(input)
    >>> input
    tensor([[[ 2.2324,  0.2478, -0.1056],
            [ 1.1273, -0.2573,  1.0558]]], device='npu:0')
    >>> output
    tensor([0., 0., 0., 0., 0., 0., 0., 0.], device='npu:0')

npu_get_float_status(self) -> Tensor

Reads the NPU floating-point status.

  • Parameters:

    • self (Tensor) - A Tensor of data memory address. Must be float32.
  • Constraints:

    None

  • Examples:

    >>> x = torch.rand(2).npu()
    >>> torch.npu_get_float_status(x)
    tensor([0., 0., 0., 0., 0., 0., 0., 0.], device='npu:0')

npu_clear_float_status(self) -> Tensor

Sets the value at address 0x40000 to 0 in each core.

  • Parameters:

    • self (Tensor) - A tensor of type float32.
  • Constraints:

    None

  • Examples:

    >>> x = torch.rand(2).npu()
    >>> torch.npu_clear_float_status(x)
    tensor([0., 0., 0., 0., 0., 0., 0., 0.], device='npu:0')

npu_confusion_transpose(self, perm, shape, transpose_first) -> Tensor

Fuses reshape and transpose.

  • Parameters:

    • self (Tensor) - A Tensor. Must be one of the following types: float16, float32, int8, int16, int32, int64, uint8, uint16, uint32, uint64.
    • perm (ListInt) - A permutation of the dimensions of "x".
    • shape (ListInt) - The shape used for the reshape step.
    • transpose_first (bool) - If True, the transpose is first, otherwise the reshape is first.
  • Constraints:

    None

  • Examples:

    >>> x = torch.rand(2, 3, 4, 6).npu()
    >>> x.shape
    torch.Size([2, 3, 4, 6])
    >>> y = torch.npu_confusion_transpose(x, (0, 2, 1, 3), (2, 4, 18), True)
    >>> y.shape
    torch.Size([2, 4, 18])
    >>> y2 = torch.npu_confusion_transpose(x, (0, 2, 1), (2, 12, 6), False)
    >>> y2.shape
    torch.Size([2, 6, 12])
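
    The same shapes can be reproduced with separate permute/reshape calls, which is a handy sanity check (a CPU sketch; npu_confusion_transpose fuses the two steps on the device):

    >>> x_cpu = torch.rand(2, 3, 4, 6)
    >>> # transpose_first=True: permute, then reshape to the target shape
    >>> x_cpu.permute(0, 2, 1, 3).reshape(2, 4, 18).shape
    torch.Size([2, 4, 18])
    >>> # transpose_first=False: reshape first, then permute
    >>> x_cpu.reshape(2, 12, 6).permute(0, 2, 1).shape
    torch.Size([2, 6, 12])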

npu_bmmV2(self, mat2, output_sizes) -> Tensor

Multiplies matrix "a" by matrix "b", producing "a * b".

  • Parameters:

    • self (Tensor) - A matrix Tensor. Must be one of the following types: float16, float32, int32. 2D or higher. Has format [ND, NHWC, FRACTAL_NZ].
    • mat2 (Tensor) - A matrix Tensor. Must be one of the following types: float16, float32, int32. 2D or higher. Has format [ND, NHWC, FRACTAL_NZ].
    • output_sizes (ListInt) - Output's shape, used in matmul's backpropagation, default [].
  • Constraints:

    None

  • Examples:

    >>> mat1 = torch.randn(10, 3, 4).npu()
    >>> mat2 = torch.randn(10, 4, 5).npu()
    >>> res = torch.npu_bmmV2(mat1, mat2, [])
    >>> res.shape
    torch.Size([10, 3, 5])
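
    For shape checking, the result matches a plain batched matrix multiplication (a CPU sketch):

    >>> a = torch.randn(10, 3, 4)
    >>> b = torch.randn(10, 4, 5)
    >>> torch.matmul(a, b).shape
    torch.Size([10, 3, 5])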

fast_gelu(self) -> Tensor

Computes the fast_gelu activation of "x".

  • Parameters:

    • self (Tensor) - A Tensor. Must be one of the following types: float16, float32
  • Constraints:

    None

  • Examples:

    >>> x = torch.rand(2).npu()
    >>> x
    tensor([0.5991, 0.4094], device='npu:0')
    >>> torch.fast_gelu(x)
    tensor([0.4403, 0.2733], device='npu:0')
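
    The printed values agree with the commonly documented FastGelu approximation below (a CPU sketch under that assumption; the exact on-device formula is not specified here):

    >>> import torch
    >>> def fast_gelu_ref(x):
    ...     # fast_gelu(x) ~= x * exp(0.851 * (x - |x|)) / (1 + exp(-1.702 * |x|))
    ...     return x * torch.exp(0.851 * (x - x.abs())) / (1 + torch.exp(-1.702 * x.abs()))
    >>> fast_gelu_ref(torch.tensor([0.5991, 0.4094]))
    tensor([0.4403, 0.2733])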

npu_sub_sample(self, per_images, positive_fraction) -> Tensor

Randomly samples a subset of positive and negative examples, and overwrites the label vector with the ignore value (-1) for all elements that are not included in the sample.

  • Parameters:

    • self (Tensor) - A label vector with shape (N, ).
    • per_images (Number) - A required attribute of type int.
    • positive_fraction (Float) - A required attribute of type float.
  • Constraints:

    None

  • Examples:

    >>> x = torch.tensor([-2, 3, 6, -7, -2, 8, 1, -5, 7, 4]).int().npu()
    >>> x
    tensor([-2,  3,  6, -7, -2,  8,  1, -5,  7,  4], device='npu:0',
          dtype=torch.int32)
    >>> torch.npu_sub_sample(x, 5, 0.6)
    tensor([-1, -1, -1, -1, -1, -1,  1, -1, -1, -1], device='npu:0',
          dtype=torch.int32)

npu_deformable_conv2d(input, weight, offset, bias, kernel_size, stride, padding, dilation=[1,1,1,1], groups=1, deformable_groups=1, modulated=True) -> (Tensor, Tensor)

Computes the deformed convolution output with the expected input.

  • Parameters:

    • input (Tensor) - A 4D tensor of the input image. With the format "NHWC", the data is stored in the order of: [batch, in_height, in_width, in_channels].
    • weight (Tensor) - A 4D tensor of learnable filters. Must have the same type as "x". With the format "HWCN" , the data is stored in the order of: [filter_height, filter_width, in_channels / groups, out_channels].
    • offset (Tensor) - A 4D tensor of x-y coordinates offset and mask. With the format "NHWC", the data is stored in the order of: [batch, out_height, out_width, deformable_groups * filter_height * filter_width * 3].
    • bias (Tensor) - An optional 1D tensor of additive biases to the filter outputs. The data is stored in the order of: [out_channels].
    • kernel_size (ListInt) - A tuple/list of 2 integers. The kernel size.
    • stride (ListInt) - Required. A list of 4 integers. The stride of the sliding window for each dimension of input. The dimension order is interpreted according to the data format of "x". The N and C dimensions must be set to 1.
    • padding (ListInt) - Required. A list of 4 integers. The number of pixels to add to each (top, bottom, left, right) side of the input.
    • dilation (ListInt) - Optional. A list of 4 integers. The dilation factor for each dimension of input. The dimension order is interpreted according to the data format of "x". The N and C dimensions must be set to 1. Defaults to [1, 1, 1, 1].
    • groups (Number) - Optional. An integer of type int32. The number of blocked connections from input channels to output channels. In_channels and out_channels must both be divisible by "groups". Defaults to 1.
    • deformable_groups (Number) - Optional. An integer of type int32. The number of deformable group partitions. In_channels must be divisible by "deformable_groups". Defaults to 1.
    • modulated (bool) - Optional. Specifies the version of DeformableConv2D: true means v2, false means v1. Currently only v2 is supported.
  • Constraints:

    None

  • Examples:

    >>> x = torch.rand(16, 32, 32, 32).npu()
    >>> weight = torch.rand(32, 32, 5, 5).npu()
    >>> offset = torch.rand(16, 75, 32, 32).npu()
    >>> output, _ = torch.npu_deformable_conv2d(x, weight, offset, None, kernel_size=[5, 5], stride = [1, 1, 1, 1], padding = [2, 2, 2, 2])
    >>> output.shape
    torch.Size([16, 32, 32, 32])

npu_mish(self) -> Tensor

Computes the Mish activation of "x" element-wise.

  • Parameters:

    • self (Tensor) - A Tensor. Must be one of the following types: float16, float32.
  • Constraints:

    None

  • Examples:

    >>> x = torch.rand(10, 30, 10).npu()
    >>> y = torch.npu_mish(x)
    >>> y.shape
    torch.Size([10, 30, 10])
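
    Mish is defined as x * tanh(softplus(x)); a CPU reference sketch for comparison:

    >>> import torch.nn.functional as F
    >>> x_cpu = torch.rand(10, 30, 10)
    >>> y_ref = x_cpu * torch.tanh(F.softplus(x_cpu))
    >>> y_ref.shape
    torch.Size([10, 30, 10])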

npu_anchor_response_flags(self, featmap_size, stride, num_base_anchors) -> Tensor

Generates the responsible flags of anchors in a single feature map.

  • Parameters:

    • self (Tensor) - Ground truth box, 2-D Tensor with shape [batch, 4].
    • featmap_size (ListInt) - The size of the feature maps.
    • stride (ListInt) - Stride of the current level.
    • num_base_anchors (Number) - The number of base anchors.
  • Constraints:

    None

  • Examples:

    >>> x = torch.rand(100, 4).npu()
    >>> y = torch.npu_anchor_response_flags(x, [60, 60], [2, 2], 9)
    >>> y.shape
    torch.Size([32400])

npu_yolo_boxes_encode(self, gt_bboxes, stride, performance_mode=False) -> Tensor

Generates bounding boxes based on yolo's "anchor" and "ground-truth" boxes. It is a customized mmdetection operator.

  • Parameters:

    • self (Tensor) - anchor boxes generated by the yolo training set. A 2D Tensor of type float32 or float16 with shape (N, 4). "N" indicates the number of ROIs, and the value "4" refers to (tx, ty, tw, th).
    • gt_bboxes (Tensor) - target of the transformation, e.g., ground-truth boxes. A 2D Tensor of type float32 or float16 with shape (N, 4). "N" indicates the number of ROIs, and 4 indicates "dx", "dy", "dw", and "dh".
    • stride (Tensor) - Scale for each box. A 1D Tensor of type int32 with shape (N,). "N" indicates the number of ROIs.
    • performance_mode (bool) - Selects the performance mode, "high_precision" or "high_performance". With "high_precision" and float32 input, the output precision error is smaller than 0.0001; with "high_performance" and float32 input, the op runs at best performance, but the precision error is only smaller than 0.005.

  • Constraints:

    Input anchor boxes only support a maximum of N=20480.

  • Examples:

    >>> anchor_boxes = torch.rand(2, 4).npu()
    >>> gt_bboxes = torch.rand(2, 4).npu()
    >>> stride = torch.tensor([2, 2], dtype=torch.int32).npu()
    >>> output = torch.npu_yolo_boxes_encode(anchor_boxes, gt_bboxes, stride, False)
    >>> output.shape
    torch.Size([2, 4])

npu_grid_assign_positive(self, overlaps, box_responsible_flags, max_overlaps, argmax_overlaps, gt_max_overlaps, gt_argmax_overlaps, num_gts, pos_iou_thr, min_pos_iou, gt_max_assign_all) -> Tensor

Assigns positive labels to bounding boxes based on IoU overlaps and responsible flags. It is a customized mmdetection operator.

  • Parameters:

    • self (Tensor) - Tensor of type float16 or float32, shape (n, )
    • overlaps (Tensor) - A Tensor. Datatype is same as assigned_gt_inds. IOU between gt_bboxes and bboxes. shape(k, n)
    • box_responsible_flags (Tensor) - A Tensor. Support uint8. Flag to indicate whether box is responsible.
    • max_overlaps (Tensor) - A Tensor. Datatype is same as assigned_gt_inds. overlaps.max(axis=0).
    • argmax_overlaps (Tensor) - A Tensor. Support int32. overlaps.argmax(axis=0).
    • gt_max_overlaps (Tensor) - A Tensor. Datatype is same as assigned_gt_inds. overlaps.max(axis=1).
    • gt_argmax_overlaps (Tensor) - A Tensor. Support int32. overlaps.argmax(axis=1).
    • num_gts (Number) - The actual number of ground-truth boxes, k.
    • pos_iou_thr (Float) - IoU threshold for positive bboxes.
    • min_pos_iou (Float) - Minimum IoU for a bbox to be considered a positive bbox.
    • gt_max_assign_all (bool) - Whether to assign all bboxes with the same highest overlap with some gt to that gt.
  • Constraints:

    None

  • Examples:

    >>> assigned_gt_inds = torch.rand(4).npu()
    >>> overlaps = torch.rand(2,4).npu()
    >>> box_responsible_flags = torch.tensor([1, 1, 1, 0], dtype=torch.uint8).npu()
    >>> max_overlap = torch.rand(4).npu()
    >>> argmax_overlap = torch.tensor([1, 0, 1, 0], dtype=torch.int32).npu()
    >>> gt_max_overlaps = torch.rand(2).npu()
    >>> gt_argmax_overlaps = torch.tensor([1, 0],dtype=torch.int32).npu()
    >>> output = torch.npu_grid_assign_positive(assigned_gt_inds, overlaps, box_responsible_flags, max_overlap, argmax_overlap, gt_max_overlaps, gt_argmax_overlaps, 128, 0.5, 0., True)
    >>> output.shape
    torch.Size([4])

npu_normalize_batch(self, seq_len, normalize_type=0) -> Tensor

Performs batch normalization.

  • Parameters:

    • self (Tensor) - A Tensor. Support float32. shape (n, c, d).
    • seq_len (Tensor) - A Tensor. Each batch normalize data num. Support Int32. Shape (n, ).
    • normalize_type (Number) - An int selecting the normalization mode, corresponding to "per_feature" or "all_features".
  • constraints:

    None

  • Examples:

    >>> a=np.random.uniform(1,10,(2,3,6)).astype(np.float32)
    >>> b=np.random.uniform(3,6,(2)).astype(np.int32)
    >>> x=torch.from_numpy(a).to("npu")
    >>> seqlen=torch.from_numpy(b).to("npu")
    >>> out = torch.npu_normalize_batch(x, seqlen, 0)
    >>> out
    tensor([[[ 1.1496, -0.6685, -0.4812,  1.7611, -0.5187,  0.7571],
            [ 1.1445, -0.4393, -0.7051,  1.0474, -0.2646, -0.1582],
            [ 0.1477,  0.9179, -1.0656, -6.8692, -6.7437,  2.8621]],
    
            [[-0.6880,  0.1337,  1.3623, -0.8081, -1.2291, -0.9410],
            [ 0.3070,  0.5489, -1.4858,  0.6300,  0.6428,  0.0433],
            [-0.5387,  0.8204, -1.1401,  0.8584, -0.3686,  0.8444]]],
          device='npu:0')

npu_masked_fill_range(self, start, end, value, axis=-1) -> Tensor

Fills a tensor with a value along one axis over the given ranges. It is a customized masked-fill-range operator.

  • Parameters:

    • self (Tensor) - input tensor. An ND Tensor of float32/float16/int32/int8 with shapes 1-D (D,), 2-D (N, D), or 3-D (N, C, D).
    • start (Tensor) - masked fill start positions. A 2D Tensor of int32 with shape (num, N).
    • end (Tensor) - masked fill end positions. A 2D Tensor of int32 with shape (num, N).
    • value (Tensor) - masked fill value. A 1D Tensor of float32/float16/int32/int8 with shape (num,).
    • axis (Number) - axis with masked fill of int32. Defaults to -1.
  • constraints:

    None

  • Examples:

    >>> a=torch.rand(4,4).npu()
    >>> a
    tensor([[0.9419, 0.4919, 0.2874, 0.6560],
            [0.6691, 0.6668, 0.0330, 0.1006],
            [0.3888, 0.7011, 0.7141, 0.7878],
            [0.0366, 0.9738, 0.4689, 0.0979]], device='npu:0')
    >>> start = torch.tensor([[0,1,2]], dtype=torch.int32).npu()
    >>> end = torch.tensor([[1,2,3]], dtype=torch.int32).npu()
    >>> value = torch.tensor([1], dtype=torch.float).npu()
    >>> out = torch.npu_masked_fill_range(a, start, end, value, 1)
    >>> out
    tensor([[1.0000, 0.4919, 0.2874, 0.6560],
            [0.6691, 1.0000, 0.0330, 0.1006],
            [0.3888, 0.7011, 1.0000, 0.7878],
            [0.0366, 0.9738, 0.4689, 0.0979]], device='npu:0')
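
    A CPU sketch of the semantics in the example above (axis=1, a hypothetical reference, not part of the API): row i is overwritten with value over columns [start[0][i], end[0][i]), and rows without a (start, end) entry are left unchanged:

    >>> a_cpu = torch.rand(4, 4)
    >>> out_ref = a_cpu.clone()
    >>> for i, (s, e) in enumerate(zip([0, 1, 2], [1, 2, 3])):
    ...     out_ref[i, s:e] = 1.0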

npu_linear(input, weight, bias=None) -> Tensor

Multiplies the input matrix by the transposed weight matrix and adds bias, producing "input * weight^T + bias".

  • Parameters:

    • input (Tensor) - A matrix Tensor. 2D. Must be one of the following types: float32, float16, int32, int8. Has format [ND, NHWC, FRACTAL_NZ].
    • weight (Tensor) - A matrix Tensor. 2D. Must be one of the following types: float32, float16, int32, int8. Has format [ND, NHWC, FRACTAL_NZ].
    • bias (Tensor) - A 1D Tensor. Must be one of the following types: float32, float16, int32. Has format [ND, NHWC].
  • constraints:

    None

  • Examples:

    >>> x=torch.rand(2,16).npu()
    >>> w=torch.rand(4,16).npu()
    >>> b=torch.rand(4).npu()
    >>> output = torch.npu_linear(x, w, b)
    >>> output
    tensor([[3.6335, 4.3713, 2.4440, 2.0081],
            [5.3273, 6.3089, 3.9601, 3.2410]], device='npu:0')
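
    From the example shapes, npu_linear computes input @ weight.T + bias, matching torch.nn.functional.linear (a CPU sketch):

    >>> import torch.nn.functional as F
    >>> x_cpu, w_cpu, b_cpu = torch.rand(2, 16), torch.rand(4, 16), torch.rand(4)
    >>> F.linear(x_cpu, w_cpu, b_cpu).shape
    torch.Size([2, 4])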

npu_bert_apply_adam.old(Tensor(a!) var, Tensor(b!) m, Tensor(c!) v, lr, beta1, beta2, epsilon, grad, max_grad_norm, global_grad_norm, weight_decay, step_size=None, adam_mode=0) -> (Tensor(a!), Tensor(b!), Tensor(c!))

Computes the Adam optimizer update.

  • Parameters:

    • var (Tensor) - A Tensor. Support float16/float32.
    • m (Tensor) - A Tensor. Datatype and shape are same as exp_avg.
    • v (Tensor) - A Tensor. Datatype and shape are same as exp_avg.
    • lr (Number) - Datatype is same as exp_avg.
    • beta1 (Number) - Datatype is same as exp_avg.
    • beta2 (Number) - Datatype is same as exp_avg.
    • epsilon (Number) - Datatype is same as exp_avg.
    • grad (Tensor) - A Tensor. Datatype and shape are same as exp_avg.
    • max_grad_norm (Number) - Datatype is same as exp_avg.
    • global_grad_norm (Number) - Datatype is same as exp_avg.
    • weight_decay (Number) - Datatype is same as exp_avg.
  • constraints:

    None

  • Examples:

    >>> var_in = torch.rand(321538).uniform_(-32., 21.).npu()
    >>> m_in = torch.zeros(321538).npu()
    >>> v_in = torch.zeros(321538).npu()
    >>> grad = torch.rand(321538).uniform_(-0.05, 0.03).npu()
    >>> max_grad_norm = -1.
    >>> beta1 = 0.9
    >>> beta2 = 0.99
    >>> weight_decay = 0.
    >>> lr = 0.
    >>> epsilon = 1e-06
    >>> global_grad_norm = 0.
    >>> var_out, m_out, v_out = torch.npu_bert_apply_adam(var_in, m_in, v_in, lr, beta1, beta2, epsilon, grad, max_grad_norm, global_grad_norm, weight_decay)
    >>> var_out
    tensor([ 14.7733, -30.1218,  -1.3647,  ..., -16.6840,   7.1518,   8.4872],
          device='npu:0')

npu_bert_apply_adam(lr, beta1, beta2, epsilon, grad, max_grad_norm, global_grad_norm, weight_decay, step_size=None, adam_mode=0, *, out=(var,m,v))

Computes the Adam optimizer update.

  • Parameters:

    • var (Tensor) - A Tensor. Support float16/float32.
    • m (Tensor) - A Tensor. Datatype and shape are same as exp_avg.
    • v (Tensor) - A Tensor. Datatype and shape are same as exp_avg.
    • lr (Number) - Datatype is same as exp_avg.
    • beta1 (Number) - Datatype is same as exp_avg.
    • beta2 (Number) - Datatype is same as exp_avg.
    • epsilon (Number) - Datatype is same as exp_avg.
    • grad (Tensor) - A Tensor. Datatype and shape are same as exp_avg.
    • max_grad_norm (Number) - Datatype is same as exp_avg.
    • global_grad_norm (Number) - Datatype is same as exp_avg.
    • weight_decay (Number) - Datatype is same as exp_avg.
  • Keyword Arguments:

    • out - A tuple of three Tensors (var, m, v), optional. The output tensors.
  • constraints:

    None

  • Examples:

    >>> var_in = torch.rand(321538).uniform_(-32., 21.).npu()
    >>> m_in = torch.zeros(321538).npu()
    >>> v_in = torch.zeros(321538).npu()
    >>> grad = torch.rand(321538).uniform_(-0.05, 0.03).npu()
    >>> max_grad_norm = -1.
    >>> beta1 = 0.9
    >>> beta2 = 0.99
    >>> weight_decay = 0.
    >>> lr = 0.
    >>> epsilon = 1e-06
    >>> global_grad_norm = 0.
    >>> var_out, m_out, v_out = torch.npu_bert_apply_adam(lr, beta1, beta2, epsilon, grad, max_grad_norm, global_grad_norm, weight_decay, out=(var_in, m_in, v_in))
    >>> var_out
    tensor([ 14.7733, -30.1218,  -1.3647,  ..., -16.6840,   7.1518,   8.4872],
          device='npu:0')

npu_giou(self, gtboxes, trans=False, is_cross=False, mode=0) -> Tensor

First computes the minimum enclosing area of the two boxes and their IoU, then the proportion of the enclosing area that is not covered by either box, and finally subtracts this proportion from the IoU to obtain the GIoU.

  • Parameters:

    • self (Tensor) - Bounding boxes, a 2D Tensor of type float16 or float32 with shape (N, 4). "N" indicates the number of bounding boxes, and the value "4" refers to [x1, y1, x2, y2] or [x, y, w, h].
    • gtboxes (Tensor) - Ground-truth boxes, a 2D Tensor of type float16 or float32 with shape (M, 4). "M" indicates the number of ground truth boxes, and the value "4" refers to [x1, y1, x2, y2] or [x, y, w, h].
    • trans (bool) - An optional bool, true for 'xywh', false for 'xyxy'.
    • is_cross (bool) - An optional bool, controls whether the output shape is [M, N] or [1, N].
    • mode (Number) - Computation mode, a character string with the value range of [iou, iof].
  • constraints:

    None

  • Examples:

    >>> a=np.random.uniform(0,1,(4,10)).astype(np.float16)
    >>> b=np.random.uniform(0,1,(4,10)).astype(np.float16)
    >>> box1=torch.from_numpy(a).to("npu")
    >>> box2=torch.from_numpy(a).to("npu")
    >>> output = torch.npu_giou(box1, box2, trans=True, is_cross=False, mode=0)
    >>> output
    tensor([[1.],
            [1.],
            [1.],
            [1.],
            [1.],
            [1.],
            [1.],
            [1.],
            [1.],
            [1.]], device='npu:0', dtype=torch.float16)

npu_silu(self) -> Tensor

Computes the SiLU (Swish) of "x".

  • Parameters:

    • self (Tensor) - A Tensor. Must be one of the following types: float16, float32
  • constraints:

    None

  • Examples:

    >>> a=torch.rand(2,8).npu()
    >>> output = torch.npu_silu(a)
    >>> output
    tensor([[0.4397, 0.7178, 0.5190, 0.2654, 0.2230, 0.2674, 0.6051, 0.3522],
            [0.4679, 0.1764, 0.6650, 0.3175, 0.0530, 0.4787, 0.5621, 0.4026]],
           device='npu:0')
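
    SiLU (Swish) is x * sigmoid(x); a CPU reference sketch for comparison:

    >>> x_cpu = torch.rand(2, 8)
    >>> y_ref = x_cpu * torch.sigmoid(x_cpu)
    >>> y_ref.shape
    torch.Size([2, 8])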

npu_reshape(self, shape, bool can_refresh=False) -> Tensor

Reshapes a tensor. Only the tensor shape is changed, without changing the data.

  • Parameters:

    • self (Tensor) - A Tensor.
    • shape (ListInt) - Defines the shape of the output tensor.
    • can_refresh (bool) - Used to specify whether reshape can be refreshed in place.
  • constraints:

    This operator cannot be directly called by the aclopExecute API.

  • Examples:

    >>> a=torch.rand(2,8).npu()
    >>> out=torch.npu_reshape(a,(4,4))
    >>> out
    tensor([[0.6657, 0.9857, 0.7614, 0.4368],
            [0.3761, 0.4397, 0.8609, 0.5544],
            [0.7002, 0.3063, 0.9279, 0.5085],
            [0.1009, 0.7133, 0.8118, 0.6193]], device='npu:0')

npu_rotated_overlaps(self, query_boxes, trans=False) -> Tensor

Calculates the overlapping area of rotated boxes.

  • Parameters:

    • self (Tensor) - Bounding boxes, a 3D Tensor of type float32 with shape (B, 5, N).
    • query_boxes (Tensor) - Bounding boxes, a 3D Tensor of type float32 with shape (B, 5, K).
    • trans (bool) - An optional attr, true for 'xyxyt', false for 'xywht'.
  • constraints:

    None

  • Examples:

    >>> a=np.random.uniform(0,1,(1,3,5)).astype(np.float16)
    >>> b=np.random.uniform(0,1,(1,2,5)).astype(np.float16)
    >>> box1=torch.from_numpy(a).to("npu")
    >>> box2=torch.from_numpy(a).to("npu")
    >>> output = torch.npu_rotated_overlaps(box1, box2, trans=False)
    >>> output
    tensor([[[0.0000, 0.1562, 0.0000],
            [0.1562, 0.3713, 0.0611],
            [0.0000, 0.0611, 0.0000]]], device='npu:0', dtype=torch.float16)

npu_rotated_iou(self, query_boxes, trans=False, mode=0, is_cross=True) -> Tensor

Calculates the IoU of rotated boxes.

  • Parameters:

    • self (Tensor) - Bounding boxes, a 3D Tensor of type float32 with shape (B, 5, N).
    • query_boxes (Tensor) - Bounding boxes, a 3D Tensor of type float32 with shape (B, 5, K).
    • trans (bool) - An optional attr, true for 'xyxyt', false for 'xywht'.
    • is_cross (bool) - Cross calculation when it is True, and one-to-one calculation when it is False.
    • mode (Number) - Computation mode, a character string with the value range of [iou, iof, giou].
  • constraints:

    None

  • Examples:

    >>> a=np.random.uniform(0,1,(2,2,5)).astype(np.float16)
    >>> b=np.random.uniform(0,1,(2,3,5)).astype(np.float16)
    >>> box1=torch.from_numpy(a).to("npu")
    >>> box2=torch.from_numpy(a).to("npu")
    >>> output = torch.npu_rotated_iou(box1, box2, trans=False, mode=0, is_cross=True)
    >>> output
    tensor([[[3.3325e-01, 1.0162e-01],
            [1.0162e-01, 1.0000e+00]],
    
            [[0.0000e+00, 0.0000e+00],
            [0.0000e+00, 5.9605e-08]]], device='npu:0', dtype=torch.float16)

npu_rotated_box_encode(anchor_box, gt_bboxes, weight) -> Tensor

Rotate Bounding Box Encoding.

  • Parameters:

    • anchor_box (Tensor) - A 3D Tensor with shape (B, 5, N). The input anchor boxes. "B" indicates the batch size, "N" indicates the number of bounding boxes, and the value "5" refers to "x0", "x1", "y0", "y1", and "angle".
    • gt_bboxes (Tensor) - A 3D Tensor of float32 (float16) with shape (B, 5, N).
    • weight (Tensor) - A float list for "x0", "x1", "y0", "y1" and "angle", defaults to [1.0, 1.0, 1.0, 1.0, 1.0].
  • constraints:

    None

  • Examples:

    >>> anchor_boxes = torch.tensor([[[30.69], [32.6], [45.94], [59.88], [-44.53]]], dtype=torch.float16).to("npu")
    >>> gt_bboxes = torch.tensor([[[30.44], [18.72], [33.22], [45.56], [8.5]]], dtype=torch.float16).to("npu")
    >>> weight = torch.tensor([1., 1., 1., 1., 1.], dtype=torch.float16).npu()
    >>> out = torch.npu_rotated_box_encode(anchor_boxes, gt_bboxes, weight)
    >>> out
    tensor([[[-0.4253],
            [-0.5166],
            [-1.7021],
            [-0.0162],
            [ 1.1328]]], device='npu:0', dtype=torch.float16)

npu_rotated_box_decode(anchor_boxes, deltas, weight) -> Tensor

Rotate Bounding Box Decoding.

  • Parameters:

    • anchor_box (Tensor) - A 3D Tensor with shape (B, 5, N). The input anchor boxes. "B" indicates the batch size, "N" indicates the number of bounding boxes, and the value "5" refers to "x0", "x1", "y0", "y1", and "angle".
    • deltas (Tensor) - A 3D Tensor of float32 (float16) with shape (B, 5, N).
    • weight (Tensor) - A float list for "x0", "x1", "y0", "y1", and "angle", defaults to [1.0, 1.0, 1.0, 1.0, 1.0].
  • constraints:

    None

  • Examples:

    >>> anchor_boxes = torch.tensor([[[4.137],[33.72],[29.4], [54.06], [41.28]]], dtype=torch.float16).to("npu")
    >>> deltas = torch.tensor([[[0.0244], [-1.992], [0.2109], [0.315], [-37.25]]], dtype=torch.float16).to("npu")
    >>> weight = torch.tensor([1., 1., 1., 1., 1.], dtype=torch.float16).npu()
    >>> out = torch.npu_rotated_box_decode(anchor_boxes, deltas, weight)
    >>> out
    tensor([[[  1.7861],
            [-10.5781],
            [ 33.0000],
            [ 17.2969],
            [-88.4375]]], device='npu:0', dtype=torch.float16)