From 9269bfa241e0b8b06ea6d48d2e3fc83f2d770592 Mon Sep 17 00:00:00 2001 From: 熊攀 Date: Tue, 14 Oct 2025 09:42:38 +0800 Subject: [PATCH] Update operator support list MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../source_en/reference/operator_list_lite.md | 400 +++++++++--------- .../reference/operator_list_lite.md | 400 +++++++++--------- 2 files changed, 400 insertions(+), 400 deletions(-) diff --git a/docs/lite/docs/source_en/reference/operator_list_lite.md b/docs/lite/docs/source_en/reference/operator_list_lite.md index fb35b21449..d0a0a1af07 100644 --- a/docs/lite/docs/source_en/reference/operator_list_lite.md +++ b/docs/lite/docs/source_en/reference/operator_list_lite.md @@ -2,203 +2,203 @@ [![View Source On Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source_en.svg)](https://gitee.com/mindspore/docs/blob/master/docs/lite/docs/source_en/reference/operator_list_lite.md) -| Operator Names | Operator Functions | CPU | Kirin NPU | GPU (Mali/Adreno) | -| ----------------------------------- | ------------------------------------------------------------ | --------------------------------------------------- | --------- | ----------------------- | -| Abs | Element-wise calculate the absolute value | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| AbsGrad | Compute the gradient of the absolute value function | FP32 | - | - | -| Activation | Activation functions | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| ActivationGrad | Calculate the gradient of a specific activation function | FP16
FP32 | - | - | -| Adam | Executing a single parameter update step of the Adam optimizer | FP32 | - | - | -| AddFusion | Element-wise addition computation | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int8 | -| AdderFusion | Addition-based convolution operation | FP32 | - | - | -| AddGrad | Compute the gradient of the addition operation | FP32 | - | - | -| AddN | Perform element-wise addition on N input tensors of identical shape and data type. | FP16
FP32 | - | - | -| Affine | Perform an affine transformation on the input tensor. | FP32 | - | - | -| All | Determine whether all elements in the tensor are True (non-zero) along the specified dimension. | FP32 | - | - | -| AllGather | Distributed collection communication operations | FP32 | - | - | -| ApplyMomentum | Execute a single parameter update step of stochastic gradient descent for momentum. | FP32 | - | - | -| Assert | Assertion | FP16
FP32
Bool | - | - | -| Assign | Assign a value to a variable | FP32 | - | - | -| ArgmaxFusion | Find the maximum value in a given dimension | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| ArgminFusion | Find the minimum value in a given dimension | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| AvgPoolFusion | Average pooling | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| AvgPoolGrad | Compute the gradients for the average pooling layer | FP16
FP32 | - | - | -| BatchNorm | Batch normalization | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| BatchNormGrad | Compute the gradient of the batch normalization layer | FP16
FP32 | - | - | -| BatchToSpace | Inverse operation of space-to-batch transformation | FP32
Int8
UInt8 | - | FP16
FP32 | -| BatchToSpaceND | ND universal version of BatchToSpace | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| BiasAdd | Add the bias vector to the input tensor | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| BiasAddGrad | The gradient of the BiasAdd operation | FP16
FP32 | - | - | -| BinaryCrossEntropy | Calculate the binary cross-entropy loss | FP32 | - | - | -| BinaryCrossEntropyGrad | Calculate the gradient of the binary cross-entropy loss function | FP32 | - | - | -| BroadcastTo | Expansion of dimensions | FP16
FP32
Int32
Bool | - | - | -| Call | Call a subgraph or function | FP16
FP32
Int32
Bool | - | - | -| Cast | Data type conversion | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32 | -| Ceil | Round up to the nearest integer | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Clip | Restrict element ranges | FP32
Int32 | - | - | -| Concat | Concatenated Tensor | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | -| ConstantOfShape | Generate a tensor with the same shape as the input and fill it with the specified constant. | FP16
FP32
Int32 | - | - | -| Conv2DFusion | 2D convolution | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Conv2DBackpropFilterFusion | Compute the gradient of the convolution kernel with respect to the ordinary convolution operation. | FP16
FP32 | - | - | -| Conv2DBackpropInputFusion | Compute the gradient of the input data with respect to the standard convolution operation. | FP16
FP32 | - | - | -| Conv2dTransposeFusion | Perform transposed convolution operations | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Cos | Element-wise cosine calculation | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Crop | Crop a specified region from an input image or feature map. | FP16
FP32
Int32
Int8
UInt8 | - | - | -| CropAndResize | Crop regions from the input image based on a set of bounding boxes, then resize each region to a uniform size. | FP32 | FP16 | - | -| CumSum | Cumulative sum of elements | FP32
Int32 | - | - | -| CustomExtractFeatures | Extract operators based on custom feature | FP32 | - | - | -| CustomNormalize | Custom normalized operator | FP32 | - | - | -| CustomPredict | Custom prediction operator | FP32
Int32 | - | - | -| DEConv2DGradFilter | Compute the gradient of the transposed convolution with respect to the convolution kernel. | FP32 | - | | -| DepthToSpace | Rearrange deep data into spatial dimensions | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| DetectionPostProcess | Post-processing of object detection | FP32
Int8
UInt8 | - | - | -| DivFusion | Element-wise division | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| DivGrad | Compute the gradient of the division operation | FP32 | - | - | -| Dropout | Randomly set some elements of the input tensor to zero. | FP16
FP32 | - | - | -| DropoutGrad | Compute the gradient of the Dropout operation | FP16
FP32 | - | - | -| DynamicQuant | Dynamically quantize floating-point tensors to uint8 type | FP32 | - | - | -| Eltwise | Element-level operations | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Elu | Activation function, applying exponential correction to negative inputs | FP16
FP32 | - | - | -| Equal | Determine whether inputs are equal | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| EmbeddingLookupFusion | Optimized word embedding lookup, mapping integer indices to dense vectors | FP32 | - | - | -| Erf | Error functions | FP16
FP32 | - | - | -| ExpFusion | Element-wise exponentiation | FP16
FP32 | - | FP16
FP32 | -| ExpandDims | Insert a dimension of length 1 at the specified position | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | -| Fill | Generate a tensor filled with the specified constant. | FP16
FP32
Int32
Bool | - | FP16
FP32 | -| Flatten | Data is expanded by dimension | FP16
FP32
Int32 | - | - | -| FlattenGrad | Compute the gradient of the Flatten operation | FP16
FP32 | - | - | -| Floor | Round down to the nearest integer | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| FloorDiv | Element-wise division down to the nearest integer | FP16
FP32
Int32 | FP16 | FP16
FP32 | -| FloorMod | Element-wise modulo operation: the sign of the result matches that of the divisor. | FP16
FP32
Int32 | FP16 | FP16
FP32 | -| FullConnection | Fully-connected layer | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| FusedBatchNorm | Standardize the input | FP16
FP32
Int8
UInt8 | FP16 | - | -| GatherNd | Collect elements from the input tensor at specified positions based on the index tensor. | FP16
FP32
Int32
Int8
UInt8
Bool | - | FP16
FP32 | -| Gather | Collect elements at specified index positions along a single dimension | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | -| GatherD | Collect elements from the input tensor based on the index tensor. | FP16
FP32
Int32
Bool | - | - | -| GLU | Gated linear unit activation function splits the input into two parts and performs element-wise multiplication. | FP32 | - | - | -| Greater | Perform element-wise comparison between two tensors, returning a logical result (True/False) indicating whether A > B. | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| GreaterEqual | Perform element-wise comparison between two tensors, returning a logical result (True/False) indicating whether A ≥ B. | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| GroupNormFusion | Group normalization for fusion optimization | FP32 | - | - | -| GRU | Gated recurrent unit, simplified LSTM | FP16
FP32 | - | - | -| HashtableLookup | Hash table lookup | FP32
Int32 | - | - | -| InstanceNorm | Instance normalization | FP16
FP32 | FP16 | - | -| InvertPermutation | Inverted replacement index | FP16
FP32
Int32 | - | - | -| IsFinite | Check whether each element in the tensor is finite (not inf/NaN) | FP32 | - | - | -| L2NormalizeFusion | L2 normalization for fusion optimization | FP32
Int8
UInt8 | - | - | -| LayerNormFusion | Layer normalization for fusion optimization | FP16
FP32
Int8 | - | FP16
FP32 | -| LayerNormGrad | Compute layer normalization gradients | FP16
FP32 | - | - | -| LeakyReLU | Leaky ReLU activation function, which assigns a small slope to negative inputs. | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Less | Perform element-wise comparison between two tensors, returning a logical result indicating whether A < B. | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| LessEqual | Perform element-wise comparison: A ≤ B, returns a Boolean tensor | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| LRN | Local response normalization | FP32 | - | - | -| Log | Element-wise calculate the logarithm | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Log1p | Calculate log(1+X) | FP32 | - | - | -| LogGrad | Calculate the gradient of the logarithmic function | FP16
FP32 | - | - | -| LogicalAnd | Element-wise logical AND operation | FP16
FP32
Int32
Bool | FP16 | FP16
FP32 | -| LogicalNot | Element-level logical NOT operation | FP16
FP32
Int8
UInt8
Bool | FP16 | FP16
FP32 | -| LogicalOr | Element-wise logical OR operation | FP16
FP32
Bool | FP16 | FP16
FP32 | -| LogSoftmax | Perform a softmax operation on the input vector, then take the logarithm of the softmax result. | FP16
FP32 | - | - | -| LshProjection | Locality-sensitive hash projection | FP32 | - | - | -| LSTM | Long-term and short-term memory network unit | FP16
FP32 | - | - | -| LSTMGrad | Calculate the backward propagation gradient of the LSTM for the hidden state | FP32 | - | - | -| LSTMGradData | Compute the backpropagation gradient of the LSTM for the input data | FP32 | - | - | -| LSTMGradWeight | Calculate the backward propagation gradient of weights for the LSTM | FP32 | - | - | -| MatMulFusion | Perform matrix multiplication on two inputs | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Maximum | Find the maximum value at the element level | FP16
FP32
Int32 | FP16 | FP16
FP32 | -| MaximumGrad | Calculate the gradient of the maximum value function | FP16
FP32 | - | - | -| MaxPoolFusion | Maximum pooling | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| MaxPoolGrad | Compute the gradients for the max-pooling layer | FP16
FP32 | - | - | -| Merge | Create a new tensor with the exact same shape as the input tensor X, but with all element values set to 1. | FP16
FP32 | - | - | -| Minimum | Find the minimum value at the element level | FP16
FP32
Int32 | FP16 | FP16
FP32 | -| MinimumGrad | Compute the gradient of the minimum value function | FP16
FP32 | - | - | -| Mod | Return the remainder of the division operation | FP32
Int32 | - | - | -| MulFusion | Element-wise multiplication | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| MulGrad | Compute the gradient of the multiplication operation | FP32 | - | - | -| Neg | Element-wise find negative numbers | FP16
FP32
Int32 | FP16 | FP16
FP32 | -| NegGrad | Compute the gradient of the negation operation | FP16
FP32 | - | - | -| NLLLoss | Compute the negative log-likelihood loss | FP32 | - | - | -| NLLLossGrad | Compute the gradient of NLLLoss | FP32 | - | - | -| NotEqual | Performs element-wise comparison between two tensors and returns the logical result indicating whether A != B. | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| NonMaxSuppression | Non-maximum suppression | FP32 | - | - | -| NonZero | Return the indices of all non-zero elements in the input tensor. | Bool | - | - | -| OneHot | Convert integer index tensors to one-hot encoding representations | FP16
FP32
Int32 | - | FP16
FP32
Int32 | -| OnesLike | Create a new tensor with the exact same shape as the input tensor X, but with all element values set to 1. | FP16
FP32
Int32 | - | - | -| PadFusion | Add specified padding to the input tensor, to achieve the desired size. | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| PartialFusion | Partial fusion | FP16
FP32
Int32
Bool | - | - | -| PowFusion | Element-wise exponentiation | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| PowerGrad | Compute the gradient of the power operation | FP32 | - | - | -| PriorBox | Generate prior boxes | FP32
Int8
UInt8 | - | - | -| PReLUFusion | PRelu activation function | FP16
FP32 | - | FP16
FP32 | -| QuantDTypeCast | Perform quantitative data type conversion | FP16
FP32
Int8
UInt8 | - | - | -| RaggedRange | Generate sequences with non-uniform intervals | FP16
FP32
Int32 | - | - | -| RandomNormal | Generate a tensor whose values are randomly sampled from a normal distribution | FP16
FP32 | - | - | -| RandomStandardNormal | Generate a random tensor following a standard normal distribution | FP16
FP32 | - | - | -| Range | Generate elements within a specified range | FP16
FP32
Int32 | - | - | -| Rank | Return the number of dimensions in the input tensor | FP16
FP32 | - | - | -| RealDiv | Element-wise division | FP16
FP32 | - | - | -| Reciprocal | Return reciprocals | FP16
FP32
Int8 | FP16 | - | -| ReduceFusion | Reduction operation | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32 | -| ReduceScatter | Distributed operations: Input tensors are segmented and distributed across devices, with each device retaining only one segment of the results. | FP32 | - | - | -| Reshape | Changing the shape of a tensor while keeping the total number of elements unchanged | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | -| Resize | Upsample or resize the input tensor | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| ResizeGrad | Compute the gradient for Resize | FP16
FP32 | - | - | -| ReverseV2 | Reverse the tensor along the specified axis | FP32
Int32 | - | - | -| ReverseSequence | Partially reverse the variable-length sequence of the input tensor. | FP32 | - | - | -| ROIPooling | Regional interest pooling | FP32 | - | - | -| Round | Round to the nearest whole number | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Rsqrt | Element-wise compute square roots and reciprocals for normalization. | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| RsqrtGrad | Calculate the gradient of the reciprocal of the square root | FP32 | - | - | -| Select | Select elements from two tensors based on conditions | FP32
Bool | - | - | -| Selu | Self-normalizing index linear unit activation function | - | - | - | -| ScaleFusion | Fuse scaling operations with adjacent operators | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| ScatterNd | Scatter values from the input tensor to specified positions in the output tensor based on the index. | FP16
FP32
Int32 | - | - | -| ScatterNdUpdate | Update the value of the input data using the given value and the input index. | FP16
FP32
Int32 | - | - | -| SGD | Stochastic gradient descent optimizer | FP32 | - | - | -| Shape | Obtain the tensor shape | FP16
FP32
Int32
Int8
UInt8
Bool | - | FP16
FP32 | -| SigmoidCrossEntropyWithLogits | Combine Sigmoid activation and cross-entropy loss | FP32 | - | - | -| SigmoidCrossEntropyWithLogitsGrad | Compute the gradient of the cross-entropy loss with sigmoid | FP32 | - | - | -| Sin | Element-wise calculation of sine | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Size | Obtain tensor dimension size | FP16
FP32
Int32 | - | - | -| SliceFusion | Tensor slicing operation | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| SkipGram | The core operation of the Skip-gram model, used for training word vectors | FP32 | - | - | -| SmoothL1Loss | Smooth L1 Loss | FP32 | - | - | -| SmoothL1LossGrad | Compute the gradient of the L1 loss | FP32 | - | - | -| Softmax | Normalization operation | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| SoftmaxGrad | Calculate the gradient of Softmax | FP32 | - | - | -| Softplus | Smooth ReLU variants | FP16
FP32 | - | - | -| SpaceToBatch | Move the values of the height and width dimensions to the depth dimension. | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| SpaceToBatchND | Split spatial-dimensional data blocks into batch dimensions | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| SpaceToDepth | Reorganize spatial data into depth channels | FP16
FP32 | - | FP16
FP32 | -| SparseToDense | Convert sparse representations to dense tensors | FP16
FP32
Int32 | - | FP16
FP32
Int32 | -| SparseSoftmaxCrossEntropyWithLogits | Softmax cross-entropy for sparse labels | FP32 | - | - | -| Splice | Connect multiple slices or ranges of the input tensor along the specified axis. | FP16
FP32 | - | - | -| Split | Split the input tensor into multiple smaller output tensors along the specified axis. | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| SplitWithOverlap | Overlapped split tensor | FP16
FP32 | - | - | -| Sqrt | Element-wise take the square root | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| SqrtGrad | Calculate the gradient of the square root | FP32 | - | - | -| Square | Element-wise square | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| SquaredDifference | Element-wise compute (A-B)² | FP16
FP32 | - | FP16
FP32 | -| Squeeze | Remove dimension of size 1 | FP16
FP32
Int32
Int8
UInt8
Bool | - | FP16
FP32
Int32 | -| StridedSlice | Tensor slicing | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| StridedSliceGrad | Compute the gradient of the slice operation | FP16
FP32 | - | - | -| Stack | Stack multiple tensors along the new axis | FP16
FP32
Int32 | - | FP16
FP32 | -| SubFusion | Element-wise subtraction | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| SubGrad | Calculate the gradient of subtraction | FP32 | - | - | -| Switch | Select output branches based on Boolean conditions | FP16
FP32
Int32
Bool | - | - | -| SwitchLayer | Select different subnetwork branches for execution within the model | FP16
FP32
Int32
Bool | - | - | -| TensorListFromTensor | Convert a regular tensor into a list of tensors, splitting along the specified axis. | FP16
FP32
Int32 | - | - | -| TensorListGetItem | Retrieve the tensor at the specified index position from the tensor list | FP16
FP32
Int32 | - | - | -| TensorListReserve | Preallocate an empty array list, specifying the element data type and initial capacity. | FP16
FP32
Int32 | - | - | -| TensorListSetItem | Insert a tensor into a specified position in a list of tensors | FP16
FP32
Int32 | - | - | -| TensorListStack | Stack the list of tensors into a single regular tensor | FP16
FP32
Int32 | - | - | -| TensorScatterAdd | Add the updated tensor values to the specified positions in the target tensor using the index. | FP32
Int32 | - | - | -| TileFusion | Flatten the given matrix | FP16
FP32
Int32
Bool | FP16 | - | -| TopKFusion | Return the top K elements from the input tensor. | FP16
FP32
Int32
Int8
UInt8 | - | - | -| Transpose | Tensor transpose | FP16
FP32
Int32
Int8
Bool | FP16 | FP16
FP32 | -| UniformReal | Generate a random tensor following a uniform distribution | FP32
Int32 | - | - | -| Unique | Returns the unique values in the input tensor, along with their indices and count. | FP16
FP32
Int32 | - | - | -| UnsortedSegmentSum | Perform segmented summation on the tensor without requiring ordered segmented indices. | FP16
FP32
Int32 | - | - | -| Unsqueeze | Add a new dimension to the input tensor | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | -| Unstack | Split a tensor into multiple sub-tensors along a specified axis | FP16
FP32
Int32 | - | - | -| Where | Element selection | FP16
FP32
Int32
Bool | - | - | -| ZerosLike | Generate a new tensor with the same shape as the input tensor but with all elements set to zero. | FP16
FP32
Int32 | - | - | +| Operator Names | Operator Functions | CPU | Kirin NPU | GPU (Mali/Adreno) | Ascend | +| ----------------------------------- | ------------------------------------------------------------ | --------------------------------------------------- | --------- | ----------------------- | ----------------------- | +| Abs | Element-wise calculate the absolute value | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| AbsGrad | Compute the gradient of the absolute value function | FP32 | - | - | | +| Activation | Activation functions | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| ActivationGrad | Calculate the gradient of a specific activation function | FP16
FP32 | - | - | | +| Adam | Executing a single parameter update step of the Adam optimizer | FP32 | - | - | | +| AddFusion | Element-wise addition computation | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int8 | FP16 | +| AdderFusion | Addition-based convolution operation | FP32 | - | - | | +| AddGrad | Compute the gradient of the addition operation | FP32 | - | - | | +| AddN | Perform element-wise addition on N input tensors of identical shape and data type. | FP16
FP32 | - | - | | +| Affine | Perform an affine transformation on the input tensor. | FP32 | - | - | FP16 | +| All | Determine whether all elements in the tensor are True (non-zero) along the specified dimension. | FP32 | - | - | | +| AllGather | Distributed collective communication operations | FP32 | - | - | | +| ApplyMomentum | Execute a single parameter update step of stochastic gradient descent with momentum. | FP32 | - | - | FP16 | +| Assert | Assertion | FP16
FP32
Bool | - | - | | +| Assign | Assign a value to a variable | FP32 | - | - | FP16 | +| ArgmaxFusion | Find the maximum value in a given dimension | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| ArgminFusion | Find the minimum value in a given dimension | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 | +| AvgPoolFusion | Average pooling | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| AvgPoolGrad | Compute the gradients for the average pooling layer | FP16
FP32 | - | - | | +| BatchNorm | Batch normalization | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 | +| BatchNormGrad | Compute the gradient of the batch normalization layer | FP16
FP32 | - | - | | +| BatchToSpace | Inverse operation of space-to-batch transformation | FP32
Int8
UInt8 | - | FP16
FP32 | | +| BatchToSpaceND | ND universal version of BatchToSpace | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | | +| BiasAdd | Add the bias vector to the input tensor | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 | +| BiasAddGrad | The gradient of the BiasAdd operation | FP16
FP32 | - | - | | +| BinaryCrossEntropy | Calculate the binary cross-entropy loss | FP32 | - | - | FP16 | +| BinaryCrossEntropyGrad | Calculate the gradient of the binary cross-entropy loss function | FP32 | - | - | | +| BroadcastTo | Expansion of dimensions | FP16
FP32
Int32
Bool | - | - | | +| Call | Call a subgraph or function | FP16
FP32
Int32
Bool | - | - | FP16 | +| Cast | Data type conversion | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32 | FP16 | +| Ceil | Round up to the nearest integer | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Clip | Restrict element ranges | FP32
Int32 | - | - | FP16 | +| Concat | Concatenate tensors | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | FP16 | +| ConstantOfShape | Generate a tensor with the same shape as the input and fill it with the specified constant. | FP16
FP32
Int32 | - | - | | +| Conv2DFusion | 2D convolution | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Conv2DBackpropFilterFusion | Compute the gradient of the ordinary convolution operation with respect to the convolution kernel. | FP16
FP32 | - | - | | +| Conv2DBackpropInputFusion | Compute the gradient of the standard convolution operation with respect to the input data. | FP16
FP32 | - | - | | +| Conv2dTransposeFusion | Perform transposed convolution operations | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Cos | Element-wise cosine calculation | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Crop | Crop a specified region from an input image or feature map. | FP16
FP32
Int32
Int8
UInt8 | - | - | | +| CropAndResize | Crop regions from the input image based on a set of bounding boxes, then resize each region to a uniform size. | FP32 | FP16 | - | | +| CumSum | Cumulative sum of elements | FP32
Int32 | - | - | FP16 | +| CustomExtractFeatures | Custom feature extraction operator | FP32 | - | - | | +| CustomNormalize | Custom normalization operator | FP32 | - | - | | +| CustomPredict | Custom prediction operator | FP32
Int32 | - | - | | +| DEConv2DGradFilter | Compute the gradient of the transposed convolution with respect to the convolution kernel. | FP32 | - | - | | +| DepthToSpace | Rearrange depth data into spatial dimensions | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | | +| DetectionPostProcess | Post-processing of object detection | FP32
Int8
UInt8 | - | - | | +| DivFusion | Element-wise division | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| DivGrad | Compute the gradient of the division operation | FP32 | - | - | | +| Dropout | Randomly set some elements of the input tensor to zero. | FP16
FP32 | - | - | FP16 | +| DropoutGrad | Compute the gradient of the Dropout operation | FP16
FP32 | - | - | | +| DynamicQuant | Dynamically quantize floating-point tensors to uint8 type | FP32 | - | - | | +| Eltwise | Element-level operations | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Elu | Activation function, applying exponential correction to negative inputs | FP16
FP32 | - | - | FP16 | +| Equal | Determine whether inputs are equal | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| EmbeddingLookupFusion | Optimized word embedding lookup, mapping integer indices to dense vectors | FP32 | - | - | | +| Erf | Error functions | FP16
FP32 | - | - | FP16 | +| ExpFusion | Element-wise exponentiation | FP16
FP32 | - | FP16
FP32 | FP16 | +| ExpandDims | Insert a dimension of length 1 at the specified position | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | FP16 | +| Fill | Generate a tensor filled with the specified constant. | FP16
FP32
Int32
Bool | - | FP16
FP32 | FP16 | +| Flatten | Flatten the input tensor | FP16
FP32
Int32 | - | - | FP16 | +| FlattenGrad | Compute the gradient of the Flatten operation | FP16
FP32 | - | - | | +| Floor | Round down to the nearest integer | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| FloorDiv | Element-wise division down to the nearest integer | FP16
FP32
Int32 | FP16 | FP16
FP32 | | +| FloorMod | Element-wise modulo operation: the sign of the result matches that of the divisor. | FP16
FP32
Int32 | FP16 | FP16
FP32 | | +| FullConnection | Fully-connected layer | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| FusedBatchNorm | Standardize the input | FP16
FP32
Int8
UInt8 | FP16 | - | FP16 | +| GatherNd | Collect elements from the input tensor at specified positions based on the index tensor. | FP16
FP32
Int32
Int8
UInt8
Bool | - | FP16
FP32 | FP16 | +| Gather | Collect elements at specified index positions along a single dimension | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | FP16 | +| GatherD | Collect elements from the input tensor based on the index tensor. | FP16
FP32
Int32
Bool | - | - | FP16 | +| GLU | Gated linear unit activation function splits the input into two parts and performs element-wise multiplication. | FP32 | - | - | | +| Greater | Perform element-wise comparison between two tensors, returning a logical result (True/False) indicating whether A > B. | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| GreaterEqual | Perform element-wise comparison between two tensors, returning a logical result (True/False) indicating whether A ≥ B. | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| GroupNormFusion | Group normalization for fusion optimization | FP32 | - | - | | +| GRU | Gated recurrent unit, simplified LSTM | FP16
FP32 | - | - | | +| HashtableLookup | Hash table lookup | FP32
Int32 | - | - | | +| InstanceNorm | Instance normalization | FP16
FP32 | FP16 | - | FP16 | +| InvertPermutation | Compute the inverse of a permutation index | FP16
FP32
Int32 | - | - | | +| IsFinite | Check whether each element in the tensor is finite (not inf/NaN) | FP32 | - | - | FP16 | +| L2NormalizeFusion | L2 normalization for fusion optimization | FP32
Int8
UInt8 | - | - | | +| LayerNormFusion | Layer normalization for fusion optimization | FP16
FP32
Int8 | - | FP16
FP32 | FP16 | +| LayerNormGrad | Compute layer normalization gradients | FP16
FP32 | - | - | | +| LeakyReLU | Leaky ReLU activation function, which assigns a small slope to negative inputs. | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Less | Perform element-wise comparison between two tensors, returning a logical result indicating whether A < B. | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| LessEqual | Perform element-wise comparison: A ≤ B, returns a Boolean tensor | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| LRN | Local response normalization | FP32 | - | - | FP16 | +| Log | Element-wise calculate the logarithm | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Log1p | Calculate log(1+X) | FP32 | - | - | FP16 | +| LogGrad | Calculate the gradient of the logarithmic function | FP16
FP32 | - | - | | +| LogicalAnd | Element-wise logical AND operation | FP16
FP32
Int32
Bool | FP16 | FP16
FP32 | | +| LogicalNot | Element-level logical NOT operation | FP16
FP32
Int8
UInt8
Bool | FP16 | FP16
FP32 | | +| LogicalOr | Element-wise logical OR operation | FP16
FP32
Bool | FP16 | FP16
FP32 | | +| LogSoftmax | Perform a softmax operation on the input vector, then take the logarithm of the softmax result. | FP16
FP32 | - | - | FP16 | +| LshProjection | Locality-sensitive hash projection | FP32 | - | - | | +| LSTM | Long short-term memory (LSTM) network unit | FP16
FP32 | - | - | | +| LSTMGrad | Calculate the backward propagation gradient of the LSTM for the hidden state | FP32 | - | - | | +| LSTMGradData | Compute the backpropagation gradient of the LSTM for the input data | FP32 | - | - | | +| LSTMGradWeight | Calculate the backward propagation gradient of weights for the LSTM | FP32 | - | - | | +| MatMulFusion | Perform matrix multiplication on two inputs | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Maximum | Find the maximum value at the element level | FP16
FP32
Int32 | FP16 | FP16
FP32 | FP16 | +| MaximumGrad | Calculate the gradient of the maximum value function | FP16
FP32 | - | - | | +| MaxPoolFusion | Maximum pooling | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| MaxPoolGrad | Compute the gradients for the max-pooling layer | FP16
FP32 | - | - | | +| Merge | Control-flow operator that forwards the value of an available input | FP16
FP32 | - | - | | +| Minimum | Find the minimum value at the element level | FP16
FP32
Int32 | FP16 | FP16
FP32 | FP16 | +| MinimumGrad | Compute the gradient of the minimum value function | FP16
FP32 | - | - | | +| Mod | Return the remainder of the division operation | FP32
Int32 | - | - | FP16 | +| MulFusion | Element-wise multiplication | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| MulGrad | Compute the gradient of the multiplication operation | FP32 | - | - | | +| Neg | Element-wise negation | FP16
FP32
Int32 | FP16 | FP16
FP32 | FP16 | +| NegGrad | Compute the gradient of the negation operation | FP16
FP32 | - | - | | +| NLLLoss | Compute the negative log-likelihood loss | FP32 | - | - | FP16 | +| NLLLossGrad | Compute the gradient of NLLLoss | FP32 | - | - | | +| NotEqual | Performs element-wise comparison between two tensors and returns the logical result indicating whether A != B. | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | | +| NonMaxSuppression | Non-maximum suppression | FP32 | - | - | FP16 | +| NonZero | Return the indices of all non-zero elements in the input tensor. | Bool | - | - | FP16 | +| OneHot | Convert integer index tensors to one-hot encoding representations | FP16
FP32
Int32 | - | FP16
FP32
Int32 | | +| OnesLike | Create a new tensor with the exact same shape as the input tensor X, but with all element values set to 1. | FP16
FP32
Int32 | - | - | FP16 | +| PadFusion | Add specified padding to the input tensor, to achieve the desired size. | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| PartialFusion | Partial fusion | FP16
FP32
Int32
Bool | - | - | | +| PowFusion | Element-wise power operation | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 | +| PowerGrad | Compute the gradient of the power operation | FP32 | - | - | | +| PriorBox | Generate prior boxes | FP32
Int8
UInt8 | - | - | FP16 | +| PReLUFusion | PRelu activation function | FP16
FP32 | - | FP16
FP32 | FP16 | +| QuantDTypeCast | Perform quantized data type conversion | FP16
FP32
Int8
UInt8 | - | - | | +| RaggedRange | Generate sequences with non-uniform intervals | FP16
FP32
Int32 | - | - | | +| RandomNormal | Generate a tensor whose values are randomly sampled from a normal distribution | FP16
FP32 | - | - | | +| RandomStandardNormal | Generate a random tensor following a standard normal distribution | FP16
FP32 | - | - | | +| Range | Generate elements within a specified range | FP16
FP32
Int32 | - | - | FP16 | +| Rank | Return the number of dimensions in the input tensor | FP16
FP32 | - | - | | +| RealDiv | Element-wise division | FP16
FP32 | - | - | FP16 | +| Reciprocal | Return reciprocals | FP16
FP32
Int8 | FP16 | - | FP16 | +| ReduceFusion | Reduction operation | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32 | FP16 | +| ReduceScatter | Distributed operations: Input tensors are segmented and distributed across devices, with each device retaining only one segment of the results. | FP32 | - | - | | +| Reshape | Changing the shape of a tensor while keeping the total number of elements unchanged | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | FP16 | +| Resize | Upsample or resize the input tensor | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | | +| ResizeGrad | Compute the gradient for Resize | FP16
FP32 | - | - | | +| ReverseV2 | Reverse the tensor along the specified axis | FP32
Int32 | - | - | | +| ReverseSequence | Partially reverse the variable-length sequence of the input tensor. | FP32 | - | - | FP16 | +| ROIPooling | Region-of-interest pooling | FP32 | - | - | FP16 | +| Round | Round to the nearest whole number | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Rsqrt | Element-wise compute the reciprocal of the square root, commonly used for normalization. | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | | +| RsqrtGrad | Calculate the gradient of the reciprocal of the square root | FP32 | - | - | | +| Select | Select elements from two tensors based on conditions | FP32
Bool | - | - | | +| Selu | Self-normalizing exponential linear unit activation function | - | - | - | | +| ScaleFusion | Fuse scaling operations with adjacent operators | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| ScatterNd | Scatter values from the input tensor to specified positions in the output tensor based on the index. | FP16
FP32
Int32 | - | - | FP16 | +| ScatterNdUpdate | Update the value of the input data using the given value and the input index. | FP16
FP32
Int32 | - | - | | +| SGD | Stochastic gradient descent optimizer | FP32 | - | - | FP16 | +| Shape | Obtain the tensor shape | FP16
FP32
Int32
Int8
UInt8
Bool | - | FP16
FP32 | FP16 | +| SigmoidCrossEntropyWithLogits | Combine Sigmoid activation and cross-entropy loss | FP32 | - | - | FP16 | +| SigmoidCrossEntropyWithLogitsGrad | Compute the gradient of the cross-entropy loss with sigmoid | FP32 | - | - | FP16 | +| Sin | Element-wise calculation of sine | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Size | Obtain tensor dimension size | FP16
FP32
Int32 | - | - | FP16 | +| SliceFusion | Tensor slicing operation | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| SkipGram | The core operation of the Skip-gram model, used for training word vectors | FP32 | - | - | | +| SmoothL1Loss | Smooth L1 loss | FP32 | - | - | FP16 | +| SmoothL1LossGrad | Compute the gradient of the smooth L1 loss | FP32 | - | - | | +| Softmax | Normalization operation | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| SoftmaxGrad | Calculate the gradient of Softmax | FP32 | - | - | | +| Softplus | Smooth ReLU variants | FP16
FP32 | - | - | FP16 | +| SpaceToBatch | Move blocks of values from the height and width dimensions into the batch dimension. | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 | +| SpaceToBatchND | Split spatial-dimensional data blocks into batch dimensions | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | | +| SpaceToDepth | Reorganize spatial data into depth channels | FP16
FP32 | - | FP16
FP32 | | +| SparseToDense | Convert sparse representations to dense tensors | FP16
FP32
Int32 | - | FP16
FP32
Int32 | | +| SparseSoftmaxCrossEntropyWithLogits | Softmax cross-entropy for sparse labels | FP32 | - | - | FP16 | +| Splice | Connect multiple slices or ranges of the input tensor along the specified axis. | FP16
FP32 | - | - | | +| Split | Split the input tensor into multiple smaller output tensors along the specified axis. | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| SplitWithOverlap | Overlapped split tensor | FP16
FP32 | - | - | | +| Sqrt | Element-wise take the square root | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| SqrtGrad | Calculate the gradient of the square root | FP32 | - | - | | +| Square | Element-wise square | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| SquaredDifference | Element-wise compute (A-B)² | FP16
FP32 | - | FP16
FP32 | | +| Squeeze | Remove dimension of size 1 | FP16
FP32
Int32
Int8
UInt8
Bool | - | FP16
FP32
Int32 | | +| StridedSlice | Tensor slicing | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| StridedSliceGrad | Compute the gradient of the slice operation | FP16
FP32 | - | - | | +| Stack | Stack multiple tensors along the new axis | FP16
FP32
Int32 | - | FP16
FP32 | FP16 | +| SubFusion | Element-wise subtraction | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| SubGrad | Calculate the gradient of subtraction | FP32 | - | - | | +| Switch | Select output branches based on Boolean conditions | FP16
FP32
Int32
Bool | - | - | | +| SwitchLayer | Select different subnetwork branches for execution within the model | FP16
FP32
Int32
Bool | - | - | | +| TensorListFromTensor | Convert a regular tensor into a list of tensors, splitting along the specified axis. | FP16
FP32
Int32 | - | - | | +| TensorListGetItem | Retrieve the tensor at the specified index position from the tensor list | FP16
FP32
Int32 | - | - | | +| TensorListReserve | Preallocate an empty array list, specifying the element data type and initial capacity. | FP16
FP32
Int32 | - | - | | +| TensorListSetItem | Insert a tensor into a specified position in a list of tensors | FP16
FP32
Int32 | - | - | | +| TensorListStack | Stack the list of tensors into a single regular tensor | FP16
FP32
Int32 | - | - | | +| TensorScatterAdd | Add the updated tensor values to the specified positions in the target tensor using the index. | FP32
Int32 | - | - | | +| TileFusion | Tile the given matrix | FP16
FP32
Int32
Bool | FP16 | - | FP16 | +| TopKFusion | Return the top K elements from the input tensor. | FP16
FP32
Int32
Int8
UInt8 | - | - | FP16 | +| Transpose | Tensor transpose | FP16
FP32
Int32
Int8
Bool | FP16 | FP16
FP32 | FP16 | +| UniformReal | Generate a random tensor following a uniform distribution | FP32
Int32 | - | - | | +| Unique | Returns the unique values in the input tensor, along with their indices and count. | FP16
FP32
Int32 | - | - | | +| UnsortedSegmentSum | Perform segmented summation on the tensor without requiring ordered segmented indices. | FP16
FP32
Int32 | - | - | | +| Unsqueeze | Add a new dimension to the input tensor | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | | +| Unstack | Split a tensor into multiple sub-tensors along a specified axis | FP16
FP32
Int32 | - | - | | +| Where | Element selection | FP16
FP32
Int32
Bool | - | - | | +| ZerosLike | Generate a new tensor with the same shape as the input tensor but with all elements set to zero. | FP16
FP32
Int32 | - | - | | diff --git a/docs/lite/docs/source_zh_cn/reference/operator_list_lite.md b/docs/lite/docs/source_zh_cn/reference/operator_list_lite.md index d984a13f43..933c3beb4f 100644 --- a/docs/lite/docs/source_zh_cn/reference/operator_list_lite.md +++ b/docs/lite/docs/source_zh_cn/reference/operator_list_lite.md @@ -2,203 +2,203 @@ [![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source.svg)](https://gitee.com/mindspore/docs/blob/master/docs/lite/docs/source_zh_cn/reference/operator_list_lite.md) -| 算子名称 | 算子功能 | CPU | Kirin NPU | GPU(Mali/Adreno) | -| ----------------------------------- | ------------------------------------------------------------ | --------------------------------------------------- | --------- | ----------------------- | -| Abs | 逐元素计算绝对值 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| AbsGrad | 计算绝对值函数的梯度 | FP32 | - | - | -| Activation | 激活函数 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| ActivationGrad | 计算特定激活函数的梯度 | FP16
FP32 | - | - | -| Adam | 执行Adam优化器的一次参数更新步骤 | FP32 | - | - | -| AddFusion | 逐元素计算加法 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int8 | -| AdderFusion | 基于加法的卷积运算 | FP32 | - | - | -| AddGrad | 计算加法操作的梯度 | FP32 | - | - | -| AddN | 对N个相同形状和数据类型的输入张量进行逐元素相加 | FP16
FP32 | - | - | -| Affine | 对输入张量执行仿射变换 | FP32 | - | - | -| All | 判断张量中所有元素在指定维度上是否都为True(非零) | FP32 | - | - | -| AllGather | 分布式集合通信操作 | FP32 | - | - | -| ApplyMomentum | 执行带动量的随机梯度下降的一次参数更新步骤 | FP32 | - | - | -| Assert | 断言 | FP16
FP32
Bool | - | - | -| Assign | 将一个值赋值给一个变量 | FP32 | - | - | -| ArgmaxFusion | 求某一维度最大值 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| ArgminFusion | 求某一维度最小值 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| AvgPoolFusion | 平均池化 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| AvgPoolGrad | 计算平均池化层的梯度 | FP16
FP32 | - | - | -| BatchNorm | 批量归一化 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| BatchNormGrad | 计算批量归一化层的梯度 | FP16
FP32 | - | - | -| BatchToSpace | 空间到批次变换的逆操作 | FP32
Int8
UInt8 | - | FP16
FP32 | -| BatchToSpaceND | BatchToSpace的ND通用版本 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| BiasAdd | 将偏置向量添加到输入张量 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| BiasAddGrad | 计算BiasAdd操作的梯度 | FP16
FP32 | - | - | -| BinaryCrossEntropy | 计算二元交叉熵损失 | FP32 | - | - | -| BinaryCrossEntropyGrad | 计算二元交叉熵损失函数的梯度 | FP32 | - | - | -| BroadcastTo | 扩维 | FP16
FP32
Int32
Bool | - | - | -| Call | 调用一个子计算图或函数 | FP16
FP32
Int32
Bool | - | - | -| Cast | 数据类型转换 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32 | -| Ceil | 向上取整 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Clip | 限制元素范围 | FP32
Int32 | - | - | -| Concat | 拼接张量 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | -| ConstantOfShape | 生成一个与输入形状相同的张量,并用指定常量填充 | FP16
FP32
Int32 | - | - | -| Conv2DFusion | 2D卷积 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Conv2DBackpropFilterFusion | 计算普通卷积操作对卷积核的梯度 | FP16
FP32 | - | - | -| Conv2DBackpropInputFusion | 计算普通卷积操作对输入数据的梯度 | FP16
FP32 | - | - | -| Conv2dTransposeFusion | 执行转置卷积运算 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Cos | 逐元素计算余弦 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Crop | 从输入图像或特征图中裁剪出一个指定区域 | FP16
FP32
Int32
Int8
UInt8 | - | - | -| CropAndResize | 从输入图像中根据一组边界框裁剪出区域,然后将每个区域缩放到统一大小 | FP32 | FP16 | - | -| CumSum | 累计元素和 | FP32
Int32 | - | - | -| CustomExtractFeatures | 自定义特征提取算子 | FP32 | - | - | -| CustomNormalize | 自定义归一化算子 | FP32 | - | - | -| CustomPredict | 自定义预测算子 | FP32
Int32 | - | - | -| DEConv2DGradFilter | 计算转置卷积对卷积核的梯度 | FP32 | - | | -| DepthToSpace | 将深度数据重新排列到空间维度中 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| DetectionPostProcess | 目标检测后处理 | FP32
Int8
UInt8 | - | - | -| DivFusion | 逐元素除法 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| DivGrad | 计算除法操作的梯度 | FP32 | - | - | -| Dropout | 随机将输入张量的部分元素置 0 | FP16
FP32 | - | - | -| DropoutGrad | 计算Dropout操作的梯度 | FP16
FP32 | - | - | -| DynamicQuant | 动态将浮点张量量化为uint8类型 | FP32 | - | - | -| Eltwise | 元素级运算 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Elu | 激活函数,对负输入使用指数修正 | FP16
FP32 | - | - | -| Equal | 判断输入是否相等 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| EmbeddingLookupFusion | 优化版的词嵌入查找,将整数索引映射为密集向量 | FP32 | - | - | -| Erf | 误差函数 | FP16
FP32 | - | - | -| ExpFusion | 逐元素取指数 | FP16
FP32 | - | FP16
FP32 | -| ExpandDims | 在指定位置插入长度为1的维度 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | -| Fill | 生成一个填充指定常量的张量 | FP16
FP32
Int32
Bool | - | FP16
FP32 | -| Flatten | 数据按维度展开 | FP16
FP32
Int32 | - | - | -| FlattenGrad | 计算Flatten操作的梯度 | FP16
FP32 | - | - | -| Floor | 向下取整 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| FloorDiv | 逐元素向下取整除法 | FP16
FP32
Int32 | FP16 | FP16
FP32 | -| FloorMod | 逐元素取模运算,结果的符号与除数一致 | FP16
FP32
Int32 | FP16 | FP16
FP32 | -| FullConnection | 全连接层 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| FusedBatchNorm | 对输入做标准化 | FP16
FP32
Int8
UInt8 | FP16 | - | -| GatherNd | 根据索引张量从输入张量中收集指定位置的元素 | FP16
FP32
Int32
Int8
UInt8
Bool | - | FP16
FP32 | -| Gather | 沿单一维度收集指定索引位置的元素 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | -| GatherD | 将输入tensor中的元素根据索引tensor进行收集 | FP16
FP32
Int32
Bool | - | - | -| GLU | 门控线性单元激活函数,将输入拆分为两部分并逐元素相乘 | FP32 | - | - | -| Greater | 逐元素比较两个张量,返回A>B的逻辑结果(True/False) | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| GreaterEqual | 逐元素比较两个张量,返回 A≥B的逻辑结果(True/False) | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| GroupNormFusion | 融合优化的组归一化 | FP32 | - | - | -| GRU | 门控循环单元,简化版LSTM | FP16
FP32 | - | - | -| HashtableLookup | 哈希表查找 | FP32
Int32 | - | - | -| InstanceNorm | 实例归一化 | FP16
FP32 | FP16 | - | -| InvertPermutation | 反转置换索引 | FP16
FP32
Int32 | - | - | -| IsFinite | 检测张量中每个元素是否为有限值(非inf/NaN) | FP32 | - | - | -| L2NormalizeFusion | 融合优化的L2归一化 | FP32
Int8
UInt8 | - | - | -| LayerNormFusion | 融合优化的层归一化 | FP16
FP32
Int8 | - | FP16
FP32 | -| LayerNormGrad | 计算层归一化的梯度 | FP16
FP32 | - | - | -| LeakyReLU | 带泄漏的 ReLU激活函数,对负输入给予微小斜率 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Less | 逐元素比较两个张量,返回 AFP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| LessEqual | 逐元素比较A ≤ B,返回布尔张量 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| LRN | 局部响应归一化 | FP32 | - | - | -| Log | 逐元素求对数 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Log1p | 计算log(1+X) | FP32 | - | - | -| LogGrad | 计算对数函数的梯度 | FP16
FP32 | - | - | -| LogicalAnd | 逐元素逻辑与运算 | FP16
FP32
Int32
Bool | FP16 | FP16
FP32 | -| LogicalNot | 逐元素逻辑非运算 | FP16
FP32
Int8
UInt8
Bool | FP16 | FP16
FP32 | -| LogicalOr | 逐元素逻辑或运算 | FP16
FP32
Bool | FP16 | FP16
FP32 | -| LogSoftmax | 对输入向量进行softmax操作,然后再对softmax结果取对数 | FP16
FP32 | - | - | -| LshProjection | 局部敏感哈希投影 | FP32 | - | - | -| LSTM | 长短期记忆网络单元 | FP16
FP32 | - | - | -| LSTMGrad | 计算LSTM对隐状态的反向传播梯度 | FP32 | - | - | -| LSTMGradData | 计算LSTM对输入数据的反向传播梯度 | FP32 | - | - | -| LSTMGradWeight | 计算LSTM对权重的反向传播梯度 | FP32 | - | - | -| MatMulFusion | 对2个输入做矩阵乘法运算 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Maximum | 取元素级最大值 | FP16
FP32
Int32 | FP16 | FP16
FP32 | -| MaximumGrad | 计算最大值函数的梯度 | FP16
FP32 | - | - | -| MaxPoolFusion | 最大池化 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| MaxPoolGrad | 计算最大池化层的梯度 | FP16
FP32 | - | - | -| Merge | 创建一个与输入张量X形状完全相同但所有元素值均为1的新张量 | FP16
FP32 | - | - | -| Minimum | 取元素级最小值 | FP16
FP32
Int32 | FP16 | FP16
FP32 | -| MinimumGrad | 计算最小值函数的梯度 | FP16
FP32 | - | - | -| Mod | 返回除法元素的余数 | FP32
Int32 | - | - | -| MulFusion | 逐元素乘法 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| MulGrad | 计算乘法操作的梯度 | FP32 | - | - | -| Neg | 逐元素求负数 | FP16
FP32
Int32 | FP16 | FP16
FP32 | -| NegGrad | 计算取负操作的梯度 | FP16
FP32 | - | - | -| NLLLoss | 计算负对数似然损失 | FP32 | - | - | -| NLLLossGrad | 计算NLLLoss的梯度 | FP32 | - | - | -| NotEqual | 逐元素比较两个张量,返回 A != B的逻辑结果 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| NonMaxSuppression | 非极大值抑制 | FP32 | - | - | -| NonZero | 返回输入张量中所有非零元素的索引 | Bool | - | - | -| OneHot | 将整数索引张量转换为独热编码表示 | FP16
FP32
Int32 | - | FP16
FP32
Int32 | -| OnesLike | 创建一个与输入张量 X形状完全相同但所有元素值均为1的新张量 | FP16
FP32
Int32 | - | - | -| PadFusion | 将输入张量加上指定的padding,使其达到指定的大小 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| PartialFusion | 部分融合 | FP16
FP32
Int32
Bool | - | - | -| PowFusion | 逐元素求幂 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| PowerGrad | 计算幂运算的梯度 | FP32 | - | - | -| PriorBox | 生成先验框 | FP32
Int8
UInt8 | - | - | -| PReLUFusion | PRelu激活函数 | FP16
FP32 | - | FP16
FP32 | -| QuantDTypeCast | 执行量化数据类型转换 | FP16
FP32
Int8
UInt8 | - | - | -| RaggedRange | 生成非均匀间隔的序列 | FP16
FP32
Int32 | - | - | -| RandomNormal | 生成一个张量,其中的值从正态分布中随机采样 | FP16
FP32 | - | - | -| RandomStandardNormal | 生成服从标准正态分布的随机数张量 | FP16
FP32 | - | - | -| Range | 生成某个区间内的元素 | FP16
FP32
Int32 | - | - | -| Rank | 返回输入张量的维度数 | FP16
FP32 | - | - | -| RealDiv | 逐元素除法 | FP16
FP32 | - | - | -| Reciprocal | 返回倒数 | FP16
FP32
Int8 | FP16 | - | -| ReduceFusion | 归约操作 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32 | -| ReduceScatter | 分布式操作,将输入张量分段后分发到各设备,每设备仅保留一段结果 | FP32 | - | - | -| Reshape | 改变张量形状,总元素个数不变 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | -| Resize | 对输入张量进行上采样或调整大小 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| ResizeGrad | 计算Resize的梯度 | FP16
FP32 | - | - | -| ReverseV2 | 沿指定轴反转张量 | FP32
Int32 | - | - | -| ReverseSequence | 对输入张量的可变长度序列进行部分反转 | FP32 | - | - | -| ROIPooling | 区域兴趣池化 | FP32 | - | - | -| Round | 四舍五入到最接近的整数数值 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Rsqrt | 逐元素计算平方根倒数,用于归一化 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| RsqrtGrad | 计算平方根倒数的梯度 | FP32 | - | - | -| Select | 根据条件从两个张量中选择元素 | FP32
Bool | - | - | -| Selu | 自归一化指数线性单元激活函数 | - | - | - | -| ScaleFusion | 将缩放操作与相邻算子融合 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| ScatterNd | 根据索引将更新张量中的值散射到输出张量的指定位置 | FP16
FP32
Int32 | - | - | -| ScatterNdUpdate | 使用给定值以及输入索引更新输入数据的值 | FP16
FP32
Int32 | - | - | -| SGD | 随机梯度下降优化器 | FP32 | - | - | -| Shape | 获得张量shape | FP16
FP32
Int32
Int8
UInt8
Bool | - | FP16
FP32 | -| SigmoidCrossEntropyWithLogits | 结合Sigmoid激活和交叉熵损失 | FP32 | - | - | -| SigmoidCrossEntropyWithLogitsGrad | 计算带Sigmoid的交叉熵损失的梯度 | FP32 | - | - | -| Sin | 逐元素计算正弦 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| Size | 获取张量维度大小 | FP16
FP32
Int32 | - | - | -| SliceFusion | 张量切片操作 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| SkipGram | Skip-gram模型的核心操作,用于词向量训练 | FP32 | - | - | -| SmoothL1Loss | 平滑L1损失 | FP32 | - | - | -| SmoothL1LossGrad | 计算平滑L1损失的梯度 | FP32 | - | - | -| Softmax | 归一化操作 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| SoftmaxGrad | 计算Softmax的梯度 | FP32 | - | - | -| Softplus | 平滑的ReLU变体 | FP16
FP32 | - | - | -| SpaceToBatch | 高度和宽度维度的值移至深度维度 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| SpaceToBatchND | 将空间维度的数据块拆分到批次维度 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | -| SpaceToDepth | 将空间数据重组为深度通道 | FP16
FP32 | - | FP16
FP32 | -| SparseToDense | 将稀疏表示转换为密集张量 | FP16
FP32
Int32 | - | FP16
FP32
Int32 | -| SparseSoftmaxCrossEntropyWithLogits | 稀疏标签的Softmax交叉熵 | FP32 | - | - | -| Splice | 沿指定轴连接输入张量的多个切片或范围 | FP16
FP32 | - | - | -| Split | 将输入张量沿指定轴分割成多个较小的输出张量 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| SplitWithOverlap | 带重叠的分割张量 | FP16
FP32 | - | - | -| Sqrt | 逐元素开根号 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| SqrtGrad | 计算平方根的梯度 | FP32 | - | - | -| Square | 逐元素平方 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | -| SquaredDifference | 逐元素计算 (A-B)² | FP16
FP32 | - | FP16
FP32 | -| Squeeze | 移除维度为1的维度 | FP16
FP32
Int32
Int8
UInt8
Bool | - | FP16
FP32
Int32 | -| StridedSlice | Tensor切片 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| StridedSliceGrad | 计算切片操作的梯度 | FP16
FP32 | - | - | -| Stack | 沿新轴堆叠多个张量 | FP16
FP32
Int32 | - | FP16
FP32 | -| SubFusion | 逐元素相减 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | -| SubGrad | 计算减法的梯度 | FP32 | - | - | -| Switch | 根据布尔条件选择输出分支 | FP16
FP32
Int32
Bool | - | - | -| SwitchLayer | 在模型中选择执行不同的子网络分支 | FP16
FP32
Int32
Bool | - | - | -| TensorListFromTensor | 将普通张量转换为张量列表,按指定轴分割 | FP16
FP32
Int32 | - | - | -| TensorListGetItem | 从张量列表中获取指定索引位置的张量 | FP16
FP32
Int32 | - | - | -| TensorListReserve | 预分配一个空张量列表,指定元素数据类型和初始容量 | FP16
FP32
Int32 | - | - | -| TensorListSetItem | 将张量插入张量列表的指定位置 | FP16
FP32
Int32 | - | - | -| TensorListStack | 将张量列表堆叠为一个普通张量 | FP16
FP32
Int32 | - | - | -| TensorScatterAdd | 根据索引将更新张量的值分散添加到目标张量的指定位置 | FP32
Int32 | - | - | -| TileFusion | 平铺给定矩阵 | FP16
FP32
Int32
Bool | FP16 | - | -| TopKFusion | 从输入张量中返回topK个元素 | FP16
FP32
Int32
Int8
UInt8 | - | - | -| Transpose | Tensor转置 | FP16
FP32
Int32
Int8
Bool | FP16 | FP16
FP32 | -| UniformReal | 生成服从均匀分布的随机数张量 | FP32
Int32 | - | - | -| Unique | 返回输入张量中的唯一值,并可返回该值的索引和计数 | FP16
FP32
Int32 | - | - | -| UnsortedSegmentSum | 对张量进行分段求和,不要求分段索引有序 | FP16
FP32
Int32 | - | - | -| Unsqueeze | 将输入张量添加一个新的维度 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | -| Unstack | 沿指定轴拆分张量为多个子张量 | FP16
FP32
Int32 | - | - | -| Where | 元素选择 | FP16
FP32
Int32
Bool | - | - | -| ZerosLike | 生成与输入张量形状相同但全为 0的新张量 | FP16
FP32
Int32 | - | - | +| 算子名称 | 算子功能 | CPU | Kirin NPU | GPU(Mali/Adreno) | Ascend | +| ----------------------------------- | ------------------------------------------------------------ | --------------------------------------------------- | --------- | ----------------------- | ----------------------- | +| Abs | 逐元素计算绝对值 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| AbsGrad | 计算绝对值函数的梯度 | FP32 | - | - | | +| Activation | 激活函数 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| ActivationGrad | 计算特定激活函数的梯度 | FP16
FP32 | - | - | | +| Adam | 执行Adam优化器的一次参数更新步骤 | FP32 | - | - | | +| AddFusion | 逐元素计算加法 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int8 | FP16 | +| AdderFusion | 基于加法的卷积运算 | FP32 | - | - | | +| AddGrad | 计算加法操作的梯度 | FP32 | - | - | | +| AddN | 对N个相同形状和数据类型的输入张量进行逐元素相加 | FP16
FP32 | - | - | | +| Affine | 对输入张量执行仿射变换 | FP32 | - | - | FP16 | +| All | 判断张量中所有元素在指定维度上是否都为True(非零) | FP32 | - | - | | +| AllGather | 分布式集合通信操作 | FP32 | - | - | | +| ApplyMomentum | 执行带动量的随机梯度下降的一次参数更新步骤 | FP32 | - | - | FP16 | +| Assert | 断言 | FP16
FP32
Bool | - | - | | +| Assign | 将一个值赋值给一个变量 | FP32 | - | - | FP16 | +| ArgmaxFusion | 求某一维度最大值 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| ArgminFusion | 求某一维度最小值 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 | +| AvgPoolFusion | 平均池化 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| AvgPoolGrad | 计算平均池化层的梯度 | FP16
FP32 | - | - | | +| BatchNorm | 批量归一化 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 | +| BatchNormGrad | 计算批量归一化层的梯度 | FP16
FP32 | - | - | | +| BatchToSpace | 空间到批次变换的逆操作 | FP32
Int8
UInt8 | - | FP16
FP32 | | +| BatchToSpaceND | BatchToSpace的ND通用版本 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | | +| BiasAdd | 将偏置向量添加到输入张量 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 | +| BiasAddGrad | 计算BiasAdd操作的梯度 | FP16
FP32 | - | - | | +| BinaryCrossEntropy | 计算二元交叉熵损失 | FP32 | - | - | FP16 | +| BinaryCrossEntropyGrad | 计算二元交叉熵损失函数的梯度 | FP32 | - | - | | +| BroadcastTo | 扩维 | FP16
FP32
Int32
Bool | - | - | | +| Call | 调用一个子计算图或函数 | FP16
FP32
Int32
Bool | - | - | FP16 | +| Cast | 数据类型转换 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32 | FP16 | +| Ceil | 向上取整 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Clip | 限制元素范围 | FP32
Int32 | - | - | FP16 | +| Concat | 拼接张量 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | FP16 | +| ConstantOfShape | 生成一个与输入形状相同的张量,并用指定常量填充 | FP16
FP32
Int32 | - | - | | +| Conv2DFusion | 2D卷积 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Conv2DBackpropFilterFusion | 计算普通卷积操作对卷积核的梯度 | FP16
FP32 | - | - | | +| Conv2DBackpropInputFusion | 计算普通卷积操作对输入数据的梯度 | FP16
FP32 | - | - | | +| Conv2dTransposeFusion | 执行转置卷积运算 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Cos | 逐元素计算余弦 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Crop | 从输入图像或特征图中裁剪出一个指定区域 | FP16
FP32
Int32
Int8
UInt8 | - | - | | +| CropAndResize | 从输入图像中根据一组边界框裁剪出区域,然后将每个区域缩放到统一大小 | FP32 | FP16 | - | | +| CumSum | 累计元素和 | FP32
Int32 | - | - | FP16 | +| CustomExtractFeatures | 自定义特征提取算子 | FP32 | - | - | | +| CustomNormalize | 自定义归一化算子 | FP32 | - | - | | +| CustomPredict | 自定义预测算子 | FP32
Int32 | - | - | | +| DEConv2DGradFilter | 计算转置卷积对卷积核的梯度 | FP32 | - | - | | +| DepthToSpace | 将深度数据重新排列到空间维度中 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | | +| DetectionPostProcess | 目标检测后处理 | FP32
Int8
UInt8 | - | - | | +| DivFusion | 逐元素除法 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| DivGrad | 计算除法操作的梯度 | FP32 | - | - | | +| Dropout | 随机将输入张量的部分元素置 0 | FP16
FP32 | - | - | FP16 | +| DropoutGrad | 计算Dropout操作的梯度 | FP16
FP32 | - | - | | +| DynamicQuant | 动态将浮点张量量化为uint8类型 | FP32 | - | - | | +| Eltwise | 元素级运算 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Elu | 激活函数,对负输入使用指数修正 | FP16
FP32 | - | - | FP16 | +| Equal | 判断输入是否相等 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| EmbeddingLookupFusion | 优化版的词嵌入查找,将整数索引映射为密集向量 | FP32 | - | - | | +| Erf | 误差函数 | FP16
FP32 | - | - | FP16 | +| ExpFusion | 逐元素取指数 | FP16
FP32 | - | FP16
FP32 | FP16 | +| ExpandDims | 在指定位置插入长度为1的维度 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | FP16 | +| Fill | 生成一个填充指定常量的张量 | FP16
FP32
Int32
Bool | - | FP16
FP32 | FP16 | +| Flatten | 数据按维度展开 | FP16
FP32
Int32 | - | - | FP16 | +| FlattenGrad | 计算Flatten操作的梯度 | FP16
FP32 | - | - | | +| Floor | 向下取整 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| FloorDiv | 逐元素向下取整除法 | FP16
FP32
Int32 | FP16 | FP16
FP32 | | +| FloorMod | 逐元素取模运算,结果的符号与除数一致 | FP16
FP32
Int32 | FP16 | FP16
FP32 | | +| FullConnection | 全连接层 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| FusedBatchNorm | 对输入做标准化 | FP16
FP32
Int8
UInt8 | FP16 | - | FP16 | +| GatherNd | 根据索引张量从输入张量中收集指定位置的元素 | FP16
FP32
Int32
Int8
UInt8
Bool | - | FP16
FP32 | FP16 | +| Gather | 沿单一维度收集指定索引位置的元素 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | FP16 | +| GatherD | 将输入tensor中的元素根据索引tensor进行收集 | FP16
FP32
Int32
Bool | - | - | FP16 | +| GLU | 门控线性单元激活函数,将输入拆分为两部分并逐元素相乘 | FP32 | - | - | | +| Greater | 逐元素比较两个张量,返回A>B的逻辑结果(True/False) | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| GreaterEqual | 逐元素比较两个张量,返回 A≥B的逻辑结果(True/False) | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| GroupNormFusion | 融合优化的组归一化 | FP32 | - | - | | +| GRU | 门控循环单元,简化版LSTM | FP16
FP32 | - | - | | +| HashtableLookup | 哈希表查找 | FP32
Int32 | - | - | | +| InstanceNorm | 实例归一化 | FP16
FP32 | FP16 | - | FP16 | +| InvertPermutation | 反转置换索引 | FP16
FP32
Int32 | - | - | | +| IsFinite | 检测张量中每个元素是否为有限值(非inf/NaN) | FP32 | - | - | FP16 | +| L2NormalizeFusion | 融合优化的L2归一化 | FP32
Int8
UInt8 | - | - | | +| LayerNormFusion | 融合优化的层归一化 | FP16
FP32
Int8 | - | FP16
FP32 | FP16 | +| LayerNormGrad | 计算层归一化的梯度 | FP16
FP32 | - | - | | +| LeakyReLU | 带泄漏的 ReLU激活函数,对负输入给予微小斜率 | FP16
FP32
Int8
UInt8 | FP16 | FP16
+| Less | Element-wise compare two tensors and return the logical result of A < B (True/False) | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| LessEqual | Element-wise compare A ≤ B and return a Boolean tensor | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| LRN | 局部响应归一化 | FP32 | - | - | FP16 | +| Log | 逐元素求对数 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Log1p | 计算log(1+X) | FP32 | - | - | FP16 | +| LogGrad | 计算对数函数的梯度 | FP16
FP32 | - | - | | +| LogicalAnd | 逐元素逻辑与运算 | FP16
FP32
Int32
Bool | FP16 | FP16
FP32 | | +| LogicalNot | 逐元素逻辑非运算 | FP16
FP32
Int8
UInt8
Bool | FP16 | FP16
FP32 | | +| LogicalOr | 逐元素逻辑或运算 | FP16
FP32
Bool | FP16 | FP16
FP32 | | +| LogSoftmax | 对输入向量进行softmax操作,然后再对softmax结果取对数 | FP16
FP32 | - | - | FP16 | +| LshProjection | 局部敏感哈希投影 | FP32 | - | - | | +| LSTM | 长短期记忆网络单元 | FP16
FP32 | - | - | | +| LSTMGrad | 计算LSTM对隐状态的反向传播梯度 | FP32 | - | - | | +| LSTMGradData | 计算LSTM对输入数据的反向传播梯度 | FP32 | - | - | | +| LSTMGradWeight | 计算LSTM对权重的反向传播梯度 | FP32 | - | - | | +| MatMulFusion | 对2个输入做矩阵乘法运算 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| Maximum | 取元素级最大值 | FP16
FP32
Int32 | FP16 | FP16
FP32 | FP16 | +| MaximumGrad | 计算最大值函数的梯度 | FP16
FP32 | - | - | | +| MaxPoolFusion | 最大池化 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| MaxPoolGrad | 计算最大池化层的梯度 | FP16
+| Merge | Merge control-flow branches by forwarding whichever input tensor is available | FP16<br>FP32 | - | - | - |
+| Minimum | Element-wise minimum | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 | FP16 |
FP32
Int32 | FP16 | FP16
FP32 | FP16 | +| MinimumGrad | 计算最小值函数的梯度 | FP16
FP32 | - | - | | +| Mod | 返回除法元素的余数 | FP32
Int32 | - | - | FP16 | +| MulFusion | 逐元素乘法 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| MulGrad | 计算乘法操作的梯度 | FP32 | - | - | | +| Neg | 逐元素求负数 | FP16
FP32
Int32 | FP16 | FP16
FP32 | FP16 | +| NegGrad | 计算取负操作的梯度 | FP16
FP32 | - | - | | +| NLLLoss | 计算负对数似然损失 | FP32 | - | - | FP16 | +| NLLLossGrad | 计算NLLLoss的梯度 | FP32 | - | - | | +| NotEqual | 逐元素比较两个张量,返回 A != B的逻辑结果 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | | +| NonMaxSuppression | 非极大值抑制 | FP32 | - | - | FP16 | +| NonZero | 返回输入张量中所有非零元素的索引 | Bool | - | - | FP16 | +| OneHot | 将整数索引张量转换为独热编码表示 | FP16
FP32
Int32 | - | FP16
FP32
Int32 | | +| OnesLike | 创建一个与输入张量 X形状完全相同但所有元素值均为1的新张量 | FP16
FP32
Int32 | - | - | FP16 | +| PadFusion | 将输入张量加上指定的padding,使其达到指定的大小 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| PartialFusion | 部分融合 | FP16
FP32
Int32
Bool | - | - | | +| PowFusion | 逐元素求幂 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 | +| PowerGrad | 计算幂运算的梯度 | FP32 | - | - | | +| PriorBox | 生成先验框 | FP32
Int8
UInt8 | - | - | FP16 | +| PReLUFusion | PRelu激活函数 | FP16
FP32 | - | FP16
+| QuantDTypeCast | Perform quantized data type conversion | FP16<br>FP32<br>Int8<br>UInt8 | - | - | - |
+| RaggedRange | Generate sequences with non-uniform spacing | FP16<br>FP32<br>Int32 | - | - | - |
+| RandomNormal | Generate a tensor whose values are randomly sampled from a normal distribution | FP16<br>FP32 | - | - | - |
+| RandomStandardNormal | Generate a tensor of random numbers from the standard normal distribution | FP16<br>FP32 | - | - | - |
+| Range | Generate elements within an interval | FP16<br>FP32<br>Int32 | - | - | FP16 |
+| Rank | Return the number of dimensions of the input tensor | FP16<br>FP32 | - | - | - |
+| RealDiv | Element-wise division | FP16<br>FP32 | - | - | FP16 |
+| Reciprocal | Return the reciprocal | FP16<br>FP32<br>Int8 | FP16 | - | FP16 |
+| ReduceFusion | Reduction operations | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32 | FP16 |
+| ReduceScatter | Distributed operation that splits the input tensor into segments and distributes them across devices, with each device keeping only one segment of the result | FP32 | - | - | - |
+| Reshape | Change the shape of a tensor without changing the total number of elements | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 | FP16 |
+| Resize | Upsample or resize the input tensor | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | - |
+| ResizeGrad | Compute the gradient of Resize | FP16<br>FP32 | - | - | - |
+| ReverseV2 | Reverse a tensor along the specified axis | FP32<br>Int32 | - | - | - |
+| ReverseSequence | Partially reverse the variable-length sequences of the input tensor | FP32 | - | - | FP16 |
+| ROIPooling | Region-of-interest pooling | FP32 | - | - | FP16 |
+| Round | Round to the nearest integer value | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Rsqrt | Element-wise reciprocal of the square root, used for normalization | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | - |
+| RsqrtGrad | Compute the gradient of the reciprocal square root | FP32 | - | - | - |
+| Select | Select elements from two tensors according to a condition | FP32<br>Bool | - | - | - |
+| Selu | Self-normalizing exponential linear unit activation function | - | - | - | - |
+| ScaleFusion | Fuse the scale operation with adjacent operators | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| ScatterNd | Scatter values from the update tensor into the specified positions of the output tensor according to indices | FP16<br>FP32<br>Int32 | - | - | FP16 |
+| ScatterNdUpdate | Update values of the input data using the given values and input indices | FP16<br>FP32<br>Int32 | - | - | - |
+| SGD | Stochastic gradient descent optimizer | FP32 | - | - | FP16 |
+| Shape | Get the shape of a tensor | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32 | FP16 |
+| SigmoidCrossEntropyWithLogits | Combine sigmoid activation with cross-entropy loss | FP32 | - | - | FP16 |
+| SigmoidCrossEntropyWithLogitsGrad | Compute the gradient of the sigmoid cross-entropy loss | FP32 | - | - | FP16 |
+| Sin | Element-wise sine | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Size | Get the size of the tensor's dimensions | FP16<br>FP32<br>Int32 | - | - | FP16 |
+| SliceFusion | Tensor slice operation | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SkipGram | Core operation of the skip-gram model, used for word-vector training | FP32 | - | - | - |
+| SmoothL1Loss | Smooth L1 loss | FP32 | - | - | FP16 |
+| SmoothL1LossGrad | Compute the gradient of the smooth L1 loss | FP32 | - | - | - |
+| Softmax | Normalization operation | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SoftmaxGrad | Compute the gradient of Softmax | FP32 | - | - | - |
+| Softplus | Smooth variant of ReLU | FP16<br>FP32 | - | - | FP16 |
+| SpaceToBatch | Move blocks of values from the height and width dimensions into the batch dimension | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | FP16 |
+| SpaceToBatchND | Split blocks of spatial data into the batch dimension | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | - |
+| SpaceToDepth | Rearrange spatial data into depth channels | FP16<br>FP32 | - | FP16<br>FP32 | - |
+| SparseToDense | Convert a sparse representation into a dense tensor | FP16<br>FP32<br>Int32 | - | FP16<br>FP32<br>Int32 | - |
+| SparseSoftmaxCrossEntropyWithLogits | Softmax cross-entropy with sparse labels | FP32 | - | - | FP16 |
+| Splice | Concatenate multiple slices or ranges of the input tensor along the specified axis | FP16<br>FP32 | - | - | - |
+| Split | Split the input tensor along the specified axis into multiple smaller output tensors | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SplitWithOverlap | Split a tensor with overlap | FP16<br>FP32 | - | - | - |
+| Sqrt | Element-wise square root | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SqrtGrad | Compute the gradient of the square root | FP32 | - | - | - |
+| Square | Element-wise square | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SquaredDifference | Element-wise compute (A-B)² | FP16<br>FP32 | - | FP16<br>FP32 | - |
+| Squeeze | Remove dimensions of size 1 | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32<br>Int32 | - |
+| StridedSlice | Tensor slicing | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| StridedSliceGrad | 计算切片操作的梯度 | FP16
FP32 | - | - | | +| Stack | 沿新轴堆叠多个张量 | FP16
FP32
Int32 | - | FP16
FP32 | FP16 | +| SubFusion | 逐元素相减 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 | +| SubGrad | 计算减法的梯度 | FP32 | - | - | | +| Switch | 根据布尔条件选择输出分支 | FP16
FP32
Int32
Bool | - | - | | +| SwitchLayer | 在模型中选择执行不同的子网络分支 | FP16
FP32
Int32
Bool | - | - | | +| TensorListFromTensor | 将普通张量转换为张量列表,按指定轴分割 | FP16
FP32
Int32 | - | - | | +| TensorListGetItem | 从张量列表中获取指定索引位置的张量 | FP16
FP32
Int32 | - | - | | +| TensorListReserve | 预分配一个空张量列表,指定元素数据类型和初始容量 | FP16
FP32
Int32 | - | - | | +| TensorListSetItem | 将张量插入张量列表的指定位置 | FP16
FP32
Int32 | - | - | | +| TensorListStack | 将张量列表堆叠为一个普通张量 | FP16
FP32
Int32 | - | - | | +| TensorScatterAdd | 根据索引将更新张量的值分散添加到目标张量的指定位置 | FP32
Int32 | - | - | | +| TileFusion | 平铺给定矩阵 | FP16
FP32
Int32
Bool | FP16 | - | FP16 | +| TopKFusion | 从输入张量中返回topK个元素 | FP16
FP32
Int32
Int8
UInt8 | - | - | FP16 | +| Transpose | Tensor转置 | FP16
FP32
Int32
Int8
Bool | FP16 | FP16
+| UniformReal | Generate a tensor of random numbers from a uniform distribution | FP32<br>Int32 | - | - | - |
+| Unique | Return the unique values in the input tensor, optionally with their indices and counts | FP16<br>FP32<br>Int32 | - | - | - |
+| UnsortedSegmentSum | Segment-wise sum over a tensor; the segment indices need not be sorted | FP16<br>FP32<br>Int32 | - | - | - |
+| Unsqueeze | Add a new dimension to the input tensor | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 | - |
+| Unstack | Split a tensor into multiple sub-tensors along the specified axis | FP16<br>FP32<br>Int32 | - | - | - |
+| Where | Element selection | FP16<br>FP32<br>Int32<br>Bool | - | - | - |
+| ZerosLike | Generate a new tensor with the same shape as the input tensor but filled with 0 | FP16<br>FP32<br>Int32 | - | - | - |
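
The backend columns above map to the runtime targets on which MindSpore Lite schedules kernels. As a minimal sketch of how a backend is selected (assuming the `mindspore_lite` Python API with `Context.target` and `Model.build_from_file`, and that several targets can be listed in priority order; the model file name is a placeholder), operators with an entry in the Ascend column run on the Ascend device, while the rest fall back to the CPU kernels listed above:

```python
# Minimal sketch: choose an inference backend matching the columns above.
# Assumes the mindspore_lite Python package; "model.ms" is a placeholder path.
import mindspore_lite as mslite

context = mslite.Context()
# Targets are tried in order: operators supported on Ascend (FP16 column)
# run there; the remaining operators fall back to the CPU kernels.
context.target = ["ascend", "cpu"]
context.ascend.device_id = 0

model = mslite.Model()
model.build_from_file("model.ms", mslite.ModelType.MINDIR_LITE, context)

inputs = model.get_inputs()
# ... fill the input tensors with real data here ...
outputs = model.predict(inputs)
```

--
Gitee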