diff --git a/docs/lite/docs/source_en/reference/operator_list_lite.md b/docs/lite/docs/source_en/reference/operator_list_lite.md
index fb35b2144996994db5414254d8dd9e18bc6b887a..d0a0a1af0720ffa29f3495e2dd206e6e620fa916 100644
--- a/docs/lite/docs/source_en/reference/operator_list_lite.md
+++ b/docs/lite/docs/source_en/reference/operator_list_lite.md
@@ -2,203 +2,203 @@
[](https://gitee.com/mindspore/docs/blob/master/docs/lite/docs/source_en/reference/operator_list_lite.md)
-| Operator Names | Operator Functions | CPU | Kirin NPU | GPU (Mali/Adreno) |
-| ----------------------------------- | ------------------------------------------------------------ | --------------------------------------------------- | --------- | ----------------------- |
-| Abs | Element-wise calculate the absolute value | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| AbsGrad | Compute the gradient of the absolute value function | FP32 | - | - |
-| Activation | Activation functions | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| ActivationGrad | Calculate the gradient of a specific activation function | FP16<br>FP32 | - | - |
-| Adam | Execute a single parameter update step of the Adam optimizer | FP32 | - | - |
-| AddFusion | Element-wise addition computation | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int8 |
-| AdderFusion | Addition-based convolution operation | FP32 | - | - |
-| AddGrad | Compute the gradient of the addition operation | FP32 | - | - |
-| AddN | Perform element-wise addition on N input tensors of identical shape and data type. | FP16<br>FP32 | - | - |
-| Affine | Perform an affine transformation on the input tensor. | FP32 | - | - |
-| All | Determine whether all elements in the tensor are True (non-zero) along the specified dimension. | FP32 | - | - |
-| AllGather | Distributed collective communication operation | FP32 | - | - |
-| ApplyMomentum | Execute a single parameter update step of stochastic gradient descent with momentum. | FP32 | - | - |
-| Assert | Assertion | FP16<br>FP32<br>Bool | - | - |
-| Assign | Assign a value to a variable | FP32 | - | - |
-| ArgmaxFusion | Find the index of the maximum value in a given dimension | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| ArgminFusion | Find the index of the minimum value in a given dimension | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| AvgPoolFusion | Average pooling | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| AvgPoolGrad | Compute the gradients for the average pooling layer | FP16<br>FP32 | - | - |
-| BatchNorm | Batch normalization | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| BatchNormGrad | Compute the gradient of the batch normalization layer | FP16<br>FP32 | - | - |
-| BatchToSpace | Inverse operation of space-to-batch transformation | FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| BatchToSpaceND | ND universal version of BatchToSpace | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| BiasAdd | Add the bias vector to the input tensor | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| BiasAddGrad | Compute the gradient of the BiasAdd operation | FP16<br>FP32 | - | - |
-| BinaryCrossEntropy | Calculate the binary cross-entropy loss | FP32 | - | - |
-| BinaryCrossEntropyGrad | Calculate the gradient of the binary cross-entropy loss function | FP32 | - | - |
-| BroadcastTo | Expansion of dimensions | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| Call | Call a subgraph or function | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| Cast | Data type conversion | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32 |
-| Ceil | Round up to the nearest integer | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Clip | Restrict element ranges | FP32<br>Int32 | - | - |
-| Concat | Concatenate tensors | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 |
-| ConstantOfShape | Generate a tensor with the same shape as the input and fill it with the specified constant. | FP16<br>FP32<br>Int32 | - | - |
-| Conv2DFusion | 2D convolution | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Conv2DBackpropFilterFusion | Compute the gradient of the convolution kernel with respect to the ordinary convolution operation. | FP16<br>FP32 | - | - |
-| Conv2DBackpropInputFusion | Compute the gradient of the input data with respect to the standard convolution operation. | FP16<br>FP32 | - | - |
-| Conv2dTransposeFusion | Perform transposed convolution operations | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Cos | Element-wise cosine calculation | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Crop | Crop a specified region from an input image or feature map. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | - | - |
-| CropAndResize | Crop regions from the input image based on a set of bounding boxes, then resize each region to a uniform size. | FP32 | FP16 | - |
-| CumSum | Cumulative sum of elements | FP32<br>Int32 | - | - |
-| CustomExtractFeatures | Custom feature extraction operator | FP32 | - | - |
-| CustomNormalize | Custom normalization operator | FP32 | - | - |
-| CustomPredict | Custom prediction operator | FP32<br>Int32 | - | - |
-| DEConv2DGradFilter | Compute the gradient of the transposed convolution with respect to the convolution kernel. | FP32 | - | - |
-| DepthToSpace | Rearrange depth data into spatial dimensions | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| DetectionPostProcess | Post-processing of object detection | FP32<br>Int8<br>UInt8 | - | - |
-| DivFusion | Element-wise division | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| DivGrad | Compute the gradient of the division operation | FP32 | - | - |
-| Dropout | Randomly set some elements of the input tensor to zero. | FP16<br>FP32 | - | - |
-| DropoutGrad | Compute the gradient of the Dropout operation | FP16<br>FP32 | - | - |
-| DynamicQuant | Dynamically quantize floating-point tensors to uint8 type | FP32 | - | - |
-| Eltwise | Element-wise operations | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Elu | Activation function, applying exponential correction to negative inputs | FP16<br>FP32 | - | - |
-| Equal | Determine whether inputs are equal | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| EmbeddingLookupFusion | Optimized word embedding lookup, mapping integer indices to dense vectors | FP32 | - | - |
-| Erf | Error function | FP16<br>FP32 | - | - |
-| ExpFusion | Element-wise exponential | FP16<br>FP32 | - | FP16<br>FP32 |
-| ExpandDims | Insert a dimension of length 1 at the specified position | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 |
-| Fill | Generate a tensor filled with the specified constant. | FP16<br>FP32<br>Int32<br>Bool | - | FP16<br>FP32 |
-| Flatten | Flatten the input data by dimension | FP16<br>FP32<br>Int32 | - | - |
-| FlattenGrad | Compute the gradient of the Flatten operation | FP16<br>FP32 | - | - |
-| Floor | Round down to the nearest integer | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| FloorDiv | Element-wise division rounded down to the nearest integer | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 |
-| FloorMod | Element-wise modulo operation: the sign of the result matches that of the divisor. | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 |
-| FullConnection | Fully-connected layer | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| FusedBatchNorm | Standardize the input | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | - |
-| GatherNd | Collect elements from the input tensor at specified positions based on the index tensor. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32 |
-| Gather | Collect elements at specified index positions along a single dimension | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 |
-| GatherD | Collect elements from the input tensor based on the index tensor. | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| GLU | Gated linear unit activation function, which splits the input into two parts and multiplies them element-wise. | FP32 | - | - |
-| Greater | Perform element-wise comparison between two tensors, returning a logical result (True/False) indicating whether A > B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| GreaterEqual | Perform element-wise comparison between two tensors, returning a logical result (True/False) indicating whether A ≥ B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| GroupNormFusion | Group normalization for fusion optimization | FP32 | - | - |
-| GRU | Gated recurrent unit, a simplified LSTM | FP16<br>FP32 | - | - |
-| HashtableLookup | Hash table lookup | FP32<br>Int32 | - | - |
-| InstanceNorm | Instance normalization | FP16<br>FP32 | FP16 | - |
-| InvertPermutation | Invert a permutation of indices | FP16<br>FP32<br>Int32 | - | - |
-| IsFinite | Check whether each element in the tensor is finite (not inf/NaN) | FP32 | - | - |
-| L2NormalizeFusion | L2 normalization for fusion optimization | FP32<br>Int8<br>UInt8 | - | - |
-| LayerNormFusion | Layer normalization for fusion optimization | FP16<br>FP32<br>Int8 | - | FP16<br>FP32 |
-| LayerNormGrad | Compute layer normalization gradients | FP16<br>FP32 | - | - |
-| LeakyReLU | Leaky ReLU activation function, which assigns a small slope to negative inputs. | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Less | Perform element-wise comparison between two tensors, returning a logical result indicating whether A < B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| LessEqual | Perform element-wise comparison: A ≤ B, returning a Boolean tensor | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| LRN | Local response normalization | FP32 | - | - |
-| Log | Element-wise calculate the logarithm | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Log1p | Calculate log(1+X) | FP32 | - | - |
-| LogGrad | Calculate the gradient of the logarithmic function | FP16<br>FP32 | - | - |
-| LogicalAnd | Element-wise logical AND operation | FP16<br>FP32<br>Int32<br>Bool | FP16 | FP16<br>FP32 |
-| LogicalNot | Element-wise logical NOT operation | FP16<br>FP32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32 |
-| LogicalOr | Element-wise logical OR operation | FP16<br>FP32<br>Bool | FP16 | FP16<br>FP32 |
-| LogSoftmax | Perform a softmax operation on the input vector, then take the logarithm of the softmax result. | FP16<br>FP32 | - | - |
-| LshProjection | Locality-sensitive hash projection | FP32 | - | - |
-| LSTM | Long short-term memory network unit | FP16<br>FP32 | - | - |
-| LSTMGrad | Calculate the backward propagation gradient of the LSTM for the hidden state | FP32 | - | - |
-| LSTMGradData | Compute the backpropagation gradient of the LSTM for the input data | FP32 | - | - |
-| LSTMGradWeight | Calculate the backward propagation gradient of weights for the LSTM | FP32 | - | - |
-| MatMulFusion | Perform matrix multiplication on two inputs | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Maximum | Element-wise maximum | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 |
-| MaximumGrad | Calculate the gradient of the maximum value function | FP16<br>FP32 | - | - |
-| MaxPoolFusion | Maximum pooling | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| MaxPoolGrad | Compute the gradients for the max-pooling layer | FP16<br>FP32 | - | - |
-| Merge | Control-flow operator that merges two input streams, forwarding the available tensor. | FP16<br>FP32 | - | - |
-| Minimum | Element-wise minimum | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 |
-| MinimumGrad | Compute the gradient of the minimum value function | FP16<br>FP32 | - | - |
-| Mod | Return the remainder of the division operation | FP32<br>Int32 | - | - |
-| MulFusion | Element-wise multiplication | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| MulGrad | Compute the gradient of the multiplication operation | FP32 | - | - |
-| Neg | Element-wise negation | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 |
-| NegGrad | Compute the gradient of the negation operation | FP16<br>FP32 | - | - |
-| NLLLoss | Compute the negative log-likelihood loss | FP32 | - | - |
-| NLLLossGrad | Compute the gradient of NLLLoss | FP32 | - | - |
-| NotEqual | Perform element-wise comparison between two tensors, returning a logical result indicating whether A != B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| NonMaxSuppression | Non-maximum suppression | FP32 | - | - |
-| NonZero | Return the indices of all non-zero elements in the input tensor. | Bool | - | - |
-| OneHot | Convert integer index tensors to one-hot encoding representations | FP16<br>FP32<br>Int32 | - | FP16<br>FP32<br>Int32 |
-| OnesLike | Create a new tensor with the exact same shape as the input tensor X, but with all element values set to 1. | FP16<br>FP32<br>Int32 | - | - |
-| PadFusion | Add specified padding to the input tensor to achieve the desired size. | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| PartialFusion | Partial fusion | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| PowFusion | Element-wise power | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| PowerGrad | Compute the gradient of the power operation | FP32 | - | - |
-| PriorBox | Generate prior boxes | FP32<br>Int8<br>UInt8 | - | - |
-| PReLUFusion | PReLU activation function | FP16<br>FP32 | - | FP16<br>FP32 |
-| QuantDTypeCast | Perform quantized data type conversion | FP16<br>FP32<br>Int8<br>UInt8 | - | - |
-| RaggedRange | Generate sequences with non-uniform intervals | FP16<br>FP32<br>Int32 | - | - |
-| RandomNormal | Generate a tensor whose values are randomly sampled from a normal distribution | FP16<br>FP32 | - | - |
-| RandomStandardNormal | Generate a random tensor following a standard normal distribution | FP16<br>FP32 | - | - |
-| Range | Generate elements within a specified range | FP16<br>FP32<br>Int32 | - | - |
-| Rank | Return the number of dimensions in the input tensor | FP16<br>FP32 | - | - |
-| RealDiv | Element-wise division | FP16<br>FP32 | - | - |
-| Reciprocal | Return the reciprocal | FP16<br>FP32<br>Int8 | FP16 | - |
-| ReduceFusion | Reduction operation | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32 |
-| ReduceScatter | Distributed operation: input tensors are segmented and distributed across devices, with each device retaining only one segment of the results. | FP32 | - | - |
-| Reshape | Change the shape of a tensor while keeping the total number of elements unchanged | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 |
-| Resize | Upsample or resize the input tensor | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| ResizeGrad | Compute the gradient for Resize | FP16<br>FP32 | - | - |
-| ReverseV2 | Reverse the tensor along the specified axis | FP32<br>Int32 | - | - |
-| ReverseSequence | Partially reverse the variable-length sequences of the input tensor. | FP32 | - | - |
-| ROIPooling | Region-of-interest pooling | FP32 | - | - |
-| Round | Round to the nearest whole number | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Rsqrt | Element-wise compute the reciprocal of the square root, used for normalization. | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| RsqrtGrad | Calculate the gradient of the reciprocal of the square root | FP32 | - | - |
-| Select | Select elements from two tensors based on conditions | FP32<br>Bool | - | - |
-| Selu | Self-normalizing exponential linear unit activation function | - | - | - |
-| ScaleFusion | Fuse scaling operations with adjacent operators | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| ScatterNd | Scatter values from the update tensor to specified positions in the output tensor based on the index. | FP16<br>FP32<br>Int32 | - | - |
-| ScatterNdUpdate | Update the value of the input data using the given value and the input index. | FP16<br>FP32<br>Int32 | - | - |
-| SGD | Stochastic gradient descent optimizer | FP32 | - | - |
-| Shape | Obtain the tensor shape | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32 |
-| SigmoidCrossEntropyWithLogits | Combine Sigmoid activation and cross-entropy loss | FP32 | - | - |
-| SigmoidCrossEntropyWithLogitsGrad | Compute the gradient of the cross-entropy loss with sigmoid | FP32 | - | - |
-| Sin | Element-wise calculation of sine | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Size | Obtain tensor dimension size | FP16<br>FP32<br>Int32 | - | - |
-| SliceFusion | Tensor slicing operation | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SkipGram | The core operation of the Skip-gram model, used for training word vectors | FP32 | - | - |
-| SmoothL1Loss | Smooth L1 loss | FP32 | - | - |
-| SmoothL1LossGrad | Compute the gradient of the smooth L1 loss | FP32 | - | - |
-| Softmax | Normalization operation | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SoftmaxGrad | Calculate the gradient of Softmax | FP32 | - | - |
-| Softplus | Smooth variant of ReLU | FP16<br>FP32 | - | - |
-| SpaceToBatch | Move blocks of spatial data into the batch dimension. | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| SpaceToBatchND | Split spatial-dimensional data blocks into batch dimensions | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| SpaceToDepth | Reorganize spatial data into depth channels | FP16<br>FP32 | - | FP16<br>FP32 |
-| SparseToDense | Convert sparse representations to dense tensors | FP16<br>FP32<br>Int32 | - | FP16<br>FP32<br>Int32 |
-| SparseSoftmaxCrossEntropyWithLogits | Softmax cross-entropy for sparse labels | FP32 | - | - |
-| Splice | Connect multiple slices or ranges of the input tensor along the specified axis. | FP16<br>FP32 | - | - |
-| Split | Split the input tensor into multiple smaller output tensors along the specified axis. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SplitWithOverlap | Split a tensor with overlap | FP16<br>FP32 | - | - |
-| Sqrt | Element-wise take the square root | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SqrtGrad | Calculate the gradient of the square root | FP32 | - | - |
-| Square | Element-wise square | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SquaredDifference | Element-wise compute (A-B)² | FP16<br>FP32 | - | FP16<br>FP32 |
-| Squeeze | Remove dimensions of size 1 | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32<br>Int32 |
-| StridedSlice | Tensor slicing | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| StridedSliceGrad | Compute the gradient of the slice operation | FP16<br>FP32 | - | - |
-| Stack | Stack multiple tensors along a new axis | FP16<br>FP32<br>Int32 | - | FP16<br>FP32 |
-| SubFusion | Element-wise subtraction | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SubGrad | Calculate the gradient of subtraction | FP32 | - | - |
-| Switch | Select output branches based on Boolean conditions | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| SwitchLayer | Select different subnetwork branches for execution within the model | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| TensorListFromTensor | Convert a regular tensor into a list of tensors, splitting along the specified axis. | FP16<br>FP32<br>Int32 | - | - |
-| TensorListGetItem | Retrieve the tensor at the specified index position from the tensor list | FP16<br>FP32<br>Int32 | - | - |
-| TensorListReserve | Preallocate an empty tensor list, specifying the element data type and initial capacity. | FP16<br>FP32<br>Int32 | - | - |
-| TensorListSetItem | Insert a tensor at a specified position in a list of tensors | FP16<br>FP32<br>Int32 | - | - |
-| TensorListStack | Stack the list of tensors into a single regular tensor | FP16<br>FP32<br>Int32 | - | - |
-| TensorScatterAdd | Add the updated tensor values to the specified positions in the target tensor using the index. | FP32<br>Int32 | - | - |
-| TileFusion | Tile the given matrix | FP16<br>FP32<br>Int32<br>Bool | FP16 | - |
-| TopKFusion | Return the top K elements from the input tensor. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | - | - |
-| Transpose | Tensor transpose | FP16<br>FP32<br>Int32<br>Int8<br>Bool | FP16 | FP16<br>FP32 |
-| UniformReal | Generate a random tensor following a uniform distribution | FP32<br>Int32 | - | - |
-| Unique | Return the unique values in the input tensor, along with their indices and counts. | FP16<br>FP32<br>Int32 | - | - |
-| UnsortedSegmentSum | Perform segmented summation on the tensor without requiring ordered segment indices. | FP16<br>FP32<br>Int32 | - | - |
-| Unsqueeze | Add a new dimension to the input tensor | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 |
-| Unstack | Split a tensor into multiple sub-tensors along a specified axis | FP16<br>FP32<br>Int32 | - | - |
-| Where | Element selection | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| ZerosLike | Generate a new tensor with the same shape as the input tensor but with all elements set to zero. | FP16<br>FP32<br>Int32 | - | - |
+| Operator Names | Operator Functions | CPU | Kirin NPU | GPU (Mali/Adreno) | Ascend |
+| ----------------------------------- | ------------------------------------------------------------ | --------------------------------------------------- | --------- | ----------------------- | ----------------------- |
+| Abs | Element-wise calculate the absolute value | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| AbsGrad | Compute the gradient of the absolute value function | FP32 | - | - | |
+| Activation | Activation functions | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| ActivationGrad | Calculate the gradient of a specific activation function | FP16<br>FP32 | - | - | |
+| Adam | Execute a single parameter update step of the Adam optimizer | FP32 | - | - | |
+| AddFusion | Element-wise addition computation | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int8 | FP16 |
+| AdderFusion | Addition-based convolution operation | FP32 | - | - | |
+| AddGrad | Compute the gradient of the addition operation | FP32 | - | - | |
+| AddN | Perform element-wise addition on N input tensors of identical shape and data type. | FP16<br>FP32 | - | - | |
+| Affine | Perform an affine transformation on the input tensor. | FP32 | - | - | FP16 |
+| All | Determine whether all elements in the tensor are True (non-zero) along the specified dimension. | FP32 | - | - | |
+| AllGather | Distributed collective communication operation | FP32 | - | - | |
+| ApplyMomentum | Execute a single parameter update step of stochastic gradient descent with momentum. | FP32 | - | - | FP16 |
+| Assert | Assertion | FP16<br>FP32<br>Bool | - | - | |
+| Assign | Assign a value to a variable | FP32 | - | - | FP16 |
+| ArgmaxFusion | Find the index of the maximum value in a given dimension | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| ArgminFusion | Find the index of the minimum value in a given dimension | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | FP16 |
+| AvgPoolFusion | Average pooling | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| AvgPoolGrad | Compute the gradients for the average pooling layer | FP16<br>FP32 | - | - | |
+| BatchNorm | Batch normalization | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | FP16 |
+| BatchNormGrad | Compute the gradient of the batch normalization layer | FP16<br>FP32 | - | - | |
+| BatchToSpace | Inverse operation of space-to-batch transformation | FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | |
+| BatchToSpaceND | ND universal version of BatchToSpace | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | |
+| BiasAdd | Add the bias vector to the input tensor | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | FP16 |
+| BiasAddGrad | Compute the gradient of the BiasAdd operation | FP16<br>FP32 | - | - | |
+| BinaryCrossEntropy | Calculate the binary cross-entropy loss | FP32 | - | - | FP16 |
+| BinaryCrossEntropyGrad | Calculate the gradient of the binary cross-entropy loss function | FP32 | - | - | |
+| BroadcastTo | Expansion of dimensions | FP16<br>FP32<br>Int32<br>Bool | - | - | |
+| Call | Call a subgraph or function | FP16<br>FP32<br>Int32<br>Bool | - | - | FP16 |
+| Cast | Data type conversion | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32 | FP16 |
+| Ceil | Round up to the nearest integer | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Clip | Restrict element ranges | FP32<br>Int32 | - | - | FP16 |
+| Concat | Concatenate tensors | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 | FP16 |
+| ConstantOfShape | Generate a tensor with the same shape as the input and fill it with the specified constant. | FP16<br>FP32<br>Int32 | - | - | |
+| Conv2DFusion | 2D convolution | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Conv2DBackpropFilterFusion | Compute the gradient of the convolution kernel with respect to the ordinary convolution operation. | FP16<br>FP32 | - | - | |
+| Conv2DBackpropInputFusion | Compute the gradient of the input data with respect to the standard convolution operation. | FP16<br>FP32 | - | - | |
+| Conv2dTransposeFusion | Perform transposed convolution operations | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Cos | Element-wise cosine calculation | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Crop | Crop a specified region from an input image or feature map. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | - | - | |
+| CropAndResize | Crop regions from the input image based on a set of bounding boxes, then resize each region to a uniform size. | FP32 | FP16 | - | |
+| CumSum | Cumulative sum of elements | FP32<br>Int32 | - | - | FP16 |
+| CustomExtractFeatures | Custom feature extraction operator | FP32 | - | - | |
+| CustomNormalize | Custom normalization operator | FP32 | - | - | |
+| CustomPredict | Custom prediction operator | FP32<br>Int32 | - | - | |
+| DEConv2DGradFilter | Compute the gradient of the transposed convolution with respect to the convolution kernel. | FP32 | - | - | |
+| DepthToSpace | Rearrange depth data into spatial dimensions | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | |
+| DetectionPostProcess | Post-processing of object detection | FP32<br>Int8<br>UInt8 | - | - | |
+| DivFusion | Element-wise division | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| DivGrad | Compute the gradient of the division operation | FP32 | - | - | |
+| Dropout | Randomly set some elements of the input tensor to zero. | FP16<br>FP32 | - | - | FP16 |
+| DropoutGrad | Compute the gradient of the Dropout operation | FP16<br>FP32 | - | - | |
+| DynamicQuant | Dynamically quantize floating-point tensors to uint8 type | FP32 | - | - | |
+| Eltwise | Element-wise operations | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Elu | Activation function, applying exponential correction to negative inputs | FP16<br>FP32 | - | - | FP16 |
+| Equal | Determine whether inputs are equal | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| EmbeddingLookupFusion | Optimized word embedding lookup, mapping integer indices to dense vectors | FP32 | - | - | |
+| Erf | Error function | FP16<br>FP32 | - | - | FP16 |
+| ExpFusion | Element-wise exponential | FP16<br>FP32 | - | FP16<br>FP32 | FP16 |
+| ExpandDims | Insert a dimension of length 1 at the specified position | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 | FP16 |
+| Fill | Generate a tensor filled with the specified constant. | FP16<br>FP32<br>Int32<br>Bool | - | FP16<br>FP32 | FP16 |
+| Flatten | Flatten the input data by dimension | FP16<br>FP32<br>Int32 | - | - | FP16 |
+| FlattenGrad | Compute the gradient of the Flatten operation | FP16<br>FP32 | - | - | |
+| Floor | Round down to the nearest integer | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| FloorDiv | Element-wise division rounded down to the nearest integer | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 | |
+| FloorMod | Element-wise modulo operation: the sign of the result matches that of the divisor. | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 | |
+| FullConnection | Fully-connected layer | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| FusedBatchNorm | Standardize the input | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | - | FP16 |
+| GatherNd | Collect elements from the input tensor at specified positions based on the index tensor. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32 | FP16 |
+| Gather | Collect elements at specified index positions along a single dimension | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 | FP16 |
+| GatherD | Collect elements from the input tensor based on the index tensor. | FP16<br>FP32<br>Int32<br>Bool | - | - | FP16 |
+| GLU | Gated linear unit activation function, which splits the input into two parts and multiplies them element-wise. | FP32 | - | - | |
+| Greater | Perform element-wise comparison between two tensors, returning a logical result (True/False) indicating whether A > B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| GreaterEqual | Perform element-wise comparison between two tensors, returning a logical result (True/False) indicating whether A ≥ B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| GroupNormFusion | Group normalization for fusion optimization | FP32 | - | - | |
+| GRU | Gated recurrent unit, a simplified LSTM | FP16<br>FP32 | - | - | |
+| HashtableLookup | Hash table lookup | FP32<br>Int32 | - | - | |
+| InstanceNorm | Instance normalization | FP16<br>FP32 | FP16 | - | FP16 |
+| InvertPermutation | Invert a permutation of indices | FP16<br>FP32<br>Int32 | - | - | |
+| IsFinite | Check whether each element in the tensor is finite (not inf/NaN) | FP32 | - | - | FP16 |
+| L2NormalizeFusion | L2 normalization for fusion optimization | FP32<br>Int8<br>UInt8 | - | - | |
+| LayerNormFusion | Layer normalization for fusion optimization | FP16<br>FP32<br>Int8 | - | FP16<br>FP32 | FP16 |
+| LayerNormGrad | Compute layer normalization gradients | FP16<br>FP32 | - | - | |
+| LeakyReLU | Leaky ReLU activation function, which assigns a small slope to negative inputs. | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Less | Perform element-wise comparison between two tensors, returning a logical result indicating whether A < B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| LessEqual | Perform element-wise comparison: A ≤ B, returning a Boolean tensor | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| LRN | Local response normalization | FP32 | - | - | FP16 |
+| Log | Element-wise calculate the logarithm | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Log1p | Calculate log(1+X) | FP32 | - | - | FP16 |
+| LogGrad | Calculate the gradient of the logarithmic function | FP16<br>FP32 | - | - | |
+| LogicalAnd | Element-wise logical AND operation | FP16<br>FP32<br>Int32<br>Bool | FP16 | FP16<br>FP32 | |
+| LogicalNot | Element-wise logical NOT operation | FP16<br>FP32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32 | |
+| LogicalOr | Element-wise logical OR operation | FP16<br>FP32<br>Bool | FP16 | FP16<br>FP32 | |
+| LogSoftmax | Perform a softmax operation on the input vector, then take the logarithm of the softmax result. | FP16<br>FP32 | - | - | FP16 |
+| LshProjection | Locality-sensitive hash projection | FP32 | - | - | |
+| LSTM | Long short-term memory network unit | FP16<br>FP32 | - | - | |
+| LSTMGrad | Calculate the backward propagation gradient of the LSTM for the hidden state | FP32 | - | - | |
+| LSTMGradData | Compute the backpropagation gradient of the LSTM for the input data | FP32 | - | - | |
+| LSTMGradWeight | Calculate the backward propagation gradient of weights for the LSTM | FP32 | - | - | |
+| MatMulFusion | Perform matrix multiplication on two inputs | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Maximum | Element-wise maximum | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 | FP16 |
+| MaximumGrad | Calculate the gradient of the maximum value function | FP16<br>FP32 | - | - | |
+| MaxPoolFusion | Maximum pooling | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| MaxPoolGrad | Compute the gradients for the max-pooling layer | FP16<br>FP32 | - | - | |
+| Merge | Control-flow operator that merges two input streams, forwarding the available tensor. | FP16<br>FP32 | - | - | |
+| Minimum | Element-wise minimum | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 | FP16 |
+| MinimumGrad | Compute the gradient of the minimum value function | FP16<br>FP32 | - | - | |
+| Mod | Return the remainder of the division operation | FP32<br>Int32 | - | - | FP16 |
+| MulFusion | Element-wise multiplication | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| MulGrad | Compute the gradient of the multiplication operation | FP32 | - | - | |
+| Neg | Element-wise negation | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 | FP16 |
+| NegGrad | Compute the gradient of the negation operation | FP16<br>FP32 | - | - | |
+| NLLLoss | Compute the negative log-likelihood loss | FP32 | - | - | FP16 |
+| NLLLossGrad | Compute the gradient of NLLLoss | FP32 | - | - | |
+| NotEqual | Perform element-wise comparison between two tensors, returning a logical result indicating whether A != B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | |
+| NonMaxSuppression | Non-maximum suppression | FP32 | - | - | FP16 |
+| NonZero | Return the indices of all non-zero elements in the input tensor. | Bool | - | - | FP16 |
+| OneHot | Convert integer index tensors to one-hot encoding representations | FP16<br>FP32<br>Int32 | - | FP16<br>FP32<br>Int32 | |
+| OnesLike | Create a new tensor with the exact same shape as the input tensor X, but with all element values set to 1. | FP16<br>FP32<br>Int32 | - | - | FP16 |
+| PadFusion | Add specified padding to the input tensor to achieve the desired size. | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| PartialFusion | Partial fusion | FP16<br>FP32<br>Int32<br>Bool | - | - | |
+| PowFusion | Element-wise power | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | FP16 |
+| PowerGrad | Compute the gradient of the power operation | FP32 | - | - | |
+| PriorBox | Generate prior boxes | FP32<br>Int8<br>UInt8 | - | - | FP16 |
+| PReLUFusion | PReLU activation function | FP16<br>FP32 | - | FP16<br>FP32 | FP16 |
+| QuantDTypeCast | Perform quantized data type conversion | FP16<br>FP32<br>Int8<br>UInt8 | - | - | |
+| RaggedRange | Generate sequences with non-uniform intervals | FP16<br>FP32<br>Int32 | - | - | |
+| RandomNormal | Generate a tensor whose values are randomly sampled from a normal distribution | FP16<br>FP32 | - | - | |
+| RandomStandardNormal | Generate a random tensor following a standard normal distribution | FP16<br>FP32 | - | - | |
+| Range | Generate elements within a specified range | FP16<br>FP32<br>Int32 | - | - | FP16 |
+| Rank | Return the number of dimensions in the input tensor | FP16<br>FP32 | - | - | |
+| RealDiv | Element-wise division | FP16<br>FP32 | - | - | FP16 |
+| Reciprocal | Return the reciprocal | FP16<br>FP32<br>Int8 | FP16 | - | FP16 |
+| ReduceFusion | Reduction operation | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32 | FP16 |
+| ReduceScatter | Distributed operation: input tensors are segmented and distributed across devices, with each device retaining only one segment of the results. | FP32 | - | - | |
+| Reshape | Change the shape of a tensor while keeping the total number of elements unchanged | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 | FP16 |
+| Resize | Upsample or resize the input tensor | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | |
+| ResizeGrad | Compute the gradient for Resize | FP16<br>FP32 | - | - | |
+| ReverseV2 | Reverse the tensor along the specified axis | FP32<br>Int32 | - | - | |
+| ReverseSequence | Partially reverse the variable-length sequences of the input tensor. | FP32 | - | - | FP16 |
+| ROIPooling | Region-of-interest pooling | FP32 | - | - | FP16 |
+| Round | Round to the nearest whole number | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Rsqrt | Element-wise compute the reciprocal of the square root, used for normalization. | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | |
+| RsqrtGrad | Calculate the gradient of the reciprocal of the square root | FP32 | - | - | |
+| Select | Select elements from two tensors based on conditions | FP32<br>Bool | - | - | |
+| Selu | Self-normalizing exponential linear unit activation function | - | - | - | |
+| ScaleFusion | Fuse scaling operations with adjacent operators | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| ScatterNd | Scatter values from the update tensor to specified positions in the output tensor based on the index. | FP16<br>FP32<br>Int32 | - | - | FP16 |
+| ScatterNdUpdate | Update the value of the input data using the given value and the input index. | FP16<br>FP32<br>Int32 | - | - | |
+| SGD | Stochastic gradient descent optimizer | FP32 | - | - | FP16 |
+| Shape | Obtain the tensor shape | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32 | FP16 |
+| SigmoidCrossEntropyWithLogits | Combine Sigmoid activation and cross-entropy loss | FP32 | - | - | FP16 |
+| SigmoidCrossEntropyWithLogitsGrad | Compute the gradient of the cross-entropy loss with sigmoid | FP32 | - | - | FP16 |
+| Sin | Element-wise calculation of sine | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| Size | Obtain tensor dimension size | FP16<br>FP32<br>Int32 | - | - | FP16 |
+| SliceFusion | Tensor slicing operation | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SkipGram | The core operation of the Skip-gram model, used for training word vectors | FP32 | - | - | |
+| SmoothL1Loss | Smooth L1 loss | FP32 | - | - | FP16 |
+| SmoothL1LossGrad | Compute the gradient of the smooth L1 loss | FP32 | - | - | |
+| Softmax | Normalization operation | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SoftmaxGrad | Calculate the gradient of Softmax | FP32 | - | - | |
+| Softplus | Smooth variant of ReLU | FP16<br>FP32 | - | - | FP16 |
+| SpaceToBatch | Move blocks of spatial data into the batch dimension. | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | FP16 |
+| SpaceToBatchND | Split spatial-dimensional data blocks into batch dimensions | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 | |
+| SpaceToDepth | Reorganize spatial data into depth channels | FP16<br>FP32 | - | FP16<br>FP32 | |
+| SparseToDense | Convert sparse representations to dense tensors | FP16<br>FP32<br>Int32 | - | FP16<br>FP32<br>Int32 | |
+| SparseSoftmaxCrossEntropyWithLogits | Softmax cross-entropy for sparse labels | FP32 | - | - | FP16 |
+| Splice | Connect multiple slices or ranges of the input tensor along the specified axis. | FP16<br>FP32 | - | - | |
+| Split | Split the input tensor into multiple smaller output tensors along the specified axis. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SplitWithOverlap | Split a tensor with overlap | FP16<br>FP32 | - | - | |
+| Sqrt | Element-wise take the square root | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SqrtGrad | Calculate the gradient of the square root | FP32 | - | - | |
+| Square | Element-wise square | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SquaredDifference | Element-wise compute (A-B)² | FP16<br>FP32 | - | FP16<br>FP32 | |
+| Squeeze | Remove dimensions of size 1 | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32<br>Int32 | |
+| StridedSlice | Tensor slicing | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| StridedSliceGrad | Compute the gradient of the slice operation | FP16<br>FP32 | - | - | |
+| Stack | Stack multiple tensors along a new axis | FP16<br>FP32<br>Int32 | - | FP16<br>FP32 | FP16 |
+| SubFusion | Element-wise subtraction | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 | FP16 |
+| SubGrad | Calculate the gradient of subtraction | FP32 | - | - | |
+| Switch | Select output branches based on Boolean conditions | FP16<br>FP32<br>Int32<br>Bool | - | - | |
+| SwitchLayer | Select different subnetwork branches for execution within the model | FP16<br>FP32<br>Int32<br>Bool | - | - | |
+| TensorListFromTensor | Convert a regular tensor into a list of tensors, splitting along the specified axis. | FP16<br>FP32<br>Int32 | - | - | |
+| TensorListGetItem | Retrieve the tensor at the specified index position from the tensor list | FP16<br>FP32<br>Int32 | - | - | |
+| TensorListReserve | Preallocate an empty tensor list, specifying the element data type and initial capacity. | FP16<br>FP32<br>Int32 | - | - | |
+| TensorListSetItem | Insert a tensor at a specified position in a list of tensors | FP16<br>FP32<br>Int32 | - | - | |
+| TensorListStack | Stack the list of tensors into a single regular tensor | FP16<br>FP32<br>Int32 | - | - | |
+| TensorScatterAdd | Add the updated tensor values to the specified positions in the target tensor using the index. | FP32<br>Int32 | - | - | |
+| TileFusion | Tile the given matrix | FP16<br>FP32<br>Int32<br>Bool | FP16 | - | FP16 |
+| TopKFusion | Return the top K elements from the input tensor. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | - | - | FP16 |
+| Transpose | Tensor transpose | FP16<br>FP32<br>Int32<br>Int8<br>Bool | FP16 | FP16<br>FP32 | FP16 |
+| UniformReal | Generate a random tensor following a uniform distribution | FP32<br>Int32 | - | - | |
+| Unique | Return the unique values in the input tensor, along with their indices and counts. | FP16<br>FP32<br>Int32 | - | - | |
+| UnsortedSegmentSum | Perform segmented summation on the tensor without requiring ordered segment indices. | FP16<br>FP32<br>Int32 | - | - | |
+| Unsqueeze | Add a new dimension to the input tensor | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 | |
+| Unstack | Split a tensor into multiple sub-tensors along a specified axis | FP16<br>FP32<br>Int32 | - | - | |
+| Where | Element selection | FP16<br>FP32<br>Int32<br>Bool | - | - | |
+| ZerosLike | Generate a new tensor with the same shape as the input tensor but with all elements set to zero. | FP16<br>FP32<br>Int32 | - | - | |
diff --git a/docs/lite/docs/source_zh_cn/reference/operator_list_lite.md b/docs/lite/docs/source_zh_cn/reference/operator_list_lite.md
index d984a13f4309d545e4e22ecc6be3bd9c1ad2babc..933c3beb4f7189c9bc113011f4add0a64b80f5e0 100644
--- a/docs/lite/docs/source_zh_cn/reference/operator_list_lite.md
+++ b/docs/lite/docs/source_zh_cn/reference/operator_list_lite.md
@@ -2,203 +2,203 @@
[](https://gitee.com/mindspore/docs/blob/master/docs/lite/docs/source_zh_cn/reference/operator_list_lite.md)
-| Operator Names | Operator Functions | CPU | Kirin NPU | GPU (Mali/Adreno) |
-| ----------------------------------- | ------------------------------------------------------------ | --------------------------------------------------- | --------- | ----------------------- |
-| Abs | Element-wise calculate the absolute value | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| AbsGrad | Compute the gradient of the absolute value function | FP32 | - | - |
-| Activation | Activation functions | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| ActivationGrad | Calculate the gradient of a specific activation function | FP16<br>FP32 | - | - |
-| Adam | Execute a single parameter update step of the Adam optimizer | FP32 | - | - |
-| AddFusion | Element-wise addition computation | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int8 |
-| AdderFusion | Addition-based convolution operation | FP32 | - | - |
-| AddGrad | Compute the gradient of the addition operation | FP32 | - | - |
-| AddN | Perform element-wise addition on N input tensors of identical shape and data type. | FP16<br>FP32 | - | - |
-| Affine | Perform an affine transformation on the input tensor. | FP32 | - | - |
-| All | Determine whether all elements in the tensor are True (non-zero) along the specified dimension. | FP32 | - | - |
-| AllGather | Distributed collective communication operation | FP32 | - | - |
-| ApplyMomentum | Execute a single parameter update step of stochastic gradient descent with momentum. | FP32 | - | - |
-| Assert | Assertion | FP16<br>FP32<br>Bool | - | - |
-| Assign | Assign a value to a variable | FP32 | - | - |
-| ArgmaxFusion | Find the index of the maximum value in a given dimension | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| ArgminFusion | Find the index of the minimum value in a given dimension | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| AvgPoolFusion | Average pooling | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| AvgPoolGrad | Compute the gradients for the average pooling layer | FP16<br>FP32 | - | - |
-| BatchNorm | Batch normalization | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| BatchNormGrad | Compute the gradient of the batch normalization layer | FP16<br>FP32 | - | - |
-| BatchToSpace | Inverse operation of space-to-batch transformation | FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| BatchToSpaceND | ND universal version of BatchToSpace | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| BiasAdd | Add the bias vector to the input tensor | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| BiasAddGrad | Compute the gradient of the BiasAdd operation | FP16<br>FP32 | - | - |
-| BinaryCrossEntropy | Calculate the binary cross-entropy loss | FP32 | - | - |
-| BinaryCrossEntropyGrad | Calculate the gradient of the binary cross-entropy loss function | FP32 | - | - |
-| BroadcastTo | Expansion of dimensions | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| Call | Call a subgraph or function | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| Cast | Data type conversion | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32 |
-| Ceil | Round up to the nearest integer | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Clip | Restrict element ranges | FP32<br>Int32 | - | - |
-| Concat | Concatenate tensors | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 |
-| ConstantOfShape | Generate a tensor with the same shape as the input and fill it with the specified constant. | FP16<br>FP32<br>Int32 | - | - |
-| Conv2DFusion | 2D convolution | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Conv2DBackpropFilterFusion | Compute the gradient of the convolution kernel with respect to the ordinary convolution operation. | FP16<br>FP32 | - | - |
-| Conv2DBackpropInputFusion | Compute the gradient of the input data with respect to the standard convolution operation. | FP16<br>FP32 | - | - |
-| Conv2dTransposeFusion | Perform transposed convolution operations | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Cos | Element-wise cosine calculation | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Crop | Crop a specified region from an input image or feature map. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | - | - |
-| CropAndResize | Crop regions from the input image based on a set of bounding boxes, then resize each region to a uniform size. | FP32 | FP16 | - |
-| CumSum | Cumulative sum of elements | FP32<br>Int32 | - | - |
-| CustomExtractFeatures | Custom feature extraction operator | FP32 | - | - |
-| CustomNormalize | Custom normalization operator | FP32 | - | - |
-| CustomPredict | Custom prediction operator | FP32<br>Int32 | - | - |
-| DEConv2DGradFilter | Compute the gradient of the transposed convolution with respect to the convolution kernel. | FP32 | - | - |
-| DepthToSpace | Rearrange depth data into spatial dimensions | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| DetectionPostProcess | Post-processing of object detection | FP32<br>Int8<br>UInt8 | - | - |
-| DivFusion | Element-wise division | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| DivGrad | Compute the gradient of the division operation | FP32 | - | - |
-| Dropout | Randomly set some elements of the input tensor to zero. | FP16<br>FP32 | - | - |
-| DropoutGrad | Compute the gradient of the Dropout operation | FP16<br>FP32 | - | - |
-| DynamicQuant | Dynamically quantize floating-point tensors to uint8 type | FP32 | - | - |
-| Eltwise | Element-wise operations | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Elu | Activation function, applying exponential correction to negative inputs | FP16<br>FP32 | - | - |
-| Equal | Determine whether inputs are equal | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| EmbeddingLookupFusion | Optimized word embedding lookup, mapping integer indices to dense vectors | FP32 | - | - |
-| Erf | Error function | FP16<br>FP32 | - | - |
-| ExpFusion | Element-wise exponential | FP16<br>FP32 | - | FP16<br>FP32 |
-| ExpandDims | Insert a dimension of length 1 at the specified position | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 |
-| Fill | Generate a tensor filled with the specified constant. | FP16<br>FP32<br>Int32<br>Bool | - | FP16<br>FP32 |
-| Flatten | Flatten the input data by dimension | FP16<br>FP32<br>Int32 | - | - |
-| FlattenGrad | Compute the gradient of the Flatten operation | FP16<br>FP32 | - | - |
-| Floor | Round down to the nearest integer | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| FloorDiv | Element-wise division rounded down to the nearest integer | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 |
-| FloorMod | Element-wise modulo operation: the sign of the result matches that of the divisor. | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 |
-| FullConnection | Fully-connected layer | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| FusedBatchNorm | Standardize the input | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | - |
-| GatherNd | Collect elements from the input tensor at specified positions based on the index tensor. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32 |
-| Gather | Collect elements at specified index positions along a single dimension | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 |
-| GatherD | Collect elements from the input tensor based on the index tensor. | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| GLU | Gated linear unit activation function, which splits the input into two parts and multiplies them element-wise. | FP32 | - | - |
-| Greater | Perform element-wise comparison between two tensors, returning a logical result (True/False) indicating whether A > B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| GreaterEqual | Perform element-wise comparison between two tensors, returning a logical result (True/False) indicating whether A ≥ B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| GroupNormFusion | Group normalization for fusion optimization | FP32 | - | - |
-| GRU | Gated recurrent unit, a simplified LSTM | FP16<br>FP32 | - | - |
-| HashtableLookup | Hash table lookup | FP32<br>Int32 | - | - |
-| InstanceNorm | Instance normalization | FP16<br>FP32 | FP16 | - |
-| InvertPermutation | Invert a permutation of indices | FP16<br>FP32<br>Int32 | - | - |
-| IsFinite | Check whether each element in the tensor is finite (not inf/NaN) | FP32 | - | - |
-| L2NormalizeFusion | L2 normalization for fusion optimization | FP32<br>Int8<br>UInt8 | - | - |
-| LayerNormFusion | Layer normalization for fusion optimization | FP16<br>FP32<br>Int8 | - | FP16<br>FP32 |
-| LayerNormGrad | Compute layer normalization gradients | FP16<br>FP32 | - | - |
-| LeakyReLU | Leaky ReLU activation function, which assigns a small slope to negative inputs. | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Less | Perform element-wise comparison between two tensors, returning a logical result indicating whether A < B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| LessEqual | Perform element-wise comparison: A ≤ B, returning a Boolean tensor | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| LRN | Local response normalization | FP32 | - | - |
-| Log | Element-wise calculate the logarithm | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Log1p | Calculate log(1+X) | FP32 | - | - |
-| LogGrad | Calculate the gradient of the logarithmic function | FP16<br>FP32 | - | - |
-| LogicalAnd | Element-wise logical AND operation | FP16<br>FP32<br>Int32<br>Bool | FP16 | FP16<br>FP32 |
-| LogicalNot | Element-wise logical NOT operation | FP16<br>FP32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32 |
-| LogicalOr | Element-wise logical OR operation | FP16<br>FP32<br>Bool | FP16 | FP16<br>FP32 |
-| LogSoftmax | Perform a softmax operation on the input vector, then take the logarithm of the softmax result. | FP16<br>FP32 | - | - |
-| LshProjection | Locality-sensitive hash projection | FP32 | - | - |
-| LSTM | Long short-term memory network unit | FP16<br>FP32 | - | - |
-| LSTMGrad | Calculate the backward propagation gradient of the LSTM for the hidden state | FP32 | - | - |
-| LSTMGradData | Compute the backpropagation gradient of the LSTM for the input data | FP32 | - | - |
-| LSTMGradWeight | Calculate the backward propagation gradient of weights for the LSTM | FP32 | - | - |
-| MatMulFusion | Perform matrix multiplication on two inputs | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Maximum | Element-wise maximum | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 |
-| MaximumGrad | Calculate the gradient of the maximum value function | FP16<br>FP32 | - | - |
-| MaxPoolFusion | Maximum pooling | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| MaxPoolGrad | Compute the gradients for the max-pooling layer | FP16<br>FP32 | - | - |
-| Merge | Control-flow operator that merges two input streams, forwarding the available tensor. | FP16<br>FP32 | - | - |
-| Minimum | Element-wise minimum | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 |
-| MinimumGrad | Compute the gradient of the minimum value function | FP16<br>FP32 | - | - |
-| Mod | Return the remainder of the division operation | FP32<br>Int32 | - | - |
-| MulFusion | Element-wise multiplication | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| MulGrad | Compute the gradient of the multiplication operation | FP32 | - | - |
-| Neg | Element-wise negation | FP16<br>FP32<br>Int32 | FP16 | FP16<br>FP32 |
-| NegGrad | Compute the gradient of the negation operation | FP16<br>FP32 | - | - |
-| NLLLoss | Compute the negative log-likelihood loss | FP32 | - | - |
-| NLLLossGrad | Compute the gradient of NLLLoss | FP32 | - | - |
-| NotEqual | Perform element-wise comparison between two tensors, returning a logical result indicating whether A != B. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| NonMaxSuppression | Non-maximum suppression | FP32 | - | - |
-| NonZero | Return the indices of all non-zero elements in the input tensor. | Bool | - | - |
-| OneHot | Convert integer index tensors to one-hot encoding representations | FP16<br>FP32<br>Int32 | - | FP16<br>FP32<br>Int32 |
-| OnesLike | Create a new tensor with the exact same shape as the input tensor X, but with all element values set to 1. | FP16<br>FP32<br>Int32 | - | - |
-| PadFusion | Add specified padding to the input tensor to achieve the desired size. | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| PartialFusion | Partial fusion | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| PowFusion | Element-wise power | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| PowerGrad | Compute the gradient of the power operation | FP32 | - | - |
-| PriorBox | Generate prior boxes | FP32<br>Int8<br>UInt8 | - | - |
-| PReLUFusion | PReLU activation function | FP16<br>FP32 | - | FP16<br>FP32 |
-| QuantDTypeCast | Perform quantized data type conversion | FP16<br>FP32<br>Int8<br>UInt8 | - | - |
-| RaggedRange | Generate sequences with non-uniform intervals | FP16<br>FP32<br>Int32 | - | - |
-| RandomNormal | Generate a tensor whose values are randomly sampled from a normal distribution | FP16<br>FP32 | - | - |
-| RandomStandardNormal | Generate a random tensor following a standard normal distribution | FP16<br>FP32 | - | - |
-| Range | Generate elements within a specified range | FP16<br>FP32<br>Int32 | - | - |
-| Rank | Return the number of dimensions in the input tensor | FP16<br>FP32 | - | - |
-| RealDiv | Element-wise division | FP16<br>FP32 | - | - |
-| Reciprocal | Return the reciprocal | FP16<br>FP32<br>Int8 | FP16 | - |
-| ReduceFusion | Reduction operation | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32 |
-| ReduceScatter | Distributed operation: input tensors are segmented and distributed across devices, with each device retaining only one segment of the results. | FP32 | - | - |
-| Reshape | Change the shape of a tensor while keeping the total number of elements unchanged | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 |
-| Resize | Upsample or resize the input tensor | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| ResizeGrad | Compute the gradient for Resize | FP16<br>FP32 | - | - |
-| ReverseV2 | Reverse the tensor along the specified axis | FP32<br>Int32 | - | - |
-| ReverseSequence | Partially reverse the variable-length sequences of the input tensor. | FP32 | - | - |
-| ROIPooling | Region-of-interest pooling | FP32 | - | - |
-| Round | Round to the nearest whole number | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Rsqrt | Element-wise compute the reciprocal of the square root, used for normalization. | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| RsqrtGrad | Calculate the gradient of the reciprocal of the square root | FP32 | - | - |
-| Select | Select elements from two tensors based on conditions | FP32<br>Bool | - | - |
-| Selu | Self-normalizing exponential linear unit activation function | - | - | - |
-| ScaleFusion | Fuse scaling operations with adjacent operators | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| ScatterNd | Scatter values from the update tensor to specified positions in the output tensor based on the index. | FP16<br>FP32<br>Int32 | - | - |
-| ScatterNdUpdate | Update the value of the input data using the given value and the input index. | FP16<br>FP32<br>Int32 | - | - |
-| SGD | Stochastic gradient descent optimizer | FP32 | - | - |
-| Shape | Obtain the tensor shape | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32 |
-| SigmoidCrossEntropyWithLogits | Combine Sigmoid activation and cross-entropy loss | FP32 | - | - |
-| SigmoidCrossEntropyWithLogitsGrad | Compute the gradient of the cross-entropy loss with sigmoid | FP32 | - | - |
-| Sin | Element-wise calculation of sine | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| Size | Obtain tensor dimension size | FP16<br>FP32<br>Int32 | - | - |
-| SliceFusion | Tensor slicing operation | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SkipGram | The core operation of the Skip-gram model, used for training word vectors | FP32 | - | - |
-| SmoothL1Loss | Smooth L1 loss | FP32 | - | - |
-| SmoothL1LossGrad | Compute the gradient of the smooth L1 loss | FP32 | - | - |
-| Softmax | Normalization operation | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SoftmaxGrad | Calculate the gradient of Softmax | FP32 | - | - |
-| Softplus | Smooth variant of ReLU | FP16<br>FP32 | - | - |
-| SpaceToBatch | Move blocks of spatial data into the batch dimension. | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| SpaceToBatchND | Split spatial-dimensional data blocks into batch dimensions | FP16<br>FP32<br>Int8<br>UInt8 | - | FP16<br>FP32 |
-| SpaceToDepth | Reorganize spatial data into depth channels | FP16<br>FP32 | - | FP16<br>FP32 |
-| SparseToDense | Convert sparse representations to dense tensors | FP16<br>FP32<br>Int32 | - | FP16<br>FP32<br>Int32 |
-| SparseSoftmaxCrossEntropyWithLogits | Softmax cross-entropy for sparse labels | FP32 | - | - |
-| Splice | Connect multiple slices or ranges of the input tensor along the specified axis. | FP16<br>FP32 | - | - |
-| Split | Split the input tensor into multiple smaller output tensors along the specified axis. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SplitWithOverlap | Split a tensor with overlap | FP16<br>FP32 | - | - |
-| Sqrt | Element-wise take the square root | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SqrtGrad | Calculate the gradient of the square root | FP32 | - | - |
-| Square | Element-wise square | FP16<br>FP32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SquaredDifference | Element-wise compute (A-B)² | FP16<br>FP32 | - | FP16<br>FP32 |
-| Squeeze | Remove dimensions of size 1 | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | - | FP16<br>FP32<br>Int32 |
-| StridedSlice | Tensor slicing | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| StridedSliceGrad | Compute the gradient of the slice operation | FP16<br>FP32 | - | - |
-| Stack | Stack multiple tensors along a new axis | FP16<br>FP32<br>Int32 | - | FP16<br>FP32 |
-| SubFusion | Element-wise subtraction | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | FP16 | FP16<br>FP32 |
-| SubGrad | Calculate the gradient of subtraction | FP32 | - | - |
-| Switch | Select output branches based on Boolean conditions | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| SwitchLayer | Select different subnetwork branches for execution within the model | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| TensorListFromTensor | Convert a regular tensor into a list of tensors, splitting along the specified axis. | FP16<br>FP32<br>Int32 | - | - |
-| TensorListGetItem | Retrieve the tensor at the specified index position from the tensor list | FP16<br>FP32<br>Int32 | - | - |
-| TensorListReserve | Preallocate an empty tensor list, specifying the element data type and initial capacity. | FP16<br>FP32<br>Int32 | - | - |
-| TensorListSetItem | Insert a tensor at a specified position in a list of tensors | FP16<br>FP32<br>Int32 | - | - |
-| TensorListStack | Stack the list of tensors into a single regular tensor | FP16<br>FP32<br>Int32 | - | - |
-| TensorScatterAdd | Add the updated tensor values to the specified positions in the target tensor using the index. | FP32<br>Int32 | - | - |
-| TileFusion | Tile the given matrix | FP16<br>FP32<br>Int32<br>Bool | FP16 | - |
-| TopKFusion | Return the top K elements from the input tensor. | FP16<br>FP32<br>Int32<br>Int8<br>UInt8 | - | - |
-| Transpose | Tensor transpose | FP16<br>FP32<br>Int32<br>Int8<br>Bool | FP16 | FP16<br>FP32 |
-| UniformReal | Generate a random tensor following a uniform distribution | FP32<br>Int32 | - | - |
-| Unique | Return the unique values in the input tensor, along with their indices and counts. | FP16<br>FP32<br>Int32 | - | - |
-| UnsortedSegmentSum | Perform segmented summation on the tensor without requiring ordered segment indices. | FP16<br>FP32<br>Int32 | - | - |
-| Unsqueeze | Add a new dimension to the input tensor | FP16<br>FP32<br>Int32<br>Int8<br>UInt8<br>Bool | FP16 | FP16<br>FP32<br>Int32 |
-| Unstack | Split a tensor into multiple sub-tensors along a specified axis | FP16<br>FP32<br>Int32 | - | - |
-| Where | Element selection | FP16<br>FP32<br>Int32<br>Bool | - | - |
-| ZerosLike | Generate a new tensor with the same shape as the input tensor but with all elements set to zero. | FP16<br>FP32<br>Int32 | - | - |
+| 算子名称 | 算子功能 | CPU | Kirin NPU | GPU(Mali/Adreno) | Ascend |
+| ----------------------------------- | ------------------------------------------------------------ | --------------------------------------------------- | --------- | ----------------------- | ----------------------- |
+| Abs | 逐元素计算绝对值 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 |
+| AbsGrad | 计算绝对值函数的梯度 | FP32 | - | - | |
+| Activation | 激活函数 | FP16
FP32
Int32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 |
+| ActivationGrad | 计算特定激活函数的梯度 | FP16
FP32 | - | - | |
+| Adam | 执行Adam优化器的一次参数更新步骤 | FP32 | - | - | |
+| AddFusion | 逐元素计算加法 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int8 | FP16 |
+| AdderFusion | 基于加法的卷积运算 | FP32 | - | - | |
+| AddGrad | 计算加法操作的梯度 | FP32 | - | - | |
+| AddN | 对N个相同形状和数据类型的输入张量进行逐元素相加 | FP16
FP32 | - | - | |
+| Affine | 对输入张量执行仿射变换 | FP32 | - | - | FP16 |
+| All | 判断张量中所有元素在指定维度上是否都为True(非零) | FP32 | - | - | |
+| AllGather | 分布式集合通信操作 | FP32 | - | - | |
+| ApplyMomentum | 执行带动量的随机梯度下降的一次参数更新步骤 | FP32 | - | - | FP16 |
+| Assert | 断言 | FP16
FP32
Bool | - | - | |
+| Assign | 将一个值赋值给一个变量 | FP32 | - | - | FP16 |
+| ArgmaxFusion | 求某一维度最大值 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 |
+| ArgminFusion | 求某一维度最小值 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 |
+| AvgPoolFusion | 平均池化 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 |
+| AvgPoolGrad | 计算平均池化层的梯度 | FP16
FP32 | - | - | |
+| BatchNorm | 批量归一化 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 |
+| BatchNormGrad | 计算批量归一化层的梯度 | FP16
FP32 | - | - | |
+| BatchToSpace | 空间到批次变换的逆操作 | FP32
Int8
UInt8 | - | FP16
FP32 | |
+| BatchToSpaceND | BatchToSpace的ND通用版本 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | |
+| BiasAdd | 将偏置向量添加到输入张量 | FP16
FP32
Int8
UInt8 | - | FP16
FP32 | FP16 |
+| BiasAddGrad | 计算BiasAdd操作的梯度 | FP16
FP32 | - | - | |
+| BinaryCrossEntropy | 计算二元交叉熵损失 | FP32 | - | - | FP16 |
+| BinaryCrossEntropyGrad | 计算二元交叉熵损失函数的梯度 | FP32 | - | - | |
+| BroadcastTo | 扩维 | FP16
FP32
Int32
Bool | - | - | |
+| Call | 调用一个子计算图或函数 | FP16
FP32
Int32
Bool | - | - | FP16 |
+| Cast | 数据类型转换 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32 | FP16 |
+| Ceil | 向上取整 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 |
+| Clip | 限制元素范围 | FP32
Int32 | - | - | FP16 |
+| Concat | 拼接张量 | FP16
FP32
Int32
Int8
UInt8
Bool | FP16 | FP16
FP32
Int32 | FP16 |
+| ConstantOfShape | 生成一个与输入形状相同的张量,并用指定常量填充 | FP16
FP32
Int32 | - | - | |
+| Conv2DFusion | 2D卷积 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 |
+| Conv2DBackpropFilterFusion | 计算普通卷积操作对卷积核的梯度 | FP16
FP32 | - | - | |
+| Conv2DBackpropInputFusion | 计算普通卷积操作对输入数据的梯度 | FP16
FP32 | - | - | |
+| Conv2dTransposeFusion | 执行转置卷积运算 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 |
+| Cos | 逐元素计算余弦 | FP16
FP32
Int8
UInt8 | FP16 | FP16
FP32 | FP16 |
+| Crop | 从输入图像或特征图中裁剪出一个指定区域 | FP16
FP32
Int32
Int8
UInt8 | - | - | |
+| CropAndResize | 从输入图像中根据一组边界框裁剪出区域,然后将每个区域缩放到统一大小 | FP32 | FP16 | - | |
+| CumSum | 累计元素和 | FP32
Int32 | - | - | FP16 |
+| CustomExtractFeatures | 自定义特征提取算子 | FP32 | - | - | |
+| CustomNormalize | 自定义归一化算子 | FP32 | - | - | |
+| CustomPredict | 自定义预测算子 | FP32
Int32 | - | - | |
+| DEConv2DGradFilter | 计算转置卷积对卷积核的梯度 | FP32 | - | | |
+| DepthToSpace | Rearrange depth data into spatial dimensions | FP16<br/>FP32<br/>Int8<br/>UInt8 | - | FP16<br/>FP32 | - |
+| DetectionPostProcess | Post-processing for object detection | FP32<br/>Int8<br/>UInt8 | - | - | - |
+| DivFusion | Element-wise division | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| DivGrad | Compute the gradient of the division operation | FP32 | - | - | - |
+| Dropout | Randomly set some elements of the input tensor to 0 | FP16<br/>FP32 | - | - | FP16 |
+| DropoutGrad | Compute the gradient of the Dropout operation | FP16<br/>FP32 | - | - | - |
+| DynamicQuant | Dynamically quantize a floating-point tensor to uint8 | FP32 | - | - | - |
+| Eltwise | Element-wise operations | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| Elu | Activation function that applies an exponential correction to negative inputs | FP16<br/>FP32 | - | - | FP16 |
+| Equal | Determine whether the inputs are equal | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| EmbeddingLookupFusion | Optimized embedding lookup that maps integer indices to dense vectors | FP32 | - | - | - |
+| Erf | Error function | FP16<br/>FP32 | - | - | FP16 |
+| ExpFusion | Element-wise exponential | FP16<br/>FP32 | - | FP16<br/>FP32 | FP16 |
+| ExpandDims | Insert a dimension of length 1 at the specified position | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8<br/>Bool | FP16 | FP16<br/>FP32<br/>Int32 | FP16 |
+| Fill | Generate a tensor filled with a specified constant | FP16<br/>FP32<br/>Int32<br/>Bool | - | FP16<br/>FP32 | FP16 |
+| Flatten | Flatten data along dimensions | FP16<br/>FP32<br/>Int32 | - | - | FP16 |
+| FlattenGrad | Compute the gradient of the Flatten operation | FP16<br/>FP32 | - | - | - |
+| Floor | Round down to the nearest integer | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| FloorDiv | Element-wise floor division | FP16<br/>FP32<br/>Int32 | FP16 | FP16<br/>FP32 | - |
+| FloorMod | Element-wise modulo; the result takes the sign of the divisor | FP16<br/>FP32<br/>Int32 | FP16 | FP16<br/>FP32 | - |
+| FullConnection | Fully connected layer | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| FusedBatchNorm | Normalize the input | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | - | FP16 |
+| GatherNd | Gather elements at the positions specified by an index tensor | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8<br/>Bool | - | FP16<br/>FP32 | FP16 |
+| Gather | Gather elements at the specified indices along a single dimension | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8<br/>Bool | FP16 | FP16<br/>FP32<br/>Int32 | FP16 |
+| GatherD | Gather elements of the input tensor according to an index tensor | FP16<br/>FP32<br/>Int32<br/>Bool | - | - | FP16 |
+| GLU | Gated linear unit activation that splits the input into two halves and multiplies them element-wise | FP32 | - | - | - |
+| Greater | Element-wise compare two tensors and return the logical result of A > B (True/False) | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| GreaterEqual | Element-wise compare two tensors and return the logical result of A ≥ B (True/False) | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| GroupNormFusion | Fused, optimized group normalization | FP32 | - | - | - |
+| GRU | Gated recurrent unit, a simplified LSTM | FP16<br/>FP32 | - | - | - |
+| HashtableLookup | Hash table lookup | FP32<br/>Int32 | - | - | - |
+| InstanceNorm | Instance normalization | FP16<br/>FP32 | FP16 | - | FP16 |
+| InvertPermutation | Invert a permutation of indices | FP16<br/>FP32<br/>Int32 | - | - | - |
+| IsFinite | Check whether each element of the tensor is finite (not inf/NaN) | FP32 | - | - | FP16 |
+| L2NormalizeFusion | Fused, optimized L2 normalization | FP32<br/>Int8<br/>UInt8 | - | - | - |
+| LayerNormFusion | Fused, optimized layer normalization | FP16<br/>FP32<br/>Int8 | - | FP16<br/>FP32 | FP16 |
+| LayerNormGrad | Compute the gradient of layer normalization | FP16<br/>FP32 | - | - | - |
+| LeakyReLU | Leaky ReLU activation that applies a small slope to negative inputs | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| Less | Element-wise compare two tensors and return the logical result of A < B (True/False) | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| LessEqual | Element-wise compare A ≤ B and return a Boolean tensor | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| LRN | Local response normalization | FP32 | - | - | FP16 |
+| Log | Element-wise logarithm | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| Log1p | Compute log(1+X) | FP32 | - | - | FP16 |
+| LogGrad | Compute the gradient of the logarithm function | FP16<br/>FP32 | - | - | - |
+| LogicalAnd | Element-wise logical AND | FP16<br/>FP32<br/>Int32<br/>Bool | FP16 | FP16<br/>FP32 | - |
+| LogicalNot | Element-wise logical NOT | FP16<br/>FP32<br/>Int8<br/>UInt8<br/>Bool | FP16 | FP16<br/>FP32 | - |
+| LogicalOr | Element-wise logical OR | FP16<br/>FP32<br/>Bool | FP16 | FP16<br/>FP32 | - |
+| LogSoftmax | Apply softmax to the input vector, then take the logarithm of the result | FP16<br/>FP32 | - | - | FP16 |
+| LshProjection | Locality-sensitive hash projection | FP32 | - | - | - |
+| LSTM | Long short-term memory unit | FP16<br/>FP32 | - | - | - |
+| LSTMGrad | Compute the backpropagation gradient of the LSTM with respect to the hidden state | FP32 | - | - | - |
+| LSTMGradData | Compute the backpropagation gradient of the LSTM with respect to the input data | FP32 | - | - | - |
+| LSTMGradWeight | Compute the backpropagation gradient of the LSTM with respect to the weights | FP32 | - | - | - |
+| MatMulFusion | Matrix multiplication of two inputs | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| Maximum | Element-wise maximum | FP16<br/>FP32<br/>Int32 | FP16 | FP16<br/>FP32 | FP16 |
+| MaximumGrad | Compute the gradient of the maximum function | FP16<br/>FP32 | - | - | - |
+| MaxPoolFusion | Max pooling | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| MaxPoolGrad | Compute the gradient of the max pooling layer | FP16<br/>FP32 | - | - | - |
+| Merge | Control-flow merge that forwards the value from whichever input branch is available | FP16<br/>FP32 | - | - | - |
+| Minimum | Element-wise minimum | FP16<br/>FP32<br/>Int32 | FP16 | FP16<br/>FP32 | FP16 |
+| MinimumGrad | Compute the gradient of the minimum function | FP16<br/>FP32 | - | - | - |
+| Mod | Return the element-wise remainder of division | FP32<br/>Int32 | - | - | FP16 |
+| MulFusion | Element-wise multiplication | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| MulGrad | Compute the gradient of the multiplication operation | FP32 | - | - | - |
+| Neg | Element-wise negation | FP16<br/>FP32<br/>Int32 | FP16 | FP16<br/>FP32 | FP16 |
+| NegGrad | Compute the gradient of the negation operation | FP16<br/>FP32 | - | - | - |
+| NLLLoss | Compute the negative log-likelihood loss | FP32 | - | - | FP16 |
+| NLLLossGrad | Compute the gradient of NLLLoss | FP32 | - | - | - |
+| NotEqual | Element-wise compare two tensors and return the logical result of A != B | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | - |
+| NonMaxSuppression | Non-maximum suppression | FP32 | - | - | FP16 |
+| NonZero | Return the indices of all non-zero elements in the input tensor | Bool | - | - | FP16 |
+| OneHot | Convert an integer index tensor to a one-hot encoding | FP16<br/>FP32<br/>Int32 | - | FP16<br/>FP32<br/>Int32 | - |
+| OnesLike | Create a new tensor with the same shape as input tensor X but with all elements set to 1 | FP16<br/>FP32<br/>Int32 | - | - | FP16 |
+| PadFusion | Pad the input tensor with specified padding so that it reaches a specified size | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| PartialFusion | Partial fusion | FP16<br/>FP32<br/>Int32<br/>Bool | - | - | - |
+| PowFusion | Element-wise power | FP16<br/>FP32<br/>Int8<br/>UInt8 | - | FP16<br/>FP32 | FP16 |
+| PowerGrad | Compute the gradient of the power operation | FP32 | - | - | - |
+| PriorBox | Generate prior boxes | FP32<br/>Int8<br/>UInt8 | - | - | FP16 |
+| PReLUFusion | PReLU activation function | FP16<br/>FP32 | - | FP16<br/>FP32 | FP16 |
+| QuantDTypeCast | Perform quantized data-type conversion | FP16<br/>FP32<br/>Int8<br/>UInt8 | - | - | - |
+| RaggedRange | Generate non-uniformly spaced sequences | FP16<br/>FP32<br/>Int32 | - | - | - |
+| RandomNormal | Generate a tensor whose values are randomly sampled from a normal distribution | FP16<br/>FP32 | - | - | - |
+| RandomStandardNormal | Generate a tensor of random numbers drawn from the standard normal distribution | FP16<br/>FP32 | - | - | - |
+| Range | Generate elements within an interval | FP16<br/>FP32<br/>Int32 | - | - | FP16 |
+| Rank | Return the number of dimensions of the input tensor | FP16<br/>FP32 | - | - | - |
+| RealDiv | Element-wise division | FP16<br/>FP32 | - | - | FP16 |
+| Reciprocal | Return the reciprocal | FP16<br/>FP32<br/>Int8 | FP16 | - | FP16 |
+| ReduceFusion | Reduction operations | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8<br/>Bool | FP16 | FP16<br/>FP32 | FP16 |
+| ReduceScatter | Distributed operation that splits the input tensor into segments across devices, with each device keeping only one segment of the result | FP32 | - | - | - |
+| Reshape | Change the tensor shape while keeping the total number of elements unchanged | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8<br/>Bool | FP16 | FP16<br/>FP32<br/>Int32 | FP16 |
+| Resize | Upsample or resize the input tensor | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | - |
+| ResizeGrad | Compute the gradient of Resize | FP16<br/>FP32 | - | - | - |
+| ReverseV2 | Reverse a tensor along the specified axis | FP32<br/>Int32 | - | - | - |
+| ReverseSequence | Partially reverse variable-length sequences of the input tensor | FP32 | - | - | FP16 |
+| ROIPooling | Region-of-interest pooling | FP32 | - | - | FP16 |
+| Round | Round to the nearest integer value | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| Rsqrt | Element-wise reciprocal of the square root, used for normalization | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | - |
+| RsqrtGrad | Compute the gradient of the reciprocal square root | FP32 | - | - | - |
+| Select | Select elements from two tensors based on a condition | FP32<br/>Bool | - | - | - |
+| Selu | Self-normalizing exponential linear unit activation function | - | - | - | - |
+| ScaleFusion | Fuse the scale operation with adjacent operators | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| ScatterNd | Scatter values from an update tensor into the specified positions of the output tensor according to indices | FP16<br/>FP32<br/>Int32 | - | - | FP16 |
+| ScatterNdUpdate | Update values of the input data with given values at the input indices | FP16<br/>FP32<br/>Int32 | - | - | - |
+| SGD | Stochastic gradient descent optimizer | FP32 | - | - | FP16 |
+| Shape | Get the tensor shape | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8<br/>Bool | - | FP16<br/>FP32 | FP16 |
+| SigmoidCrossEntropyWithLogits | Combine sigmoid activation with cross-entropy loss | FP32 | - | - | FP16 |
+| SigmoidCrossEntropyWithLogitsGrad | Compute the gradient of the sigmoid cross-entropy loss | FP32 | - | - | FP16 |
+| Sin | Element-wise sine | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| Size | Get the size of the tensor dimensions | FP16<br/>FP32<br/>Int32 | - | - | FP16 |
+| SliceFusion | Tensor slicing | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| SkipGram | Core operation of the skip-gram model, used for word-vector training | FP32 | - | - | - |
+| SmoothL1Loss | Smooth L1 loss | FP32 | - | - | FP16 |
+| SmoothL1LossGrad | Compute the gradient of the smooth L1 loss | FP32 | - | - | - |
+| Softmax | Normalization operation | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| SoftmaxGrad | Compute the gradient of Softmax | FP32 | - | - | - |
+| Softplus | Smooth variant of ReLU | FP16<br/>FP32 | - | - | FP16 |
+| SpaceToBatch | Move blocks of values from the height and width dimensions into the batch dimension | FP16<br/>FP32<br/>Int8<br/>UInt8 | - | FP16<br/>FP32 | FP16 |
+| SpaceToBatchND | Split blocks of spatial data into the batch dimension | FP16<br/>FP32<br/>Int8<br/>UInt8 | - | FP16<br/>FP32 | - |
+| SpaceToDepth | Rearrange spatial data into depth channels | FP16<br/>FP32 | - | FP16<br/>FP32 | - |
+| SparseToDense | Convert a sparse representation to a dense tensor | FP16<br/>FP32<br/>Int32 | - | FP16<br/>FP32<br/>Int32 | - |
+| SparseSoftmaxCrossEntropyWithLogits | Softmax cross-entropy with sparse labels | FP32 | - | - | FP16 |
+| Splice | Join multiple slices or ranges of the input tensor along a specified axis | FP16<br/>FP32 | - | - | - |
+| Split | Split the input tensor along a specified axis into multiple smaller output tensors | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| SplitWithOverlap | Split a tensor with overlap | FP16<br/>FP32 | - | - | - |
+| Sqrt | Element-wise square root | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| SqrtGrad | Compute the gradient of the square root | FP32 | - | - | - |
+| Square | Element-wise square | FP16<br/>FP32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| SquaredDifference | Element-wise compute (A-B)² | FP16<br/>FP32 | - | FP16<br/>FP32 | - |
+| Squeeze | Remove dimensions of size 1 | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8<br/>Bool | - | FP16<br/>FP32<br/>Int32 | - |
+| StridedSlice | Tensor slicing | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| StridedSliceGrad | Compute the gradient of the slicing operation | FP16<br/>FP32 | - | - | - |
+| Stack | Stack multiple tensors along a new axis | FP16<br/>FP32<br/>Int32 | - | FP16<br/>FP32 | FP16 |
+| SubFusion | Element-wise subtraction | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | FP16 | FP16<br/>FP32 | FP16 |
+| SubGrad | Compute the gradient of the subtraction operation | FP32 | - | - | - |
+| Switch | Select an output branch based on a Boolean condition | FP16<br/>FP32<br/>Int32<br/>Bool | - | - | - |
+| SwitchLayer | Select which sub-network branch of the model to execute | FP16<br/>FP32<br/>Int32<br/>Bool | - | - | - |
+| TensorListFromTensor | Convert an ordinary tensor into a tensor list, split along a specified axis | FP16<br/>FP32<br/>Int32 | - | - | - |
+| TensorListGetItem | Get the tensor at the specified index of a tensor list | FP16<br/>FP32<br/>Int32 | - | - | - |
+| TensorListReserve | Pre-allocate an empty tensor list with a specified element data type and initial capacity | FP16<br/>FP32<br/>Int32 | - | - | - |
+| TensorListSetItem | Insert a tensor at the specified position of a tensor list | FP16<br/>FP32<br/>Int32 | - | - | - |
+| TensorListStack | Stack a tensor list into an ordinary tensor | FP16<br/>FP32<br/>Int32 | - | - | - |
+| TensorScatterAdd | Scatter-add values from an update tensor into the target tensor at the positions given by indices | FP32<br/>Int32 | - | - | - |
+| TileFusion | Tile a given matrix | FP16<br/>FP32<br/>Int32<br/>Bool | FP16 | - | FP16 |
+| TopKFusion | Return the top K elements of the input tensor | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8 | - | - | FP16 |
+| Transpose | Tensor transpose | FP16<br/>FP32<br/>Int32<br/>Int8<br/>Bool | FP16 | FP16<br/>FP32 | FP16 |
+| UniformReal | Generate a tensor of random numbers drawn from a uniform distribution | FP32<br/>Int32 | - | - | - |
+| Unique | Return the unique values in the input tensor, optionally with their indices and counts | FP16<br/>FP32<br/>Int32 | - | - | - |
+| UnsortedSegmentSum | Segment-wise sum of a tensor; segment indices are not required to be sorted | FP16<br/>FP32<br/>Int32 | - | - | - |
+| Unsqueeze | Add a new dimension to the input tensor | FP16<br/>FP32<br/>Int32<br/>Int8<br/>UInt8<br/>Bool | FP16 | FP16<br/>FP32<br/>Int32 | - |
+| Unstack | Split a tensor along a specified axis into multiple sub-tensors | FP16<br/>FP32<br/>Int32 | - | - | - |
+| Where | Element selection | FP16<br/>FP32<br/>Int32<br/>Bool | - | - | - |
+| ZerosLike | Generate a new tensor with the same shape as the input tensor but all elements set to 0 | FP16<br/>FP32<br/>Int32 | - | - | - |