Custom Operators (CPU)
When the built-in operators are not sufficient for developing a network, you can quickly and conveniently extend custom CPU operators using MindSpore's Python and C++ APIs.
To add a custom operator, you need to complete three tasks: operator primitive registration, operator implementation, and operator information registration.
This document takes a custom Transpose operator as an example to introduce the steps of customizing an operator.
Each operator's primitive is a subclass of PrimitiveWithInfer, and its type name is the operator's name.
CPU operator primitives are defined under the path mindspore/ops/operations, where the appropriate file is selected according to the operator type. The interface of a CPU operator primitive is defined as follows:
- The attribute values are assigned through the input parameters of the constructor __init__. The operator in this example has no init attributes, so __init__ takes no additional input parameters.
- The input and output names are defined through the init_prim_io_names function.
- The shape of the output tensor is checked in the infer_shape function, and the dtype of the output tensor is checked in the infer_dtype function.
- The _checkparam file defines a series of validity-checking operations, such as value checking and type checking.
Taking the Transpose operator's primitive as an example, the example code is as follows:
from mindspore.ops import PrimitiveWithInfer, prim_attr_register

class Transpose(PrimitiveWithInfer):
    """
    The definition of the Transpose primitive.
    """
    @prim_attr_register
    def __init__(self):
        """Initialize Transpose"""
        self.init_prim_io_names(inputs=['x', 'perm'], outputs=['output'])

    def infer_shape(self, x, perm):
        x_shape = x['shape']
        p_value = perm['value']
        if len(x_shape) != len(p_value):
            raise ValueError('The dimension of x and perm must be equal.')
        out_shapes = []
        for i in p_value:
            out_shapes.append(x_shape[i])
        return out_shapes

    def infer_dtype(self, x_dtype, perm_dtype):
        return x_dtype
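The shape rule implemented by infer_shape can be stated in one line: output dimension i takes the size of input dimension perm[i]. A minimal standalone sketch of that rule (illustrative only, not MindSpore code):

# Illustrative mirror of the shape rule in infer_shape above:
# output dimension i has the size of input dimension perm[i].
x_shape = (2, 3, 4)
perm = (2, 0, 1)
out_shape = [x_shape[i] for i in perm]
assert out_shape == [4, 2, 3]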
Implementing a CPU operator usually requires writing a header file and a source file, under the path mindspore/ccsrc/backend/kernel_compiler/cpu. If the operator's logic is implemented by calling the third-party library MKL-DNN, the files are placed in the subdirectory mkldnn. Please refer to oneMKL and oneDNN for details.
The header file contains the operator's registration information and the declaration of its class. The operator class inherits from the CPUKernel parent class and overrides InitKernel and Launch.
The source file implements the class, mainly overriding the InitKernel and Launch functions. The header file example code of the Transpose operator is as follows:
class TransposeCPUFwdKernel : public CPUKernel {
 public:
  TransposeCPUFwdKernel() = default;
  ~TransposeCPUFwdKernel() override = default;

  void InitKernel(const CNodePtr &kernel_node) override;

  bool Launch(const std::vector<AddressPtr> &inputs, const std::vector<AddressPtr> &workspace,
              const std::vector<AddressPtr> &outputs) override;

 private:
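  // Input shape and transpose permutation (the "perm" attribute), filled in by InitKernel.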
  std::vector<size_t> shape_;
  std::vector<int> axis_;
};
- The input parameter of InitKernel is a constant reference to the node pointer. Through the member functions of the AnfRuntimeAlgorithm class, the input and output shapes of the operator node and the operator's attribute information can be obtained.
- The input parameters of Launch are three vectors, containing all the input addresses, the workspace addresses, and all the output addresses, respectively. The concrete implementation logic of the operator is described in the function body.
- shape_ and axis_ are two member variables, holding the input shape and the transpose permutation.
The definition of the InitKernel function in the source file is as follows:
void TransposeCPUFwdKernel::InitKernel(const CNodePtr &kernel_node) {
  MS_EXCEPTION_IF_NULL(kernel_node);
  shape_ = AnfAlgo::GetInputDeviceShape(kernel_node, 0);
  axis_ = AnfAlgo::GetNodeAttr<std::vector<int>>(kernel_node, "perm");
  if (shape_.size() != axis_.size()) {
    MS_LOG(EXCEPTION) << "The size of input shape and transpose axis shape must be equal.";
  }
}
- The member functions of the AnfRuntimeAlgorithm class implement various operations on operator nodes. shape_ represents the shape of the operator's first input, and axis_ represents the operator's attribute "perm".
- The "perm" of the Transpose operator's primitive is passed as an input, but it is actually treated as an attribute of the operator when parsing.
- For details of the AnfRuntimeAlgorithm class, please refer to its declaration in the MindSpore source code under mindspore/ccsrc/backend/session/anf_runtime_algorithm.h.
The definition of the Launch function in the source file is as follows: first, get the address of each input and output in turn; then, transform the dimensions according to axis_ and assign the values to the space pointed to by the output addresses.
bool TransposeCPUFwdKernel::Launch(const std::vector<kernel::AddressPtr> &inputs,
                                   const std::vector<kernel::AddressPtr> & /*workspace*/,
                                   const std::vector<kernel::AddressPtr> &outputs) {
  auto input = reinterpret_cast<float *>(inputs[0]->addr);
  auto output = reinterpret_cast<float *>(outputs[0]->addr);
  size_t size = IntToSize(inputs[0]->size / sizeof(float));
  size_t shape_size = IntToSize(shape_.size());
  if (shape_size > kMaxDim) {
    MS_LOG(EXCEPTION) << "Input is " << shape_size << "-D, but transpose supports max " << kMaxDim << "-D inputs.";
  }
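  // pos_array holds the multi-dimensional index of the current element;
  // size_offset holds the row-major strides of the input shape.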
  size_t pos_array[kMaxDim];
  size_t size_offset[kMaxDim];
  size_offset[0] = size / shape_[0];
  for (size_t i = 1; i < shape_size; i++) {
    size_offset[i] = size_offset[SizeToInt(i) - 1] / shape_[i];
  }
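  // For every flat input position: unflatten it into a multi-dimensional
  // index, then re-flatten that index under the permuted (output) layout.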
  for (size_t position = 0; position < size; position += 1) {
    size_t temp_position = position;
    pos_array[0] = temp_position / size_offset[0];
    for (size_t i = 1; i < shape_size; i++) {
      temp_position -= pos_array[SizeToInt(i) - 1] * size_offset[i - 1];
      pos_array[i] = temp_position / size_offset[i];
    }
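    // Accumulate the permuted coordinates with the output strides to get
    // the flat output position.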
    size_t new_position = pos_array[axis_[SizeToInt(shape_size) - 1]];
    size_t new_position_size = 1;
    for (int j = SizeToInt(shape_size) - 2; j >= 0; j--) {
      new_position_size *= shape_[axis_[j + 1]];
      new_position += pos_array[axis_[j]] * new_position_size;
    }
    output[new_position] = input[position];
  }
  return true;
}
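The index arithmetic above can be hard to follow in the flat loop. The following standalone Python sketch (illustrative only, not part of the MindSpore sources) mirrors the same unflatten/re-flatten logic and checks it against numpy.transpose:

import numpy as np

def transpose_flat(x, perm):
    """Pure-Python mirror of TransposeCPUFwdKernel::Launch: move each element
    from its flat input position to the corresponding flat output position."""
    shape = x.shape
    out_shape = tuple(shape[a] for a in perm)
    flat_in = x.reshape(-1)
    flat_out = np.empty_like(flat_in)
    # Row-major strides of the input and output shapes (size_offset in the C++ code).
    in_strides = [int(np.prod(shape[i + 1:])) for i in range(len(shape))]
    out_strides = [int(np.prod(out_shape[i + 1:])) for i in range(len(out_shape))]
    for position in range(flat_in.size):
        # Unflatten: recover the multi-dimensional index (pos_array in the C++ code).
        pos = [(position // in_strides[i]) % shape[i] for i in range(len(shape))]
        # Re-flatten the permuted index into the flat output position (new_position).
        new_position = sum(pos[perm[j]] * out_strides[j] for j in range(len(perm)))
        flat_out[new_position] = flat_in[position]
    return flat_out.reshape(out_shape)

x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
assert np.array_equal(transpose_flat(x, (2, 0, 1)), x.transpose(2, 0, 1))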
Operator information is the key information guiding the backend selection of the operator implementation. The first parameter of MS_REG_CPU_KERNEL is the name of the registered operator, which must be consistent with the operator name in the primitive. The second parameter indicates the type of each input and output in turn. The last parameter is the name of the class implementing the operator. The Transpose operator registration code is as follows:
MS_REG_CPU_KERNEL(Transpose, KernelAttr().AddInputAttr(kNumberTypeFloat32).AddOutputAttr(kNumberTypeFloat32),
                  TransposeCPUFwdKernel);
The number and order of the inputs and outputs defined in the operator information, the number and order of the inputs and outputs in the operator implementation, and the number and order of the input and output name lists in the operator primitive must all be consistent.
After writing the custom CPU operator, you need to recompile and reinstall MindSpore. For details, please refer to the Installation Document.
After compiling and installing, the custom CPU operator can be used directly by importing its primitive. The following takes a single-operator network test of Transpose as an example.
Define the network in the file test_transpose.py:
import numpy as np
import mindspore.nn as nn
import mindspore.context as context
from mindspore import Tensor
import mindspore.ops as ops

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")

class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        self.transpose = ops.Transpose()

    def construct(self, data):
        return self.transpose(data, (1, 0))

def test_net():
    x = np.arange(2 * 3).reshape(2, 3).astype(np.float32)
    transpose = Net()
    output = transpose(Tensor(x))
    print("output: ", output)
Run the case:
pytest -s test_transpose.py::test_net
Running results:
output: [[0. 3.]
 [1. 4.]
 [2. 5.]]
If an operator is to support automatic differentiation, its back-propagation function (bprop) needs to be defined in its primitive. In bprop, you need to describe the backward computing logic that uses the forward inputs, forward outputs, and output gradients to obtain the input gradients. The backward computation logic can be composed of built-in operators or custom backward operators.
When defining an operator's bprop function, note that its input parameters are the forward inputs, the forward outputs, and the output gradients, and that its return value is a tuple of input gradients, one for each forward input.
For example, the bprop of Transpose is:
import mindspore as ms
import mindspore.ops as ops
from mindspore.ops._grad.grad_base import bprop_getters

fill = ops.Fill()
invert_permutation = ops.InvertPermutation()
transpose = ops.Transpose()

@bprop_getters.register(ops.Transpose)
def get_bprop_transpose(self):
    """Generate bprop for Transpose"""
    def bprop(x, perm, out, dout):
        return transpose(dout, invert_permutation(perm)), fill(ms.int32, (len(perm),), 0)
    return bprop
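As a quick, illustrative sanity check of this rule in plain numpy (not part of the MindSpore sources): the gradient with respect to x is dout transposed by the inverse permutation, while perm, treated as a constant, receives a zero placeholder.

import numpy as np

dout = np.arange(6, dtype=np.float32).reshape(3, 2)  # gradient of y = transpose(x, (1, 0))
perm = (1, 0)
inv_perm = tuple(np.argsort(perm))  # what InvertPermutation computes
dx = dout.transpose(inv_perm)       # gradient w.r.t. x, shape (2, 3)
assert dx.shape == (2, 3)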
The InvertPermutation operator used in the Transpose bprop also needs the complete process of primitive definition, registration, and implementation, just like the Transpose operator itself.
Define the bprop case in the file test_transpose.py:
import mindspore.ops as ops

class Grad(nn.Cell):
    def __init__(self, network):
        super(Grad, self).__init__()
        self.grad = ops.GradOperation(sens_param=True)
        self.network = network

    def construct(self, input_data, sens):
        gout = self.grad(self.network)(input_data, sens)
        return gout

def test_grad_net():
    x = np.arange(2 * 3).reshape(2, 3).astype(np.float32)
    sens = np.arange(2 * 3).reshape(3, 2).astype(np.float32)
    grad = Grad(Net())
    dx = grad(Tensor(x), Tensor(sens))
    print("dx: ", dx.asnumpy())
Run the case:
pytest -s test_transpose.py::test_grad_net
Running results:
dx: [[0. 2. 4.]
[1. 3. 5.]]