Automatic differentiation computes the derivative value of a function at a given point and is a generalization of the backpropagation algorithm. Its core idea is to decompose a complex mathematical operation into a series of simple basic operations. The framework hides the details of the derivation process from users, which greatly lowers the barrier to using it.
MindSpore uses ops.GradOperation to compute the first-order derivative. ops.GradOperation has the following attributes:
get_all: determines whether to compute the gradients with respect to the inputs. If False, only the gradient of the first input is returned; if True, the gradients of all inputs are returned. The default value is False.
get_by_list: determines whether to compute the gradients of the weight parameters. The default value is False.
sens_param: determines whether to scale the output value of the network to change the final gradient. The default value is False.
This chapter uses ops.GradOperation in MindSpore to find the first-order derivative of the function f(x)=wx+b.
First, define the formula to differentiate:
f(x)=wx+b  (1)
The sample code below is an expression of formula (1). Because MindSpore adopts a functional programming paradigm, all computational formulas are expressed as functions.
import numpy as np
import mindspore.nn as nn
from mindspore import Parameter, Tensor

class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        # Weight and bias of formula (1), declared as trainable parameters.
        self.w = Parameter(Tensor(np.array([6.0], np.float32)), name='w')
        self.b = Parameter(Tensor(np.array([1.0], np.float32)), name='b')

    def construct(self, x):
        f = self.w * x + self.b
        return f
Next, define the derivation class GradNet. In the __init__ function, define the self.net network and the ops.GradOperation operation. In the construct function, compute the derivative of self.net. MindSpore then computes the derivative shown in formula (2):
f′(x)=w  (2)
from mindspore import dtype as mstype
import mindspore.ops as ops

class GradNet(nn.Cell):
    def __init__(self, net):
        super(GradNet, self).__init__()
        self.net = net
        self.grad_op = ops.GradOperation()

    def construct(self, x):
        gradient_function = self.grad_op(self.net)
        return gradient_function(x)
Finally, the first-order derivative of formula (1) is computed with respect to the input parameter x. Since the weight w is defined as 6 and the bias b as 1, formula (1) becomes:
f(x)=wx+b=6x+1
Deriving the above equation gives:
f′(x)=w=6
x = Tensor([100], dtype=mstype.float32)
output = GradNet(Net())(x)
print(output)
[6.]
MindSpore calculates the first-order derivative using ops.GradOperation(get_all=False, get_by_list=False, sens_param=False). If get_all is set to False, only the derivative with respect to the first input is calculated. If get_all is set to True, the derivatives with respect to all inputs are calculated.
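The examples in this chapter only use the first behavior, so the following is a minimal sketch of the second one (the two-input network AddNet and the class AllInputsGradNet are hypothetical names introduced here, not part of the tutorial): with get_all=True, the gradient function returns one gradient per input.
class AddNet(nn.Cell):
    """Hypothetical two-input network: f(x, y) = 3x + 2y."""
    def __init__(self):
        super(AddNet, self).__init__()

    def construct(self, x, y):
        return 3.0 * x + 2.0 * y

class AllInputsGradNet(nn.Cell):
    def __init__(self, net):
        super(AllInputsGradNet, self).__init__()
        self.net = net
        # get_all=True returns the gradients of all inputs instead of only the first one.
        self.grad_op = ops.GradOperation(get_all=True)

    def construct(self, x, y):
        gradient_function = self.grad_op(self.net)
        return gradient_function(x, y)

x = Tensor([1.0], dtype=mstype.float32)
y = Tensor([2.0], dtype=mstype.float32)
print(AllInputsGradNet(AddNet())(x, y))
# Expected: a tuple of two gradients, [3.] for x and [2.] for y.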
To compute the derivatives of the weight parameters, set get_by_list in ops.GradOperation to True.
from mindspore import ParameterTuple

class GradNet(nn.Cell):
    def __init__(self, net):
        super(GradNet, self).__init__()
        self.net = net
        self.params = ParameterTuple(net.trainable_params())
        # Set get_by_list=True to compute the first-order derivatives of the weight parameters.
        self.grad_op = ops.GradOperation(get_by_list=True)

    def construct(self, x):
        gradient_function = self.grad_op(self.net, self.params)
        return gradient_function(x)
Next, differentiate the function:
# Perform a derivative calculation on the function.
x = Tensor([100], dtype=mstype.float32)
fx = GradNet(Net())(x)
# Print the result.
print(f"wgrad: {fx[0]}\nbgrad: {fx[1]}")
wgrad: [100.]
bgrad: [1.]
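get_all and get_by_list can also be combined. The following is a minimal sketch (the class name CombinedGradNet is introduced here for illustration, reusing the Net defined at the beginning of this chapter): when both flags are True, the gradient function is expected to return the input gradients and the parameter gradients as two separate tuples.
class CombinedGradNet(nn.Cell):
    def __init__(self, net):
        super(CombinedGradNet, self).__init__()
        self.net = net
        self.params = ParameterTuple(net.trainable_params())
        # Request both the input gradients and the weight gradients.
        self.grad_op = ops.GradOperation(get_all=True, get_by_list=True)

    def construct(self, x):
        gradient_function = self.grad_op(self.net, self.params)
        return gradient_function(x)

x = Tensor([100], dtype=mstype.float32)
input_grads, param_grads = CombinedGradNet(Net())(x)
# Expected: input_grads[0] is [6.] (df/dx = w), param_grads is ([100.], [1.]).
print(f"xgrad: {input_grads[0]}\nwgrad: {param_grads[0]}\nbgrad: {param_grads[1]}")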
If derivatives are not required for some weights, declare the corresponding weight parameters with requires_grad set to False when defining the network to be differentiated.
class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        self.w = Parameter(Tensor(np.array([6], np.float32)), name='w')
        self.b = Parameter(Tensor(np.array([1.0], np.float32)), name='b', requires_grad=False)

    def construct(self, x):
        out = x * self.w + self.b
        return out

class GradNet(nn.Cell):
    def __init__(self, net):
        super(GradNet, self).__init__()
        self.net = net
        self.params = ParameterTuple(net.trainable_params())
        self.grad_op = ops.GradOperation(get_by_list=True)

    def construct(self, x):
        gradient_function = self.grad_op(self.net, self.params)
        return gradient_function(x)
# Construct a derivative network.
x = Tensor([5], dtype=mstype.float32)
fw = GradNet(Net())(x)
print(fw)
(Tensor(shape=[1], dtype=Float32, value= [ 5.00000000e+00]),)
You can use the sens_param parameter to scale the output value of the network, which changes the final gradient. Set sens_param in ops.GradOperation to True and provide the scaling value; its shape must be the same as the shape of the network output.
class GradNet(nn.Cell):
    def __init__(self, net):
        super(GradNet, self).__init__()
        self.net = net
        # Derivative operation with sens_param enabled.
        self.grad_op = ops.GradOperation(sens_param=True)
        # Scaling value applied to the network output.
        self.grad_wrt_output = Tensor([0.1], dtype=mstype.float32)

    def construct(self, x):
        gradient_function = self.grad_op(self.net)
        return gradient_function(x, self.grad_wrt_output)
x = Tensor([6], dtype=mstype.float32)
output = GradNet(Net())(x)
print(output)
[0.6]
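The shape requirement matters once the network output has more than one element. The following is a minimal sketch (the class name VectorSensGradNet and the chosen values are ours, reusing the Net defined above): each element of the output gradient is scaled by the matching element of the sens value.
class VectorSensGradNet(nn.Cell):
    def __init__(self, net, sens):
        super(VectorSensGradNet, self).__init__()
        self.net = net
        self.grad_op = ops.GradOperation(sens_param=True)
        # Scaling value; its shape must match the shape of the network output.
        self.sens = sens

    def construct(self, x):
        gradient_function = self.grad_op(self.net)
        return gradient_function(x, self.sens)

x = Tensor([1.0, 2.0, 3.0], dtype=mstype.float32)
sens = Tensor([1.0, 0.1, 0.01], dtype=mstype.float32)
print(VectorSensGradNet(Net(), sens)(x))
# Expected: [6. 0.6 0.06] -- the weight w scaled element-wise by the sens value.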
You can use ops.stop_gradient to stop gradient calculation. The following is an example:
from mindspore.ops import stop_gradient

class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        self.w = Parameter(Tensor(np.array([6], np.float32)), name='w')
        self.b = Parameter(Tensor(np.array([1.0], np.float32)), name='b')

    def construct(self, x):
        out = x * self.w + self.b
        # Stop the gradient here: out no longer contributes to gradient calculations.
        out = stop_gradient(out)
        return out
class GradNet(nn.Cell):
    def __init__(self, net):
        super(GradNet, self).__init__()
        self.net = net
        self.params = ParameterTuple(net.trainable_params())
        self.grad_op = ops.GradOperation(get_by_list=True)

    def construct(self, x):
        gradient_function = self.grad_op(self.net, self.params)
        return gradient_function(x)
x = Tensor([100], dtype=mstype.float32)
output = GradNet(Net())(x)
print(f"wgrad: {output[0]}\nbgrad: {output[1]}")
wgrad: [0.]
bgrad: [0.]
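In the example above, the gradient of the entire output is stopped, so both gradients are zero. In practice, stop_gradient is more often applied to only part of a computation. The following is a minimal sketch (the class name PartialStopNet is ours, reusing the GradNet defined above): only the branch that is not wrapped in stop_gradient contributes to the gradients.
class PartialStopNet(nn.Cell):
    def __init__(self):
        super(PartialStopNet, self).__init__()
        self.w = Parameter(Tensor(np.array([6], np.float32)), name='w')
        self.b = Parameter(Tensor(np.array([1.0], np.float32)), name='b')

    def construct(self, x):
        out1 = x * self.w                  # contributes to the gradients
        out2 = stop_gradient(x * self.b)   # treated as a constant during differentiation
        return out1 + out2

x = Tensor([100], dtype=mstype.float32)
output = GradNet(PartialStopNet())(x)
# Expected: wgrad [100.] (from the first branch), bgrad [0.] (blocked by stop_gradient).
print(f"wgrad: {output[0]}\nbgrad: {output[1]}")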