The principal procedure of Lite training is as follows:
- Design the network and export the MindIR model file by using the cloud-side APIs.
- Convert the MindIR file to a .ms model file.
- Train, evaluate and save .ms model files.
The model structure is saved in the converted .ms model file, which is loaded onto the device platform for training.
The following figure shows the detailed training process:
For more details on the Java API, please refer to the API Documentation.
Model is the main entry point of the MindSpore Lite framework. Graph models are compiled and executed through the Model class.
A Model file is a flatbuffer-serialized file produced by the MindSpore Model Converter Tool. These files have a .ms extension. Before model training and/or inference, the model needs to be loaded from the file system and parsed. The related operations are mainly implemented in the [Graph](https://www.mindspore.cn/lite/api/en/r2.0/api_java/graph.html#graph) class, which holds the model data such as the network structure, the weight data and the operator attributes.
MSContext is a MindSpore Lite object that contains the basic configuration parameters needed to guide graph compilation and execution. It defines the device on which the model runs (e.g., CPU or GPU), the number of threads used for training and inference, and the memory allocation scheme. Currently, only the CPU device is supported for training.
Users create an object of the class Model and initialize it with the member function build. Its prototype is as follows:
public boolean build(Graph graph, MSContext context, TrainCfg cfg);
The following code shows how to build a Model for training on the CPU:
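// Holds the network structure and weights parsed from a .ms file;
// in a real program the .ms file is loaded into the graph before build is called
Graph g = new Graph();
// Runtime configuration: thread number 1, CPU bind mode 0
MSContext context = new MSContext();
context.init(1, 0);
// Run on the CPU with float16 disabled
context.addDeviceInfo(DeviceType.DT_CPU, false, 0);
// Default training configuration
TrainCfg cfg = new TrainCfg();
cfg.init();
// Compile the graph for training
Model liteModel = new Model();
liteModel.build(g, context, cfg);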
Refer to Train a LeNet for more details.
Currently, Java does not provide a data processing API such as the C++ Dataset class and its extended classes. Users need to implement data preprocessing themselves. After converting the image or text data into byte data, copy it to the input of the model.
The following code shows how the MNIST data is read and preprocessed:
public void readMNISTFile(String inputFileName, String labelFileName, Vector<DataLabelTuple> dataset) {
    try {
        BufferedInputStream ibin = new BufferedInputStream(new FileInputStream(inputFileName));
        BufferedInputStream lbin = new BufferedInputStream(new FileInputStream(labelFileName));
        byte[] bytes = new byte[4];
        ...
        // read images
        int image_size = n_rows * n_cols;
        byte[] image_data = new byte[image_size];
        for (int i = 0; i < lnumber; i++) {
            byte[] hwc_bin_image = new byte[32 * 32];
            readFile(ibin, image_data, image_size);
            // center the 28x28 image in a 32x32 buffer with a 2-pixel zero border
            for (int r = 0; r < 32; r++) {
                for (int c = 0; c < 32; c++) {
                    int index = r * 32 + c;
                    if (r < 2 || r > 29 || c < 2 || c > 29) {
                        hwc_bin_image[index] = 0;
                    } else {
                        hwc_bin_image[index] = image_data[(r - 2) * 28 + (c - 2)];
                    }
                }
            }
            // pair the padded image with its label
            DataLabelTuple data_label_tupel = new DataLabelTuple();
            data_label_tupel.data = hwc_bin_image;
            data_label_tupel.label = labels[i];
            dataset.add(data_label_tupel);
        }
    } catch (IOException e) {
        System.err.println("Read Dataset exception");
    }
}
Refer to Train a LeNet for more details.
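The padding step in the reader above can be isolated into a small helper. The following is a minimal sketch, not the code of the LeNet demo: the class name `MnistPadding` and the method `padTo32` are hypothetical, and it assumes a 28x28 single-channel image stored row by row.

```java
import java.util.Arrays;

// Hypothetical helper (not part of MindSpore Lite): pads a 28x28 image to 32x32.
final class MnistPadding {
    static final int SRC = 28;               // MNIST image side length
    static final int DST = 32;               // padded side length used by the reader above
    static final int PAD = (DST - SRC) / 2;  // 2-pixel border on each side

    /** Centers a 28x28 image in a zero-filled 32x32 buffer. */
    static byte[] padTo32(byte[] image) {
        if (image.length != SRC * SRC) {
            throw new IllegalArgumentException("expected " + (SRC * SRC) + " bytes");
        }
        byte[] padded = new byte[DST * DST]; // Java arrays start zero-filled
        for (int r = 0; r < SRC; r++) {
            // Copy one source row into the padded image, shifted by PAD rows and columns.
            System.arraycopy(image, r * SRC, padded, (r + PAD) * DST + PAD, SRC);
        }
        return padded;
    }

    public static void main(String[] args) {
        byte[] image = new byte[SRC * SRC];
        Arrays.fill(image, (byte) 1);
        byte[] padded = padTo32(image);
        System.out.println("padded length: " + padded.length);
    }
}
```

Copying whole rows with `System.arraycopy` produces the same layout as the nested loops in the reader, since the border cells are left at their initial value of zero.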
The MindSpore Lite Java interface obtains the output of the model through the Model class. In training mode, the output of the model is the loss; in inference mode, it is the predicted value. The two modes are switched through the setTrainMode interface, and the model is executed through the runStep interface.
Switch the model to training mode and call the runStep function of the class Model in a loop to train:
model.setTrainMode(true);
for (int i = 0; i < cycles; i++) {
    for (int b = 0; b < virtualBatch; b++) {
        fillInputData(ds.getTrainData(), false);
        isSuccess = model.runStep();
        if (!isSuccess) {
            model.free();
            System.err.println("run step failed");
            return -1;
        }
        float loss = getLoss();
        if (min_loss > loss) {
            min_loss = loss;
        }
        if ((b == 0) && ((i + 1) % 500 == 0)) {
            float acc = calculateAccuracy(10); // only test 10 batch size
            if (max_acc < acc) {
                max_acc = acc;
            }
            System.out.println("step_" + (i + 1) + ": \tLoss is " + loss + " [min=" + min_loss + "]" + " " +
                    "max_acc=" + max_acc);
        }
    }
}
Refer to Train a LeNet for more details.
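The training loop calls calculateAccuracy, whose body is not shown here. One plausible way to compute classification accuracy from model outputs is sketched below. It is an assumption, not the demo's implementation: the class `AccuracySketch`, its methods, and the flat `scores` layout (one row of `numClasses` scores per sample) are all hypothetical.

```java
// Hypothetical helper (not part of MindSpore Lite): accuracy from class scores.
final class AccuracySketch {
    /** Returns the index of the largest score in the row starting at offset. */
    static int argMax(float[] scores, int offset, int numClasses) {
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (scores[offset + c] > scores[offset + best]) {
                best = c;
            }
        }
        return best;
    }

    /** Fraction of samples whose highest-scoring class equals the label. */
    static float accuracy(float[] scores, int[] labels, int numClasses) {
        if (labels.length == 0 || scores.length != labels.length * numClasses) {
            throw new IllegalArgumentException("scores must hold numClasses values per label");
        }
        int correct = 0;
        for (int i = 0; i < labels.length; i++) {
            if (argMax(scores, i * numClasses, numClasses) == labels[i]) {
                correct++;
            }
        }
        return (float) correct / labels.length;
    }

    public static void main(String[] args) {
        float[] scores = {0.1f, 0.9f, 0.8f, 0.2f, 0.3f, 0.7f}; // 3 samples, 2 classes
        int[] labels = {1, 0, 0};
        System.out.println("accuracy: " + accuracy(scores, labels, 2));
    }
}
```

In a real program the scores would be read from the model's output tensor after running in inference mode.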
Similarly, switch to inference mode through the setTrainMode interface:
model.setTrainMode(false);
With a Model built for training, a network can be used for both inference and training. These two modes differ in several aspects:
- The input of the network: Running inference requires only the data, while running training requires both data and labels.
- The output of the network: Running inference returns the predicted values in the output, while running in training mode returns the loss.
- In training mode, the weights of the layers are updated in each run step, while in inference mode they are static.
- Some layers behave differently in inference vs. training mode, e.g., updating the accumulated batch mean and variance in Batch Normalization layers.
The function prototype of setTrainMode in the Model class is as follows:
public boolean setTrainMode(boolean isTrain)
When MindSpore Lite is used for inference and the input shape needs to change, call the resize API of Model after the model is created and built to resize the input tensors.
Some networks do not support variable dimensions. Resizing them produces an error message and the model exits abnormally. For example, suppose the model contains a MatMul operator for which one input tensor is the weight and the other is the network input. If the variable dimension API is called, the shape of the input tensor may no longer match the shape of the weight tensor, and execution fails.
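The MatMul constraint can be illustrated with a shape check performed before calling resize. This is only a sketch under stated assumptions: the class `MatMulShapeCheck` and the method `isCompatible` are hypothetical, the weight is assumed to be a non-transposed 2-D tensor of shape [K, M], and the check is not something MindSpore Lite exposes.

```java
// Hypothetical helper (not part of MindSpore Lite): MatMul shape compatibility.
final class MatMulShapeCheck {
    /**
     * Returns true if an input of shape [..., K] can be multiplied by a
     * weight of shape [K, M], i.e. the inner dimensions agree.
     */
    static boolean isCompatible(int[] inputShape, int[] weightShape) {
        if (inputShape.length < 1 || weightShape.length != 2) {
            return false;
        }
        return inputShape[inputShape.length - 1] == weightShape[0];
    }

    public static void main(String[] args) {
        int[] weight = {120, 84};
        // Changing only the batch dimension keeps the inner dimension intact.
        System.out.println(isCompatible(new int[] {64, 120}, weight));
        // Changing the inner dimension breaks the match with the fixed weight.
        System.out.println(isCompatible(new int[] {32, 100}, weight));
    }
}
```

Under these assumptions, resizing the batch dimension is safe for such an operator, while resizing the dimension that is contracted with the weight is not.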
The following sample code demonstrates how to perform Resize on the input tensor of MindSpore Lite:
// Assuming model is a valid instance of Model
List<MSTensor> inputs = model.getInputs();
int[][] dims = {{1, 300, 300, 3}};
boolean ret = model.resize(inputs, dims);
Before graph execution, whether for training or inference, the input data must be written into the model input tensors. MindSpore Lite provides the following methods to obtain model input tensors:
Use the getInputByTensorName method to obtain the model input tensor that is connected to the model input node, based on the tensor name.
/**
* Get input tensor by tensor name.
*
* @param tensorName name.
* @return input tensor.
*/
public MSTensor getInputByTensorName(String tensorName);
Use the getInputs method to directly obtain the list of all model input tensors.
/**
* Get model inputs tensor.
*
* @return input tensors.
*/
public List<MSTensor> getInputs();
If the model requires more than one input tensor (this is certainly the case during training, where both data and labels serve as inputs of the network), it is the user's responsibility to know the order of the inputs or their tensor names. This information can be obtained from the Python model. Alternatively, it can be deduced from the sizes of the input tensors.
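Deducing the input order from tensor sizes can be sketched as follows. The class `InputOrderSketch` and the method `indexOfLargest` are hypothetical, and the sketch assumes that the data tensor occupies strictly more bytes than the label tensor, which holds for typical image classification networks but is not guaranteed in general.

```java
// Hypothetical helper (not part of MindSpore Lite): pick the data input by size.
final class InputOrderSketch {
    /** Returns the index of the largest entry, taken here to be the data tensor. */
    static int indexOfLargest(long[] byteSizes) {
        if (byteSizes.length == 0) {
            throw new IllegalArgumentException("no input sizes given");
        }
        int best = 0;
        for (int i = 1; i < byteSizes.length; i++) {
            if (byteSizes[i] > byteSizes[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Illustrative sizes only: a small label tensor and a larger image tensor.
        long[] sizes = {160L, 16384L};
        int dataIndex = indexOfLargest(sizes);
        int labelIndex = 1 - dataIndex; // valid only for exactly two inputs
        System.out.println("dataIndex=" + dataIndex + ", labelIndex=" + labelIndex);
    }
}
```

In a real program, the sizes would be collected by calling size() on each tensor returned by getInputs.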
Copying Data
After the model input tensors are obtained, the data must be copied into them. The methods used below give access to the size of the data, the number of elements, the data type and the writable buffer. See also the detailed description in the MSTensor API documentation.
// Assuming model is a valid instance of Model
List<MSTensor> inputs = model.getInputs();
imageTensor = inputs.get(dataIndex);
imageInputBuf = ByteBuffer.allocateDirect((int) imageTensor.size());
imageInputBuf.order(ByteOrder.nativeOrder());
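Filling such a direct buffer with image data might look like the sketch below. It assumes that the model expects float32 pixels scaled to [0, 1]; the normalization actually required depends on how the network was trained. The class `InputBufferSketch` and the method `toFloatBuffer` are hypothetical, and handing the buffer to the tensor is left to the MSTensor API and is not shown.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper (not part of MindSpore Lite): bytes to a float32 buffer.
final class InputBufferSketch {
    /** Writes unsigned 8-bit pixels as float32 values in [0, 1] into a direct buffer. */
    static ByteBuffer toFloatBuffer(byte[] pixels) {
        ByteBuffer buf = ByteBuffer.allocateDirect(pixels.length * Float.BYTES);
        buf.order(ByteOrder.nativeOrder()); // match the byte order of native code
        for (byte p : pixels) {
            // Java bytes are signed, so mask to recover the 0..255 pixel value.
            buf.putFloat((p & 0xFF) / 255.0f);
        }
        buf.rewind(); // reset the position so the buffer can be read from the start
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = toFloatBuffer(new byte[] {0, (byte) 255, (byte) 128});
        System.out.println("capacity in bytes: " + buf.capacity());
    }
}
```

The mask `p & 0xFF` matters because a raw pixel value of 255 is stored as -1 in a Java byte; without it, bright pixels would become negative floats.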
After each execution of the graph, the user might want to read the model's outputs, whether it is the loss in the case of training mode, or the predicted output in the case of evaluation mode.
MindSpore Lite provides the following methods to obtain the model's output MSTensor.
Use the getOutputsByNodeName method to obtain the output tensors that belong to a certain node:
/**
* Get output tensors by node name.
*
* @param nodeName output node name
* @return output tensor
*/
public List<MSTensor> getOutputsByNodeName(String nodeName);
Use the getOutputByTensorName method to obtain an output tensor based on the tensor name.
/**
* Get output tensor by tensor name.
*
* @param tensorName output tensor name.
* @return output tensor.
*/
public MSTensor getOutputByTensorName(String tensorName);
Use the getOutputs method to obtain all the output tensors, ordered by their tensor names.
/**
* Get model outputs.
*
* @return model outputs tensor.
*/
public List<MSTensor> getOutputs();
MindSpore Lite provides the export interface to save the model. Its prototype is as follows:
/**
* Export the model.
*
* @param fileName Model file name.
* @param quantizationType The quantization type: 0 for no_quant, 1 for weight_quant, 2 for full_quant.
* @param isOnlyExportInfer Whether to export only the inference graph.
* @param outputTensorNames Tensor names used for exporting the inference graph.
* @return Whether the export is successful.
*/
public boolean export(String fileName, int quantizationType, boolean isOnlyExportInfer, List<String> outputTensorNames);
You can load the saved model to perform re-training or inference.