MindSpore Serving is a lightweight and high-performance service module that helps MindSpore developers efficiently deploy online inference services in the production environment. After completing model training on MindSpore, you can export the MindSpore model and use MindSpore Serving to create an inference service for the model.
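For example, a trained network can be exported to the MindIR format that Serving consumes. The following is a minimal sketch; the trivial `Add` network and the `tensor_add` file name are illustrative placeholders for your own trained model.

```python
import numpy as np
import mindspore as ms
from mindspore import nn, ops


class Add(nn.Cell):
    """Trivial network used only to illustrate the export step."""
    def construct(self, x1, x2):
        return ops.add(x1, x2)


net = Add()
x1 = ms.Tensor(np.ones((2, 2), np.float32))
x2 = ms.Tensor(np.ones((2, 2), np.float32))
# Export the network to MindIR, the format loaded by MindSpore Serving.
ms.export(net, x1, x2, file_name="tensor_add", file_format="MINDIR")
```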
MindSpore Serving architecture:

MindSpore Serving includes two parts: `Client` and `Server`. On a `Client` node, you can deliver inference service commands through the gRPC or RESTful API. The `Server` consists of a `Main` node and one or more `Worker` nodes. The `Main` node manages all `Worker` nodes and their model information, accepts user requests from `Client`s, and distributes the requests to `Worker` nodes. A `Servable` is deployed on a `Worker` node; it represents a single model or a combination of multiple models and can provide different services through various methods.
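As a concrete illustration of a `Servable`, the sketch below declares a single exported model and registers one service method for it. It assumes the `tensor_add.mindir` file from the export sketch above; the API names (`register.declare_model`, `register.add_stage`) follow recent MindSpore Serving releases and may differ in older versions.

```python
from mindspore_serving.server import register

# Declare the exported model that this servable wraps.
model = register.declare_model(model_file="tensor_add.mindir",
                               model_format="MindIR",
                               with_batch_dim=False)


# Each registered method is one way of serving the model;
# a servable may expose several such methods.
@register.register_method(output_names=["y"])
def add_common(x1, x2):
    y = register.add_stage(model, x1, x2, outputs_count=1)
    return y
```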
On the server side, when MindSpore is used as the inference backend, MindSpore Serving supports the Ascend 910/310P/310 and Nvidia GPU environments. When MindSpore Lite is used as the inference backend, MindSpore Serving supports the Ascend 310, Nvidia GPU, and CPU environments. `Client` does not depend on a specific hardware platform.
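A minimal server-side startup sketch, assuming the `add` servable defined above and illustrative port numbers. It loads the servable onto a worker and opens the gRPC and RESTful interfaces that `Client`s talk to.

```python
import os
from mindspore_serving import server


def start():
    # The servable definition is expected under
    # <servable_directory>/add/servable_config.py (names are illustrative).
    config = server.ServableStartConfig(
        servable_directory=os.path.abspath("."),
        servable_name="add",
        device_ids=0)
    server.start_servables(servable_configs=config)

    # Expose the service through both interfaces described above.
    server.start_grpc_server(address="127.0.0.1:5500")
    server.start_restful_server(address="127.0.0.1:1500")


if __name__ == "__main__":
    start()
```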
MindSpore Serving provides the following functions:

- Batching: requests carrying multiple instances are split and combined to meet the batch size requirement of the model (see the client sketch below).

For details about how to install and configure MindSpore Serving, see the MindSpore Serving installation page.
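A minimal client sketch of the batching behavior, assuming the server sketch above is running; depending on the Serving release, the `Client` constructor takes a single address string (as here) or separate ip and port arguments. One request carries several instances, which the server splits and combines to match the model's batch size.

```python
import numpy as np
from mindspore_serving.client import Client

# Connect to the gRPC endpoint of the "add" servable and call its
# "add_common" method (names match the sketches above).
client = Client("127.0.0.1:5500", "add", "add_common")

# One request carrying three instances; Serving batches them for the model.
instances = [{"x1": np.ones((2, 2), np.float32),
              "x2": np.ones((2, 2), np.float32)} for _ in range(3)]
result = client.infer(instances)
print(result)
```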
The tutorial MindSpore-based Inference Service Deployment demonstrates how to use MindSpore Serving.
For more details about the installation guide, tutorials, and APIs, see MindSpore Python API.
Contributions to MindSpore are welcome.