Licensed under the MIT License.

Impressions

Deploying a web service to Azure Kubernetes Service (AKS)

This notebook shows the steps for deploying a service: registering a model, creating an image, provisioning a cluster (one time action), and deploying a service to it. We then test and delete the service, image and model.

from azureml.core import Workspace
from azureml.core.compute import AksCompute, ComputeTarget
from azureml.core.webservice import Webservice, AksWebservice
from azureml.core.model import Model

import azureml.core
print(azureml.core.VERSION)

Get workspace

Load existing workspace from the config file info.

from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

Register the model

#Register the model
from azureml.core.model import Model
model = Model.register(model_path = "sklearn_regression_model.pkl", # this points to a local file
                       model_name = "sklearn_regression_model.pkl", # this is the name the model is registered as
                       tags = {'area': "diabetes", 'type': "regression"},
                       description = "Ridge regression model to predict diabetes",
                       workspace = ws)

print(model.name, model.description, model.version)

Create the Environment

Create an environment that the model will be deployed with

from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies 

conda_deps = CondaDependencies.create(conda_packages=['numpy','scikit-learn==0.19.1','scipy'], pip_packages=['azureml-defaults', 'inference-schema'])
myenv = Environment(name='myenv')
myenv.python.conda_dependencies = conda_deps

Use a custom Docker image

You can also specify a custom Docker image to be used as base image if you don't want to use the default base image provided by Azure ML. Please make sure the custom Docker image has Ubuntu >= 16.04, Conda >= 4.5.* and Python(3.5.* or 3.6.*).

Only supported with python runtime.

# use an image available in public Container Registry without authentication
myenv.docker.base_image = "mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda"

# or, use an image available in a private Container Registry
myenv.docker.base_image = "myregistry.azurecr.io/mycustomimage:1.0"
myenv.docker.base_image_registry.address = "myregistry.azurecr.io"
myenv.docker.base_image_registry.username = "username"
myenv.docker.base_image_registry.password = "password"

Write the Entry Script

Write the script that will be used to predict on your model

%%writefile score.py
import os
import pickle
import json
import numpy
from sklearn.externals import joblib
from sklearn.linear_model import Ridge

def init():
    global model
    # AZUREML_MODEL_DIR is an environment variable created during deployment.
    # It is the path to the model folder (./azureml-models/$MODEL_NAME/$VERSION)
    # For multiple models, it points to the folder containing all deployed models (./azureml-models)
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'sklearn_regression_model.pkl')
    # deserialize the model file back into a sklearn model
    model = joblib.load(model_path)

# note you can pass in multiple rows for scoring
def run(raw_data):
    try:
        data = json.loads(raw_data)['data']
        data = numpy.array(data)
        result = model.predict(data)
        # you can return any data type as long as it is JSON-serializable
        return result.tolist()
    except Exception as e:
        error = str(e)
        return error

Create the InferenceConfig

Create the inference config that will be used when deploying the model

from azureml.core.model import InferenceConfig

inf_config = InferenceConfig(entry_script='score.py', environment=myenv)

Model Profiling

Profile your model to understand how much CPU and memory the service, created as a result of its deployment, will need. Profiling returns information such as CPU usage, memory usage, and response latency. It also provides a CPU and memory recommendation based on the resource usage. You can profile your model (or more precisely the service built based on your model) on any CPU and/or memory combination where 0.1 <= CPU <= 3.5 and 0.1GB <= memory <= 15GB. If you do not provide a CPU and/or memory requirement, we will test it on the default configuration of 3.5 CPU and 15GB memory.

In order to profile your model you will need:

a registered model
an entry script
an inference configuration
a single column tabular dataset, where each row contains a string representing sample request data sent to the service.

Please, note that profiling is a long running operation and can take up to 25 minutes depending on the size of the dataset.

At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.

Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based one hundred instances of the same request data. In real world scenarios however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent.

You may want to register datasets using the register() method to your workspace so they can be shared with others, reused and referred to by name in your script. You can try get the dataset first to see if it's already registered.

import json
from azureml.core import Datastore
from azureml.core.dataset import Dataset
from azureml.data import dataset_type_definitions

dataset_name='sample_request_data'

dataset_registered = False
try:
    sample_request_data = Dataset.get_by_name(workspace = ws, name = dataset_name)
    dataset_registered = True
except:
    print("The dataset {} is not registered in workspace yet.".format(dataset_name))

if not dataset_registered:
    input_json = {'data': [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                        [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]]}
    # create a string that can be put in the body of the request
    serialized_input_json = json.dumps(input_json)
    dataset_content = []
    for i in range(100):
        dataset_content.append(serialized_input_json)
    sample_request_data = '\n'.join(dataset_content)
    file_name = "{}.txt".format(dataset_name)
    f = open(file_name, 'w')
    f.write(sample_request_data)
    f.close()

    # upload the txt file created above to the Datastore and create a dataset from it
    data_store = Datastore.get_default(ws)
    data_store.upload_files(['./' + file_name], target_path='sample_request_data')
    datastore_path = [(data_store, 'sample_request_data' +'/' + file_name)]
    sample_request_data = Dataset.Tabular.from_delimited_files(
        datastore_path,
        separator='\n',
        infer_column_types=True,
        header=dataset_type_definitions.PromoteHeadersBehavior.NO_HEADERS)
    sample_request_data = sample_request_data.register(workspace=ws,
                                                    name=dataset_name,
                                                    create_new_version=True)

Now that we have an input dataset we are ready to go ahead with profiling. In this case we are testing the previously introduced sklearn regression model on 1 CPU and 0.5 GB memory. The memory usage and recommendation presented in the result is measured in Gigabytes. The CPU usage and recommendation is measured in CPU cores.

from datetime import datetime
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.model import Model, InferenceConfig


environment = Environment('my-sklearn-environment')
environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[
    'azureml-defaults',
    'inference-schema[numpy-support]',
    'joblib',
    'numpy',
    'scikit-learn==0.19.1',
    'scipy'
])
inference_config = InferenceConfig(entry_script='score.py', environment=environment)
# if cpu and memory_in_gb parameters are not provided
# the model will be profiled on default configuration of
# 3.5CPU and 15GB memory
profile = Model.profile(ws,
            'sklearn-%s' % datetime.now().strftime('%m%d%Y-%H%M%S'),
            [model],
            inference_config,
            input_dataset=sample_request_data,
            cpu=1.0,
            memory_in_gb=0.5)

# profiling is a long running operation and may take up to 25 min
profile.wait_for_completion(True)
details = profile.get_details()

Provision the AKS Cluster

This is a one time setup. You can reuse this cluster for multiple deployments after it has been created. If you delete the cluster or the resource group that contains it, then you would have to recreate it.

from azureml.core.compute import ComputeTarget
from azureml.core.compute_target import ComputeTargetException

# Choose a name for your AKS cluster
aks_name = 'my-aks-9' 

# Verify that cluster does not exist already
try:
    aks_target = ComputeTarget(workspace=ws, name=aks_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    # Use the default configuration (can also provide parameters to customize)
    prov_config = AksCompute.provisioning_configuration()

    # Create the cluster
    aks_target = ComputeTarget.create(workspace = ws, 
                                    name = aks_name, 
                                    provisioning_configuration = prov_config)

if aks_target.get_status() != "Succeeded":
    aks_target.wait_for_completion(show_output=True)

Create AKS Cluster in an existing virtual network (optional)

See code snippet below. Check the documentation here for more details.

# from azureml.core.compute import ComputeTarget, AksCompute

# # Create the compute configuration and set virtual network information
# config = AksCompute.provisioning_configuration(location="eastus2")
# config.vnet_resourcegroup_name = "mygroup"
# config.vnet_name = "mynetwork"
# config.subnet_name = "default"
# config.service_cidr = "10.0.0.0/16"
# config.dns_service_ip = "10.0.0.10"
# config.docker_bridge_cidr = "172.17.0.1/16"

# # Create the compute target
# aks_target = ComputeTarget.create(workspace = ws,
#                                   name = "myaks",
#                                   provisioning_configuration = config)

Enable SSL on the AKS Cluster (optional)

See code snippet below. Check the documentation here for more details

# provisioning_config = AksCompute.provisioning_configuration(ssl_cert_pem_file="cert.pem", ssl_key_pem_file="key.pem", ssl_cname="www.contoso.com")

%%time
aks_target.wait_for_completion(show_output = True)
print(aks_target.provisioning_state)
print(aks_target.provisioning_errors)

Optional step: Attach existing AKS cluster

If you have existing AKS cluster in your Azure subscription, you can attach it to the Workspace.

# # Use the default configuration (can also provide parameters to customize)
# resource_id = '/subscriptions/92c76a2f-0e1c-4216-b65e-abf7a3f34c1e/resourcegroups/raymondsdk0604/providers/Microsoft.ContainerService/managedClusters/my-aks-0605d37425356b7d01'

# create_name='my-existing-aks' 
# # Create the cluster
# attach_config = AksCompute.attach_configuration(resource_id=resource_id)
# aks_target = ComputeTarget.attach(workspace=ws, name=create_name, attach_configuration=attach_config)
# # Wait for the operation to complete
# aks_target.wait_for_completion(True)

Deploy web service to AKS

# Set the web service configuration (using default here)
aks_config = AksWebservice.deploy_configuration()

# # Enable token auth and disable (key) auth on the webservice
# aks_config = AksWebservice.deploy_configuration(token_auth_enabled=True, auth_enabled=False)

%%time
aks_service_name ='aks-service-1'

aks_service = Model.deploy(workspace=ws,
                           name=aks_service_name,
                           models=[model],
                           inference_config=inf_config,
                           deployment_config=aks_config,
                           deployment_target=aks_target)

aks_service.wait_for_deployment(show_output = True)
print(aks_service.state)

Test the web service using run method

We test the web sevice by passing data. Run() method retrieves API keys behind the scenes to make sure that call is authenticated.

%%time
import json

test_sample = json.dumps({'data': [
    [1,2,3,4,5,6,7,8,9,10], 
    [10,9,8,7,6,5,4,3,2,1]
]})
test_sample = bytes(test_sample,encoding = 'utf8')

prediction = aks_service.run(input_data = test_sample)
print(prediction)

Test the web service using raw HTTP request (optional)

Alternatively you can construct a raw HTTP request and send it to the service. In this case you need to explicitly pass the HTTP header. This process is shown in the next 2 cells.

# # if (key) auth is enabled, retrieve the API keys. AML generates two keys.
# key1, Key2 = aks_service.get_keys()
# print(key1)

# # if token auth is enabled, retrieve the token.
# access_token, refresh_after = aks_service.get_token()

# construct raw HTTP request and send to the service
# %%time

# import requests

# import json

# test_sample = json.dumps({'data': [
#     [1,2,3,4,5,6,7,8,9,10], 
#     [10,9,8,7,6,5,4,3,2,1]
# ]})
# test_sample = bytes(test_sample,encoding = 'utf8')

# # If (key) auth is enabled, don't forget to add key to the HTTP header.
# headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + key1}

# # If token auth is enabled, don't forget to add token to the HTTP header.
# headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + access_token}

# resp = requests.post(aks_service.scoring_uri, test_sample, headers=headers)


# print("prediction:", resp.text)

Clean up

Delete the service, image and model.

%%time
aks_service.delete()
model.delete()

一米田/MachineLearningNotebooks

Deploying a web service to Azure Kubernetes Service (AKS)

Get workspace

Register the model

Create the Environment

Use a custom Docker image

Write the Entry Script

Create the InferenceConfig

Model Profiling

Provision the AKS Cluster

Create AKS Cluster in an existing virtual network (optional)

Enable SSL on the AKS Cluster (optional)

Optional step: Attach existing AKS cluster

Deploy web service to AKS

Test the web service using run method

Test the web service using raw HTTP request (optional)

Clean up

简介

发行版

贡献者

近期动态

一米田/MachineLearningNotebooks .gitee-modal { width: 500px !important; }

Deploying a web service to Azure Kubernetes Service (AKS)

Get workspace

Register the model

Create the Environment

Use a custom Docker image

Write the Entry Script

Create the InferenceConfig

Model Profiling

Provision the AKS Cluster

Create AKS Cluster in an existing virtual network (optional)

Enable SSL on the AKS Cluster (optional)

Optional step: Attach existing AKS cluster

Deploy web service to AKS

Test the web service using run method

Test the web service using raw HTTP request (optional)

Clean up

简介

发行版

贡献者

近期动态

搜索帮助

一米田/MachineLearningNotebooks