MLOps In Azure…

Jun 11, 2020

In Part 1 of this blog we already discussed: What is MLOps? Why MLOps? DevOps vs MLOps?

Now it’s time to do some hands-on work with MLOps in Azure.

This series of posts assumes the reader has some knowledge of, and experience with, data science workflows.

In this blog we’ll set up a minimal set of end-to-end MLOps pipelines using Azure Machine Learning and Azure Pipelines.

Across this series of posts, we will create 5 Azure Pipelines in this example project.

Each of these pipelines is on an automated trigger: triggered either by a change to code or on a schedule. However, if we make changes and want to trigger the pipelines manually, we can do so through Azure Pipelines.

The pipelines are as follows:

Data Pipeline

First we’ll have a data pipeline to create a dataset and upload it to Azure Blob Storage. This datastore will then be registered with Azure Machine Learning, ready for use in our model training pipeline. We’ll set this up as a daily pipeline.

Environment Pipeline

Second, we’ll create an Azure ML Environment using a custom python package. The custom python package we’ll use is in our Azure Repos git repository.

Often we’ll have one or more internal packages for sharing code for repeatable tasks, such as custom data transformations we’d like to share across model training and model scoring.

We also install any external packages we need into this environment.

We’ll want to use this python package in our model training and model deployment, so each time we merge our code into our master branch, we’ll update the Azure ML Environment with our custom python package.
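The environment pipeline itself is set up later in this series, but as a flavour of what it does, here is a minimal sketch assuming the azureml-core SDK: build our custom package into a wheel, add it to an Azure ML Environment alongside the external dependencies, and register that environment in the workspace. The package, environment and file names here are illustrative assumptions, not necessarily those used in the accompanying repository.

from azureml.core import Environment, Workspace
from azureml.core.conda_dependencies import CondaDependencies

# Authenticate to the workspace (service principal authentication is shown later in this post)
ws = Workspace.from_config()

# Upload a wheel built from our custom package (e.g. via `python setup.py bdist_wheel`)
# and get back a URL that pip inside Azure ML can install from.
whl_url = Environment.add_private_pip_wheel(
    workspace=ws,
    file_path='dist/my_custom_package-0.1.0-py3-none-any.whl',  # illustrative path
    exist_ok=True
)

# External packages we need, plus our own wheel
conda_deps = CondaDependencies.create(
    pip_packages=['scikit-learn', 'pandas', 'joblib', whl_url]
)

env = Environment(name='my_training_environment')  # illustrative name
env.python.conda_dependencies = conda_deps
env.register(workspace=ws)  # re-registering creates a new version of the environment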

Model Training Pipeline

The third pipeline we’ll create is a model training Pipeline to train our ML model and register it to Azure ML Models.

In this pipeline we set up the compute node we’ll be using for training and, on this compute node, we pull in the environment we set up in the previous pipeline. We also mount the datastore we registered in our data pipeline for training our model.

We’ll have this pipeline on a schedule to retrain the model once per week.
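Again, the full training pipeline comes later in the series, but a minimal sketch of what the training step might do, assuming the azureml-core SDK, looks like the following. The compute, environment, script and model names are illustrative assumptions rather than the repository’s exact code.

from azureml.core import Environment, Experiment, ScriptRunConfig, Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

ws = Workspace.from_config()

# Provision a small CPU cluster to train on (in practice you would reuse an existing one)
compute_config = AmlCompute.provisioning_configuration(
    vm_size='STANDARD_D2_V2', max_nodes=1
)
compute_target = ComputeTarget.create(ws, 'training-cluster', compute_config)
compute_target.wait_for_completion(show_output=True)

# Pull in the environment registered by the environment pipeline
env = Environment.get(workspace=ws, name='my_training_environment')

# train.py would read from the datastore registered by the data pipeline,
# fit the model and save it to the run's outputs/ folder.
run_config = ScriptRunConfig(
    source_directory='src/my_custom_package',
    script='train.py',  # illustrative entry point
    compute_target=compute_target,
    environment=env
)

run = Experiment(ws, 'model-training').submit(run_config)
run.wait_for_completion(show_output=True)

# Register the trained model to Azure ML Models
run.register_model(model_name='my_classifier', model_path='outputs/model.pkl')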

Model Deployment Pipeline

Once we’ve got a trained, registered model, we’ll have a Model Deployment Pipeline to deploy the trained models to a web service using Azure Container Instances.

We’ll also run this once a week, after our model training/retraining pipeline completes.
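As before, the deployment pipeline is set up later in the series; a minimal sketch of the deployment step, assuming the azureml-core SDK, might look like the code below. The model, environment, service and scoring script names are illustrative assumptions.

from azureml.core import Environment, Workspace
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

# With no version specified, this fetches the latest registered version of the model
model = Model(ws, name='my_classifier')

# Reuse the registered environment so scoring sees the same custom package as training
env = Environment.get(workspace=ws, name='my_training_environment')
inference_config = InferenceConfig(
    source_directory='src/my_custom_package',
    entry_script='score.py',  # illustrative scoring script with init()/run()
    environment=env
)

deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(
    workspace=ws,
    name='my-classifier-service',  # illustrative service name
    models=[model],
    inference_config=inference_config,
    deployment_config=deployment_config,
    overwrite=True
)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)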

CI Pipeline

The other pipelines are focused on the tasks you’ll probably be more interested in as a data scientist. However, you’ll often be working on code with others, and this will be production code, so it’s important to enforce a set of coding standards and run a suite of unit tests.

In our CI pipeline, every time a user makes a pull request in git, we’ll run pylint for static code analysis and run our automated tests. We’ll publish the test results and code coverage back to Azure DevOps for users to review before choosing to merge the code.
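As a flavour of what those tests might look like, here is an illustrative pytest-style unit test checking the data splits we’ll create later in this post. It is not taken from the repository, and it assumes the custom package is importable as my_custom_package.

# Illustrative only: the kind of unit test the CI pipeline would run with pytest,
# checking the data-creation logic described later in this post.
from my_custom_package.create_data import CreateClassificationData


def test_data_split_sizes():
    data = CreateClassificationData()
    assert len(data.x_train) == 3500
    assert len(data.x_test) == 750
    assert len(data.x_valid) == 750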

The following CI pipeline is the one I often use:

Why Azure Pipelines?

We’ll be using Azure Pipelines in this example to set up and trigger pipelines.

Azure Pipelines are cloud-hosted pipelines that are fully integrated with Azure DevOps. You can either use a yaml file or a UI-based tool in Azure DevOps to set up your pipelines.

They can be used with code stored in a range of repository locations, including Azure Repos and Github.

I’m personally a fan of Infrastructure as Code (IaC), so we’ll be using the yaml definitions of pipelines. To take a line straight from the Zen of Python, “Explicit is better than implicit”: I think you should only have to look at your code to know what will be deployed and run during these pipelines.

Azure Pipelines includes a huge range of functionality for running pipeline tasks such as testing, static code analysis, building and deploying.

As an alternative, we could use Azure Data Factory, which includes a full assortment of integrated ETL tools, to manage the pipelines. However, we won’t need any of that additional functionality, and to keep resources to a minimum we’ll stick to Azure Pipelines.

Why Azure Machine Learning?

Azure Machine Learning provides a whole host of functionality to accelerate the end-to-end machine learning lifecycle to help deploy enterprise-ready ML solutions quickly and robustly.

We’ll be using Azure Machine Learning for a range of tasks concerning the management of the ML lifecycle including:

  • Registration of data sources and mounting of data storage to training environment
  • Storage and versioning of the environment in which to execute training and scoring
  • Managing compute for training experiments
  • Tracking training experiment runs and associated metrics
  • Storage and versioning of trained ML models
  • Deployment of models to a web service using Azure Container Instances (Azure Kubernetes Service is also available)

Set Up Azure DevOps

Navigate to dev.azure.com to sign into Azure DevOps.

You’ll be greeted with a page similar to the one below:

Click on the “+ New Project” Button. Fill in the name of your project and description and click Create:

Now navigate to Azure Repos within your new DevOps project. You should be greeted with a page that looks similar to the screenshot below. Click on the repo dropdown at the top of the page and click “Import Repository” to import the git repository that accompanies these posts.

The clone URL you’ll need to import the repository is https://dev.azure.com/bekeen/MLOps-Example-Project/_git/MLOps-Example-Project

Once imported, you’ll have the git repository in your Azure Repos within your Azure DevOps project, as below:

Create Azure Resources using Azure CLI

With our Azure DevOps Project set up, we can set up the rest of our resources using the Azure CLI.

Create a Resource Group

The first thing we’ll need to do is create a resource group in which to store all of our resources.

Using the Azure CLI tool run:

az group create --name <resource-group> --location <location>

Create an Azure Machine Learning Workspace

Next we’ll create an Azure Machine Learning Workspace. This workspace will manage:

  • Environments that can be used for training and scoring
      • We will be using a custom python package that we will be defining in this project.
      • This package will be pip installed into an Azure ML Environment.
      • We will manage our environment using the Azure ML Workspace.
  • Models
      • Trained models will be stored in our Workspace.
      • Our scoring script can then retrieve the latest models.
  • Datastores in which our training data is stored
      • We will be using a static dataset that we will create here.
      • This could, however, be an ever-changing dataset in blob storage, Azure Data Lake etc. from which retraining could be done.

Using the Azure CLI tool run:

az ml workspace create -w <workspace-name> -g <resource-group>

Create an Azure Storage Account

The Azure Storage Account will be used to store the training and test datasets, as well as a validation dataset.

Using the Azure CLI tool run:

az storage account create --name <storage-account-name> \
--resource-group <resource-group> \
--location <location> \
--sku Standard_ZRS \
--encryption blob

We’ll need the key for this storage account so that we can store data to, and retrieve data from, the storage account. To get this key, run the following using the Azure CLI tool:

az storage account keys list --account-name <storage-account-name> --resource-group <resource-group>

Create a Service Principal with Password Authentication

Our service principal credentials can be used to authenticate access to our Azure ML Workspace.

Using the Azure CLI tool run:

az ad sp create-for-rbac --name <spn-name>

It is important to note down the app ID and password from the creation of this service principal! You’ll also need the tenant ID listed here.
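To see why these values matter, here is a minimal sketch, assuming the azureml-core SDK, of how our pipelines will later use the service principal credentials to authenticate non-interactively against the Azure ML Workspace. The placeholders mirror the CLI commands above.

from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication

# Non-interactive authentication using the service principal created above
spn_auth = ServicePrincipalAuthentication(
    tenant_id='<tenant-id>',
    service_principal_id='<spn-app-id>',          # the app ID noted above
    service_principal_password='<spn-password>'   # the password noted above
)

ws = Workspace.get(
    name='<workspace-name>',
    auth=spn_auth,
    subscription_id='<subscription-id>',
    resource_group='<resource-group>'
)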

Create an Azure Key Vault

We’ll use Azure Key Vault to store credentials.

In this example, we’ll only use the key vault for storing our Azure Storage Account credentials. However, the key vault could be used for storing the credentials for any number of Azure resources.

Using the Azure CLI tool run:

az keyvault create --name <keyvault-name> \
--resource-group <resource-group> \
--location <location>

Store Secrets in Azure Key Vault

Now we can store the Azure Storage Account Key in the Key Vault. Using the Azure CLI tool run:

az keyvault secret set --vault-name <keyvault-name> --name "StorageAccountKey" --value <storage-account-key>
az keyvault secret set --vault-name <keyvault-name> --name "SpnPassword" --value <service-principal-password>

Give the Service Principal Access to Key Vault

Now that we’ve got our service principal set up, we need to give it access to our Key Vault. Using the Azure CLI tool run:

az keyvault set-policy -n <keyvault-name> \
--spn <service-principal-app-id> \
--secret-permissions get list set delete \
--key-permissions create decrypt delete encrypt get list unwrapKey wrapKey

Give Azure DevOps Access to Key Vault

Now we need to give Azure DevOps access to Key Vault so that our pipelines have access to the Key Vault secret variables.

On Azure DevOps, navigate to “Library” under Pipelines and add a new Variable Group. You should be greeted with a screen as below:

Give your Variable Group a name and make sure to toggle on the button that says “Link secrets from an Azure key vault as variables”.

Then select your Azure subscription, Key vault name and add the variables you’d like to include (In this case "StorageAccountKey" and "SpnPassword").

Click “Save” to finish.

Add Azure Environment Variables to Azure DevOps

Using the method above, we’ll also add our other Azure environment variables to Azure DevOps as a variable group.

Here you might do this multiple times with different variables for your different environments (Dev, Test, Prod etc.). We only have one environment, so we’ll just do this assuming our environment is the production environment.

In addition to the names of some resources you defined above, you’ll also need your Tenant ID and Subscription ID at this step. These can be shown using the Azure CLI as follows:

az account show

The Subscription ID can be found under the key "id" and the Tenant ID can be found under the key "tenantId".

As above, navigate to “Library” under Pipelines and add a new Variable Group. This time don’t toggle the key vault button as you did above.

Now add variables for each of StorageAccountName, SpnID, TenantID, AmlWorkspaceName, SubscriptionID, and ResourceGroup as shown below (N.B. SpnID is the App ID for the service principal):

Click “Save” to finish.

Data Pipeline

This pipeline will be a data pipeline to create data and upload it to Azure Blob Storage. This datastore will then be registered with Azure Machine Learning, ready for use in our model training pipeline.

In this example, this data will be simple and static, but this could be a continuously updating dataset that would undergo a more complex ETL pipeline at this stage that we can use to re-train models.

We’ll set this up as a daily pipeline, as if our data was being updated daily.

Pipeline definition

Our Azure Pipeline will be set up as a yaml file.

In our git repository, this can be found in the root of the repository as data_pipeline.yml.

Let’s take a look at this yaml file and then we’ll explain what’s happening at each step:

trigger: none

schedules:
- cron: "0 0 * * *"
  displayName: "Daily midnight data pipeline run"
  branches:
    include:
    - master
  always: true

name: 'data_pipeline'

jobs:
- job: 'data_pipeline_job'
  pool:
    vmImage: 'ubuntu-16.04'
  variables:
  - group: KeyVault
  - group: ProductionEnvVars
  steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: '3.7.6'
      architecture: 'x64'
  - script: |
      python -m pip install --upgrade pip
      pip install -r requirements.txt
    displayName: 'Install requirements'
  - script: |
      python src/my_custom_package/create_data.py
    displayName: 'Create and Register Data'
    env:
      STORAGE_ACCT_NAME: $(StorageAccountName)
      STORAGE_ACCT_KEY: $(StorageAccountKey)
      TENANT_ID: $(TenantID)
      SPN_ID: $(SpnID)
      SPN_PASSWORD: $(SpnPassword)
      AML_WORKSPACE_NAME: $(AmlWorkspaceName)
      RESOURCE_GROUP: $(ResourceGroup)
      SUBSCRIPTION_ID: $(SubscriptionID)

So if we take a dive into what’s happening here:

Schedules

First, the CI trigger is turned off, so that this Pipeline isn’t run every time the code is updated.

There is a cron schedule that runs at 00:00 every day on the master branch. This is run regardless of whether there are any code changes, because there may be data changes.

Jobs

We have set up a pipeline with a single stage, with a single job.

The VM image being used is an Ubuntu 16.04 image.

Variables are extracted from the variable groups we set up earlier in Azure DevOps.

Steps

Python 3.7.6 is being used here. You can define a strategy in which you use multiple python versions (and multiple operating systems) but in this case we’re just using one version of python on one OS.

In the second step, we upgrade pip and install the requirements for our pipeline using the requirements.txt file found in the root of our repository.

In the final step we run the python file at src/my_custom_package/create_data.py in our repository. In the section below, we’ll take a look at what this file is doing.

Note at this last step that we are extracting variables from our variable groups and assigning them as environment variables for this script to use.

Create Data

Running the file src/my_custom_package/create_data.py will:

  • Retrieve azure resource details and keys from the environment variables
  • Create our training, test and validation data sets
  • Upload these datasets to our Azure Storage Account
  • Register the datasets

This is seen in our main function:

def main():
    # Retrieve vars from env
    storage_acct_name = os.environ['STORAGE_ACCT_NAME']
    storage_acct_key = os.environ['STORAGE_ACCT_KEY']
    tenant_id = os.environ['TENANT_ID']
    spn_id = os.environ['SPN_ID']
    spn_password = os.environ['SPN_PASSWORD']
    workspace_name = os.environ['AML_WORKSPACE_NAME']
    resource_group = os.environ['RESOURCE_GROUP']
    subscription_id = os.environ['SUBSCRIPTION_ID']

    # Instantiate Blob Storage Interface
    blob_storage_interface = BlobStorageInterface(
        storage_acct_name, storage_acct_key
    )

    # Create and Upload data to Blob Store
    data_creator = CreateClassificationData()
    data_creator.upload_data(blob_storage_interface)

    # Register Blob Store to AML
    aml_interface = AMLInterface(
        tenant_id, spn_id, spn_password, subscription_id,
        workspace_name, resource_group
    )
    aml_interface.register_datastore(
        TRAINING_CONTAINER, TRAINING_DATASTORE,
        storage_acct_name, storage_acct_key
    )
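The AMLInterface class used at the end of main isn’t detailed in this post; it lives in the accompanying repository. A minimal sketch of what its register_datastore method might do, assuming the azureml-core SDK (the real implementation may differ), is:

# A minimal sketch of AMLInterface.register_datastore, assuming the azureml-core SDK;
# the real class in the repo may differ. self.workspace is assumed to be the Workspace
# obtained via the service principal authentication sketched earlier in this post.
from azureml.core import Datastore


class AMLInterface:
    ...  # __init__ authenticates and stores self.workspace (see the earlier sketch)

    def register_datastore(self, container_name, datastore_name,
                           storage_acct_name, storage_acct_key):
        # Register the blob container as a named datastore in the AML workspace
        Datastore.register_azure_blob_container(
            workspace=self.workspace,
            datastore_name=datastore_name,
            container_name=container_name,
            account_name=storage_acct_name,
            account_key=storage_acct_key
        )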

When the CreateClassificationData() class is instantiated it creates some dummy classification data for us to use to create a classification machine learning model:

class CreateClassificationData():
    def __init__(self):
        x_arr, y_arr = make_classification(
            n_samples=5000,
            n_features=10,
            n_classes=2,
            random_state=1
        )
        col_names = ['A', 'B', 'C', 'D', 'E',
                     'F', 'G', 'H', 'I', 'J']
        x_df = pd.DataFrame(x_arr, columns=col_names)
        y_df = pd.DataFrame({'Target': y_arr})

        # Training set n=3500
        self.x_train = x_df.iloc[:3500]
        self.y_train = y_df.iloc[:3500]

        # Testing set n=750
        self.x_test = x_df.iloc[3500:4250]
        self.y_test = y_df.iloc[3500:4250]

        # Validation set n=750
        self.x_valid = x_df.iloc[4250:]
        self.y_valid = y_df.iloc[4250:]

We create 3 sets of data — a training set, a test set and a validation set of data. When we call the upload_data method of our CreateClassificationData class, this uploads these three sets of data to the blob store:

def upload_data(self, blob_storage_interface):
    self.upload_training_data(blob_storage_interface)
    self.upload_evaluation_data(blob_storage_interface)
    self.upload_validation_data(blob_storage_interface)
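The BlobStorageInterface and the individual upload_* methods also aren’t shown in this post. A minimal sketch of how such an upload might work, assuming the azure-storage-blob (v12) SDK, could look like the following; the repository may use a different storage client, and the container and blob names here are illustrative.

# A minimal sketch of uploading one of the DataFrames to blob storage, assuming the
# azure-storage-blob (v12) SDK; container and blob names are illustrative.
from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient


class BlobStorageInterface:
    def __init__(self, storage_acct_name, storage_acct_key):
        self.client = BlobServiceClient(
            account_url=f"https://{storage_acct_name}.blob.core.windows.net",
            credential=storage_acct_key
        )

    def upload_df_to_blob(self, df, container_name, blob_name):
        try:
            self.client.create_container(container_name)
        except ResourceExistsError:
            pass  # container already exists
        container = self.client.get_container_client(container_name)
        # Serialise the DataFrame to CSV in memory and upload it
        container.upload_blob(
            name=blob_name,
            data=df.to_csv(index=False),
            overwrite=True
        )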

Set Up Pipeline on Azure DevOps

First, you’ll need to have set up the resources as described above.

Now to set up the pipeline, first navigate to Pipelines:

Then click on “New Pipeline”. You’ll be greeted with the screen below. Assuming you’re using Azure Repos git repositories, select the top option:

Once you’ve selected your code source, you’ll select which repository you’re using:

Then you need to configure your pipeline. We’ll be using a yaml file so select the bottom option below (“Existing Azure Pipelines YAML File”):

When you click on this, you have the option to select a yaml file. In this case we’ll be using data_pipeline.yml, so select this yaml file:

You’ll then be taken to a page in which you can review this yaml file. Once you’ve confirmed it’s all okay, click on “Run”:

You may need to grant your pipeline access to your subscription so that it can access the variables it needs.

The pipeline will then run and, if successful, you should see an output similar to the below:

Rename Pipeline

We’ll want to rename our pipeline to something more descriptive, such as “Data-Pipeline”. On the Azure Pipelines page, click on the ellipsis and rename the pipeline:

See Scheduled Runs

To see the runs that are scheduled for the week, click on the pipeline. You’ll then be taken to a page as shown below. Click on the ellipsis in the top-right and click “Scheduled runs”.

Following the same steps, we will create the four remaining pipelines:

  • Environment Pipeline
  • Model Training Pipeline
  • Model Deployment Pipeline
  • Continuous Integration (CI) Pipeline

Then our end-to-end MLOps setup in Azure will be in place, built on Azure DevOps pipelines.

Thank You

Written by Aditya Soni

DevOps & Cloud Engineer, Speaker, Mentor, Community Leader, 1 x (AWS, GCP, Azure), 6 x RedHat, CKA, KCNA
