Lessons from the Field: Developing secure Batch Jobs with Certificates and Azure KeyVault

Get the Code

Use the PS Script to create a Batch account and configure the Certificates

In this post we will cover how to deploy Certificates to Azure Batch, and how to use those certificates on a Batch Node to access secrets stored in Azure KeyVault.
This is a follow-up to the post : Adopt a secure approach to manage your secrets by using Azure KeyVault,Azure Active Directory and Principals based on Certificates

Scenario

I’ve been working with a partner who uses Azure Batch to encrypt video assets stored on Azure Media Services. Marlin is the format that my partner uses to encrypt the video assets, and Azure Batch is a great fit for running this sort of CPU intensive, long running job. The key benefits that Azure Batch delivers in this scenario include:
1- Azure Batch provides a powerful scheduler to assign jobs to Azure Batch nodes / CPU cores.
2- By using the Auto-scale formula, we can dynamically adapt the number of Batch nodes according to factors like number of queued jobs, CPU, business hours etc.
3- A simple and effective way to create and manage a large number of VMs.
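
A minimal sketch of what such a queue-driven formula could look like, built from PowerShell so that the node limits stay configurable. The function name and the 180-second sampling window are assumptions made for illustration, and $TargetDedicated is the target variable used by the pool script later in this post (more recent Batch API versions call it $TargetDedicatedNodes), so treat this as a starting point rather than a tested formula :

```powershell
# Hypothetical helper : returns an autoscale formula that follows the number of pending tasks.
function New-PendingTasksAutoscaleFormula([int] $minNodes, [int] $maxNodes)
{
    # Single quotes keep Batch service variables such as $PendingTasks literal,
    # so PowerShell does not try to expand them.
    $lines = @(
        ('minNodes = {0};' -f $minNodes),
        ('maxNodes = {0};' -f $maxNodes),
        'samplePercent = $PendingTasks.GetSamplePercent(180 * TimeInterval_Second);',
        # Fall back to minNodes when too few samples are available
        'pending = samplePercent < 70 ? minNodes : avg($PendingTasks.GetSample(180 * TimeInterval_Second));',
        '$TargetDedicated = min(maxNodes, max(minNodes, pending));'
    )
    $lines -join "`n"
}
```

For example, New-PendingTasksAutoscaleFormula -minNodes 1 -maxNodes 10 returns a string that can be passed to the -AutoScaleFormula parameter of New-AzureBatchPool.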

Note : As you might know, Azure Media Services provides dynamic encryption capabilities for the two dominant DRM technologies : PlayReady and Widevine. By using Dynamic Encryption you can make significant cost savings as you won’t have to store multiple copies of your assets, so if your customer wants to use PlayReady or Widevine, I would encourage you to use Dynamic encryption instead of Azure Batch.

Steps :

- Step 0 : The Azure Batch Job will download a Media Services asset to the local Node. To do this in a secure way, we need the Media Services Keys.
- Step 1 : The Marlin exe file will be deployed on the Azure Batch nodes as a start-up task (we will explain this in another post).
- Step 2 : Once the video asset is on the local disk, we will use the exe to encrypt the video.
- Step 3 : Once the video has been encrypted, we will upload the encrypted asset to Azure Media Services and we will delete all the files from the local hard disks on the Azure Batch nodes.
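
The steps above can be sketched as a small wrapper that always cleans the local disk, even when one of the steps fails. The function name and the three script block parameters are hypothetical placeholders for the real download, Marlin encryption and upload logic, which are not shown in this post :

```powershell
# Hypothetical wrapper : runs download -> encrypt -> upload and always deletes the local files.
function Invoke-AssetEncryption(
            [string]      $workingFolder,
            [scriptblock] $download,
            [scriptblock] $encrypt,
            [scriptblock] $upload
    )
{
    New-Item -ItemType Directory -Path $workingFolder -Force | Out-Null
    try
    {
        # Step 0 : download the asset, the script block returns the local file path
        $clearFile     = & $download $workingFolder
        # Step 2 : encrypt the local file, the script block returns the encrypted file path
        $encryptedFile = & $encrypt  $clearFile
        # Step 3 : upload the encrypted asset
        & $upload $encryptedFile
    }
    finally
    {
        # Step 3 (clean-up) : runs on success and on failure
        Remove-Item -Path $workingFolder -Recurse -Force -ErrorAction SilentlyContinue
    }
}
```

Putting the clean-up in a finally block means that a failed encryption or upload does not leave clear video files behind on the node.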

Security is Paramount

To implement this scenario securely, we need to protect the Azure Media Services keys, so we implemented the following approach:
1. We will store the Media Services keys in Azure KeyVault. See post Adopt a secure approach to manage your secrets by using Azure KeyVault,Azure Active Directory and Principals based on Certificates
2. We will deploy the Certificate needed to authenticate into AAD / KeyVault to Azure Batch.

In this post we will cover item 2, deploying the Certificate to Azure Batch.

Note : We could have encrypted the keys on KeyVault using the same certificate, but for the purpose of this exercise we didn’t do it.

Deploying Certificates to Azure Batch Accounts and Pools

This includes four steps :
Step 0 : Generate a Self-signed Certificate for Dev / Test.
Step 1 : Deploy your Certificate to your Azure Batch account
Step 2 : Reference the certificate when you create an Azure Batch Pool
Step 3 : Use the certificate from the application/task code to retrieve the secrets from KeyVault.

Step 0 : Generate a Self-signed Certificate for Dev / Test.

This was covered in the previous post : Adopt a secure approach to manage your secrets by using Azure KeyVault,Azure Active Directory and Principals based on Certificates , but for convenience here is the script that we can use. It assumes that $batchContext and $certificatePassword (a SecureString) are already defined :

       $currentLocation = Get-Location
       $certificatePfxFilePath = [System.IO.Path]::Combine($currentLocation,$batchContext.AccountName +".pfx")
       $certificateCerFilePath = [System.IO.Path]::Combine($currentLocation,$batchContext.AccountName +".cer")

        # Generate the certificate only if the .pfx file doesn't exist yet
        if(! [System.IO.File]::Exists($certificatePfxFilePath))
        {
           # Create the self-signed certificate in the current user's personal store
           $newCer =  New-SelfSignedCertificate -CertStoreLocation Cert:\CurrentUser\My `
                                                -DnsName $batchContext.AccountEndpoint  `
                                                -Provider "Microsoft Enhanced RSA and AES Cryptographic Provider"
           # Export the private key (.pfx, password protected) and the public certificate (.cer)
           $exportedCer = Export-PfxCertificate -FilePath $certificatePfxFilePath -Password  $certificatePassword -Cert $newCer
           $exportedCer = Export-Certificate    -FilePath $certificateCerFilePath  -Type CERT  -Cert $newCer

        }
        # Return the path of the .pfx file
        $certificatePfxFilePath
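
The -Password parameter of Export-PfxCertificate takes a SecureString, so $certificatePassword has to be one. A minimal sketch of two ways to obtain it; the helper name is hypothetical, and a password written in the script, like the one in Option 2, should only be used for Dev / Test :

```powershell
# Hypothetical helper : wraps a clear-text password into a SecureString.
function ConvertTo-CertificatePassword([string] $clearPassword)
{
    ConvertTo-SecureString -String $clearPassword -AsPlainText -Force
}

# Option 1 : prompt the operator, so nothing is written in the script
# $certificatePassword = Read-Host -Prompt "Certificate password" -AsSecureString

# Option 2 : Dev / Test only, the password is visible in the script
$certificatePassword = ConvertTo-CertificatePassword "dev-only-password"
```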

Step 1 : The Ops will deploy the certificate to the Azure Batch Account.

As this is typically an administrative step that IT Ops will do, I wanted to implement this step using a PowerShell script that the IT Ops team could reuse when needed.
Unfortunately the Azure Batch PS cmdlets still don’t support this operation, so I ended up using the Azure Batch Library for .NET from PS. If you execute the PS script, you will also need to copy the library to the script / working folder.

The following function shows how to upload the certificate to an Azure Batch account :

function ConfigureBatchCertificates(
            [string]                                             $certificateFilePath,
            [Microsoft.Azure.Commands.Batch.BatchAccountContext] $batchContext ,
            [SecureString]                                       $certificatePassword

    )
{         

        #####################################################
        # Add Certificates to the Batch Account
        #####################################################          

        # Load AzureBatch .NET SDK
        $assembly = [reflection.assembly]::LoadWithPartialName("Microsoft.Azure.Batch")
        $batchCredentialsObject = New-Object Microsoft.Azure.Batch.Auth.BatchSharedKeyCredentials -ArgumentList @($batchContext.TaskTenantUrl,$batchContext.AccountName, $batchContext.PrimaryAccountKey)
        $batchClientObject = [Microsoft.Azure.Batch.BatchClient]::Open([Microsoft.Azure.Batch.Auth.BatchSharedKeyCredentials ] $batchCredentialsObject)
        $certificateClearPassword = (New-Object System.Management.Automation.PSCredential 'N/A', $certificatePassword).GetNetworkCredential().Password   

        # Add Certificate to AzureBatch Account
        $batchAccountCertificate = $batchClientObject.CertificateOperations.CreateCertificate($certificateFilePath,$certificateClearPassword)
        # If Certificate fails this will throw an error that can be ignored
        $batchAccountCertificate.Commit()
        $batchAccountCertificate
}

Step 2 : Developers will create Batch Pools and they will (just) reference the certificate Thumbprint.

Once the Certificate has been deployed to the Batch account, developers can create Batch Pools and ask Azure Batch to deploy the Certificate on the Batch nodes.

function CreateBatchPool([Microsoft.Azure.Commands.Batch.Models.PSStartTask]  $startUpTask,
                         [Microsoft.Azure.Batch.Certificate]                  $accountCertificate,
                         [string] $poolDescription,
                         [string] $poolId ,
                         [string] $poolNodesSize ,
                         [int] $poolNodesNumber,
                         [int] $tasksPerNode

                        )
{

    #########################################################################
    # Create the AzureBatch Pool 
    # Autoscale formula checked every 15 minutes
    # Alternative  : Do it from Azure ServiceFabric, and use the PoolManager
    # To Confirm  : Can we use Autoscale formula and modify the pool size
    #########################################################################

    $certificateReference = New-Object Microsoft.Azure.Commands.Batch.Models.PSCertificateReference
    $certificateReference.StoreLocation = ([Microsoft.Azure.Batch.Common.CertStoreLocation]::CurrentUser)
    $certificateReference.StoreName="My"
    $certificateReference.Thumbprint = $accountCertificate.Thumbprint
    $certificateReference.ThumbprintAlgorithm ="sha1"
    $certificateReference.Visibility = ([Microsoft.Azure.Batch.Common.CertificateVisibility]::Task)
    $certficateReferencesList = new-object System.Collections.Generic.List``1[Microsoft.Azure.Commands.Batch.Models.PSCertificateReference]
    $certficateReferencesList.Add($certificateReference)

    #We will use an Auto-scale formula that will simply set a number of fixed nodes.  
    $autoscaleFormula = '$TargetDedicated='+$poolNodesNumber+';'

    New-AzureBatchPool -Id $poolId -VirtualMachineSize $poolNodesSize `
                       -OSFamily 4 -TargetOSVersion * `
                       -MaxTasksPerComputeNode  $tasksPerNode -DisplayName $poolDescription `
                       -AutoScaleFormula $autoscaleFormula `
                       -BatchContext $batchContext -StartTask $startUpTask `
                       -CertificateReferences $certficateReferencesList
}

With this script, once the Batch nodes get created, the Azure Batch engine will deploy the certificate on the Batch nodes; to be more precise, into the CurrentUser\My certificate store.
Also, if you did this correctly, you should see the Certificate Reference in the Batch Pool properties in the Azure Management Portal.

Azure Batch Certificates Properties
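
To check the deployment from a task running on a node, the certificate can be looked up by its Thumbprint. A minimal sketch, assuming the task runs under the user account whose CurrentUser\My store received the certificate; the function name is hypothetical :

```powershell
# Hypothetical helper : returns the certificate with the given thumbprint, or nothing if it is not in the store.
function Find-CertificateByThumbprint(
            [string] $thumbprint,
            [string] $storePath = 'Cert:\CurrentUser\My'
    )
{
    # Thumbprints copied from a UI may contain spaces or hidden characters
    $normalized = ($thumbprint -replace '[^0-9A-Fa-f]', '').ToUpperInvariant()
    Get-ChildItem -Path $storePath | Where-Object { $_.Thumbprint -eq $normalized }
}
```

If Find-CertificateByThumbprint returns nothing, the certificate is not in that store, and the StoreLocation, StoreName and Visibility values of the certificate reference are the first things to review.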

Step 3 : Use the certificate from the application/task code to retrieve the secrets from KeyVault.

This step was covered in Step 5 of the post Adopt a secure approach to manage your secrets by using Azure KeyVault,Azure Active Directory and Principals based on Certificates.

The console app included in the repo shows how to :
- Use the Certificate to authenticate against AAD.
- Send the obtained bearer token to Azure KeyVault so the application (batch task) is authenticated.
- Deserialize the secret (JSON object) into a .NET object.
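
The last two bullets can also be sketched in PowerShell against the KeyVault REST API. This is not the console app from the repo : the function names and the api-version value are assumptions made for this sketch, it assumes that a bearer token has already been obtained from AAD with the certificate, and the fields of the returned object depend on the JSON you stored in the secret :

```powershell
# Hypothetical helper : builds the REST URL of a secret (latest version).
function Get-KeyVaultSecretUri(
            [string] $vaultBaseUrl,
            [string] $secretName,
            [string] $apiVersion = '2016-10-01'   # assumption : adjust to the version you target
    )
{
    '{0}/secrets/{1}?api-version={2}' -f $vaultBaseUrl.TrimEnd('/'), $secretName, $apiVersion
}

# Hypothetical helper : reads the secret and deserializes its JSON value into an object.
function Get-KeyVaultSecretObject(
            [string] $vaultBaseUrl,
            [string] $secretName,
            [string] $accessToken
    )
{
    # The bearer token obtained from AAD authenticates the batch task against KeyVault
    $headers  = @{ Authorization = "Bearer $accessToken" }
    $response = Invoke-RestMethod -Method Get -Uri (Get-KeyVaultSecretUri $vaultBaseUrl $secretName) -Headers $headers
    # The secret payload is returned in the "value" property of the response
    $response.value | ConvertFrom-Json
}
```

Keeping the URL construction in its own function makes it possible to check it without calling KeyVault.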