Lessons from the Field: Packaging and deploying applications into Azure Batch nodes

Get the Code

Use this PowerShell script to create an Azure Batch account and pool, and to configure a startup task.

Scenario

I’ve been working with a partner who wanted to run CPU-intensive jobs on Azure Batch. The tasks they wanted to run are .NET programs composed of multiple files (100).
You could use the FileStaging feature to copy the files onto the nodes, but this approach can be cumbersome for a large number of files. So we decided to package all the dependencies into a zip file and configure a startup task that copies the file to the batch nodes and unzips it there.

This post explains how to:
- Create an Azure Batch pool and configure a startup task.
- Have the startup task download a zip file stored in Azure Storage.
- Unzip the file once it is downloaded, so the program can be used in the batch tasks.

Step 1 – Create Startup task

 
function ConfigureBatchStartupTask(  [string] $compressedResourceFile,
                                     [string] $compressedResourceSaS 
                                     )
{

    ##############################################################################
    # Configure Batch Startup Task
    # This task will :
    #    1- Download a Zip file from Azure Storage
    #    2- Uncompress it
    #    The Zip file can contain all the programs and files needed to execute the Batch tasks
    ##############################################################################
    $destinyFolder = [System.IO.Path]::GetFileNameWithoutExtension($compressedResourceFile)
    $startUpTask = New-Object Microsoft.Azure.Commands.Batch.Models.PSStartTask
    $batchJobResources=  New-Object System.Collections.Generic.List[Microsoft.Azure.Commands.Batch.Models.PSResourceFile]    
    $batchJobMarlingProgramResource = New-Object Microsoft.Azure.Commands.Batch.Models.PSResourceFile `
                                         -ArgumentList @($compressedResourceSaS,$compressedResourceFile)
    $batchJobResources.Add($batchJobMarlingProgramResource)
    $startUpTask.ResourceFiles = $batchJobResources
    $startUpTask.WaitForSuccess = $true
    $startUpTask.CommandLine ="powershell.exe -nologo -noprofile -command ""& { Add-Type -A 'System.IO.Compression.FileSystem'; [IO.Compression.ZipFile]::ExtractToDirectory('$compressedResourceFile', 'C:\user\tasks\shared\$destinyFolder'); }"""

    $startUpTask
}
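
The nested quoting in the command line is easy to get wrong, so it can help to build the string separately and inspect it before creating a pool. The following is a minimal sketch, not part of the original script: Get-UnzipCommandLine is a hypothetical helper, 'app.zip' is an example file name, and the shared folder path C:\user\tasks\shared is assumed.

```powershell
# Hypothetical helper (not part of the original script): builds the same
# command line as ConfigureBatchStartupTask so the quoting can be inspected.
function Get-UnzipCommandLine([string] $compressedResourceFile)
{
    # Folder name = zip file name without its extension, e.g. 'app.zip' -> 'app'
    $destinyFolder = [System.IO.Path]::GetFileNameWithoutExtension($compressedResourceFile)

    # Inside a double-quoted PowerShell string, "" yields one literal quote,
    # and $variables are expanded when the string is built, before it reaches Batch.
    # The shared folder path below is an assumption.
    "powershell.exe -nologo -noprofile -command ""& { Add-Type -A 'System.IO.Compression.FileSystem'; [IO.Compression.ZipFile]::ExtractToDirectory('$compressedResourceFile', 'C:\user\tasks\shared\$destinyFolder'); }"""
}

# 'app.zip' is an example value
$commandLine = Get-UnzipCommandLine 'app.zip'
Write-Output $commandLine
```

Printing the string this way shows exactly what the node will execute: the zip file name and the destination folder are already substituted, and the doubled quotes have collapsed into single ones.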

This startup task downloads the zip file from the URL passed in $compressedResourceSaS.
I recommend copying the zip file into a private Azure Storage account and then creating a read-only shared access signature (SAS) for the file.
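
One possible way to produce such a read-only URL from PowerShell is sketched below. It assumes the Azure Storage cmdlets from the same generation of Azure PowerShell as New-AzureBatchPool are installed. New-ReadOnlySasUri, its parameter names, and the 24-hour default lifetime are hypothetical choices for illustration, not part of the original script.

```powershell
# Hypothetical helper: returns the blob URI with a read-only SAS token appended.
# Assumes the classic Azure Storage cmdlets are available in the session.
function New-ReadOnlySasUri([string] $storageAccountName,
                            [string] $storageAccountKey,
                            [string] $containerName,
                            [string] $blobName,
                            [int]    $validHours = 24)
{
    # Context used to sign the token with the storage account key
    $storageContext = New-AzureStorageContext -StorageAccountName $storageAccountName `
                                              -StorageAccountKey  $storageAccountKey

    # -Permission r grants read access only; -FullUri returns blob URL plus token
    New-AzureStorageBlobSASToken -Container $containerName -Blob $blobName `
                                 -Permission r `
                                 -ExpiryTime (Get-Date).AddHours($validHours) `
                                 -FullUri -Context $storageContext
}
```

The value returned by a helper like this is what would be passed as $compressedResourceSaS to ConfigureBatchStartupTask. The token should stay valid for as long as nodes may join the pool, because every new node runs the startup task and downloads the file again.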

Step 2 – Create the Azure Batch Pool

Once you’ve created the startup task, you can attach it to the Azure Batch pool by passing it to the New-AzureBatchPool cmdlet through the -StartTask parameter.

function CreateBatchPool([Microsoft.Azure.Commands.Batch.Models.PSStartTask]  $startUpTask,
                         [Microsoft.Azure.Batch.Certificate]                  $accountCertificate,
                         [string] $poolDescription,
                         [string] $poolId ,
                         [string] $poolNodesSize ,
                         [int] $poolNodesNumber,
                         [int] $tasksPerNode

                        )
{

    #########################################################################
    # Create the AzureBatch Pool 
    # Autoscale formula checked every 15 minutes
    # Alternative  : Do it from Azure ServiceFabric, and use the PoolManager
    # To Confirm  : Can we use Autoscale formula and modify the pool size
    #########################################################################

    $certificateReference = New-Object Microsoft.Azure.Commands.Batch.Models.PSCertificateReference
    $certificateReference.StoreLocation = ([Microsoft.Azure.Batch.Common.CertStoreLocation]::CurrentUser)
    $certificateReference.StoreName="My"
    $certificateReference.Thumbprint = $accountCertificate.Thumbprint
    $certificateReference.ThumbprintAlgorithm ="sha1"
    $certificateReference.Visibility = ([Microsoft.Azure.Batch.Common.CertificateVisibility]::Task)
    $certficateReferencesList = New-Object System.Collections.Generic.List[Microsoft.Azure.Commands.Batch.Models.PSCertificateReference]
    $certficateReferencesList.Add($certificateReference)

    # We use an autoscale formula that simply sets a fixed number of nodes.
    $autoscaleFormula = '$TargetDedicated='+$poolNodesNumber+';'

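    # Note: $batchContext is not a parameter of this function; it must already
    # be defined in the calling scope before CreateBatchPool is invoked.
    New-AzureBatchPool -Id $poolId -VirtualMachineSize $poolNodesSize `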
                       -OSFamily 4 -TargetOSVersion * `
                       -MaxTasksPerComputeNode  $tasksPerNode -DisplayName $poolDescription `
                       -AutoScaleFormula $autoscaleFormula `
                       -BatchContext $batchContext -StartTask $startUpTask `
                       -CertificateReferences $certficateReferencesList
}
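
The autoscale formula in CreateBatchPool is built with a single-quoted string for a reason that is easy to miss: $TargetDedicated is a Batch autoscale variable, not a PowerShell one, so PowerShell must not expand it. The sketch below isolates that line; Get-FixedSizeAutoscaleFormula is a hypothetical helper and 4 is an example node count.

```powershell
# Hypothetical helper mirroring the formula construction in CreateBatchPool.
function Get-FixedSizeAutoscaleFormula([int] $poolNodesNumber)
{
    # Single quotes keep PowerShell from expanding $TargetDedicated;
    # the node count is then appended by string concatenation.
    '$TargetDedicated=' + $poolNodesNumber + ';'
}

# 4 is an example value
$formula = Get-FixedSizeAutoscaleFormula 4
Write-Output $formula   # $TargetDedicated=4;
```

With double quotes, PowerShell would try to expand $TargetDedicated as one of its own variables (normally empty), and the Batch service would receive the formula without the variable name.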