How to Backup Your Microsoft Azure Storage Accounts

Written by miguel-bernard | Published 2020/05/29
Tech Story Tags: dotnet | azure | devops | azure-devops | backup | cloudservices | cloud | code-quality | web-monetization

TLDR Azure Storage Account is simple, works great, and has crazy SLA and redundancy capabilities. But it doesn't provide a point-in-time restore, so if you corrupt or delete data, there's no built-in way to recover it. There are solutions you can find on the internet, but unfortunately, none of them is a silver bullet. One common problem with those solutions is managing backups for multiple environments and getting notified on failures.

Azure Storage Account is one of the foundational services of Azure. Pretty much all other services use it under the covers in one way or another. The service is simple, works great, and has crazy SLA and redundancy capabilities. However, it doesn't provide a point-in-time restore, meaning that if you corrupt or delete some data, there's no way to recover it. There are solutions you can find on the internet, but unfortunately, none of them is a silver bullet. Let's explore and compare some of them.

Requirements

Let's first define some goals and requirements before we jump into solutions:
  1. Backups must run on a schedule (e.g., every hour)
  2. Backups must be fast
  3. Minimize network traffic $$$
  4. The solution must be easy to maintain and operate
  5. Get notification or report on failures
  6. Must support multi-environment configuration (dev, QA, prod, etc...)
  7. Support for both Blob and Table storage

AzCopy

AzCopy is a great tool for moving data around in Azure, and it's also what most people will suggest for backups. Point it at a source and a destination backup storage account and voilà! AzCopy performs the copy inside the Azure datacenter, directly between the two storage accounts, instead of routing the data through the machine that runs it. This feature alone makes AzCopy an obvious choice, as it covers two of the most important objectives:
  • No. 2: Backups must be fast
  • No. 3: Minimize network traffic $$$
That's why most backup solutions use this tool.
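For reference, here is a minimal sketch of a v8-style AzCopy invocation (the version the scripts below rely on). The account names, container names, and keys are placeholders, not real resources:

```powershell
# Illustrative only: 'srcaccount', 'backupaccount', the container names,
# and both keys are placeholders you would replace with your own values.
& "C:\Program Files (x86)\Microsoft SDKs\Azure\AzCopy\AzCopy.exe" `
    /source:https://srcaccount.blob.core.windows.net/data `
    /dest:https://backupaccount.blob.core.windows.net/data-backup `
    /sourcekey:"<source-account-key>" /destkey:"<dest-account-key>" `
    /S /Y
```

The `/S` flag copies recursively and `/Y` suppresses confirmation prompts, which matters when the command runs unattended on a schedule.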

Solution 1: Running from your PC

AzCopy is a command-line tool, so it's pretty easy to run from your machine or an on-prem server to back up your storage accounts. You can create a Windows scheduled task to run that command, and you are good to go. It works fine when you have a few storage accounts, but it grows complex really fast as you add more.
Pros
  • Easy to set up and automate
Cons
  • No easy way to get notified on a failure
  • Not easy to set it up for multiple environments
  • Scalability
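As a sketch of the "easy to set up" part, registering an hourly scheduled task with PowerShell might look like this; the script path and task name are placeholder assumptions:

```powershell
# Hypothetical example: the script path and task name are placeholders.
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument '-File C:\backup\Backup-Storage-Accounts.ps1'
# Repeat the trigger every hour, starting now (requirement No. 1)
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) `
    -RepetitionInterval (New-TimeSpan -Hours 1)
Register-ScheduledTask -TaskName 'StorageAccountBackup' `
    -Action $action -Trigger $trigger
```

This covers the schedule, but as the cons list shows, it gives you nothing for failure notifications or multi-environment configuration.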

Solution 2: Azure Automation

Azure Automation allows you to run PowerShell scripts directly in Azure. With that service, you can easily create a script that launches AzCopy with the right arguments to target your storage accounts.
Pros
  • Easy to set up a schedule
Cons
  • Not that simple to install a .exe and run it on the hosted sandbox (you need to run it with a Hybrid Runbook Worker)
  • Not easy to set it up for multiple environments
  • No notification on failures
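For illustration, the scheduling side of Azure Automation is straightforward: you create a schedule and link it to a runbook. All resource names below are placeholders, and the cmdlets are from the AzureRM module the rest of this article uses:

```powershell
# Hypothetical names: 'rg-automation', 'aa-backups', 'Backup-Runbook'.
# Create an hourly schedule in the Automation account...
New-AzureRmAutomationSchedule -ResourceGroupName 'rg-automation' `
    -AutomationAccountName 'aa-backups' -Name 'HourlySchedule' `
    -StartTime (Get-Date).AddMinutes(10) -HourInterval 1
# ...and attach the backup runbook to it.
Register-AzureRmAutomationScheduledRunbook -ResourceGroupName 'rg-automation' `
    -AutomationAccountName 'aa-backups' -RunbookName 'Backup-Runbook' `
    -ScheduleName 'HourlySchedule'
```

The hard part isn't the schedule; it's that the runbook itself can't easily invoke AzCopy.exe on the hosted sandbox.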

Solution 3: Let's try thinking outside the box

What about using Azure DevOps? It may sound crazy, but let's look at it more closely. One of the common problems with the previous solutions is managing backups for multiple environments and getting notified on failures. Another is installing AzCopy on the agent or machine that runs the backup. Good news, though: AzCopy is already installed by default on all Azure DevOps hosted agents.
So here's a general idea: Create a pipeline that runs on a schedule and performs backup for each environment.
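If you use YAML pipelines instead of the classic editor, the "runs on a schedule" part can be expressed with a cron trigger. This snippet is a sketch; the branch name and display name are assumptions:

```yaml
# Hypothetical trigger: run at the top of every hour against master,
# even when there are no new commits (always: true).
schedules:
- cron: "0 * * * *"
  displayName: Hourly storage account backup
  branches:
    include:
    - master
  always: true
```

The rest of this article uses the classic release pipeline UI, where the same schedule is configured from the artifact trigger settings instead.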
1. Create a PowerShell module with utility methods that make AzCopy easier to use, and upload it to your Git repository
EDIT: You can now download all these scripts from GitHub. Also, special thanks to my good friend Yohan, who cleaned up the whole thing and added the restore functionality.
Backup-Storage-Accounts.psm1
function Build-AzCopyCmd
{
  param(
    [Parameter(Mandatory)]
    [string]$DestinationPath,
    [Parameter(Mandatory)]
    [PSCustomObject]$SrcCtx,
    [Parameter(Mandatory)]
    [PSCustomObject]$DestCtx,
    [Parameter(Mandatory)]
    [string] $AzCopyParam,
    [Parameter(Mandatory)]
    [string] $SourcePath
  )

  $azCopyExe = "C:\Program Files (x86)\Microsoft SDKs\Azure\AzCopy\AzCopy.exe"
  $srcStorageAccountKey = $SrcCtx.StorageAccount.Credentials.ExportBase64EncodedKey()
  $destStorageAccountKey = $DestCtx.StorageAccount.Credentials.ExportBase64EncodedKey()
  $destContainer = $DestCtx.StorageAccount.CreateCloudBlobClient().GetContainerReference($DestinationPath)

  # Keep the whole command on one logical line; stray newlines would break 'cmd /c'
  return [string]::Format(
      """{0}"" /source:{1} /dest:{2} /sourcekey:""{3}"" /destkey:""{4}"" {5}",
      $azCopyExe, $SourcePath, $destContainer.Uri.AbsoluteUri,
      $srcStorageAccountKey, $destStorageAccountKey, $AzCopyParam)
}

function Invoke-AzCopyCmd
{
  param(
    [Parameter(Mandatory)]
    [string]$AzCopyCmd
  )

  $result = cmd /c $AzCopyCmd
  foreach($s in $result)
  {
    Write-Host $s
  }

  if ($LASTEXITCODE -ne 0)
  {
    # 'throw' (instead of 'break') surfaces the failure to the caller and fails the run
    throw "Copy failed!"
  }

  Write-Host "Copy succeeded!"
  Write-Host "-----------------"
}

function Backup-Blobs
{
  param(
    [Parameter(Mandatory)]
    [string]$DestinationPath,
    [Parameter(Mandatory)]
    [PSCustomObject]$SrcCtx,
    [Parameter(Mandatory)]
    [PSCustomObject]$DestCtx,
    [Parameter(Mandatory)]
    [array] $SrcStorageContainers
  )

  Process {
    foreach ($srcStorageContainer in $SrcStorageContainers)
    {
      if($srcStorageContainer.Name -like '*$*')
      {
        Write-Host "-----------------"
        Write-Host "Skipping copy: $($srcStorageContainer.Name)"
        Write-Host "-----------------"
        Continue;
      }

      Write-Host "-----------------"
      Write-Host "Start copying: $($srcStorageContainer.Name)"
      Write-Host "-----------------"

      $blobDestinationPath = $DestinationPath + "/blobs/" + $srcStorageContainer.Name
      $azCopyParam = "/snapshot /y /s /synccopy"
      $sourcePath = $srcStorageContainer.CloudBlobContainer.Uri.AbsoluteUri
      $azCopyCmd = Build-AzCopyCmd -DestinationPath $blobDestinationPath `
          -SrcCtx $SrcCtx -DestCtx $DestCtx -AzCopyParam $azCopyParam `
          -SourcePath $sourcePath
      Invoke-AzCopyCmd -AzCopyCmd $azCopyCmd
    }
  }
}

function Backup-Tables
{
  param(
    [Parameter(Mandatory)]
    [string]$DestinationPath,
    [Parameter(Mandatory)]
    [PSCustomObject]$SrcCtx,
    [Parameter(Mandatory)]
    [PSCustomObject]$DestCtx,
    [Parameter(Mandatory)]
    [array] $SrcStorageTables
  )

  Process {
    foreach ($srcStorageTable in $SrcStorageTables)
    {
      Write-Host "-----------------"
      Write-Host "Start copying: $($srcStorageTable.Name)"
      Write-Host "-----------------"

      $tableDestinationPath = $DestinationPath + "/tables/" + $srcStorageTable.Name
      $azCopyParam = "/y"
      $sourcePath = $srcStorageTable.CloudTable.Uri.AbsoluteUri
      $azCopyCmd = Build-AzCopyCmd -DestinationPath $tableDestinationPath `
          -SrcCtx $SrcCtx -DestCtx $DestCtx -AzCopyParam $azCopyParam `
          -SourcePath $sourcePath
      Invoke-AzCopyCmd -AzCopyCmd $azCopyCmd
    }
  }
}

function Backup-StorageAccount
{
  param(
    [Parameter(Mandatory)]
    [PSCustomObject]$SrcCtx,
    [Parameter(Mandatory)]
    [PSCustomObject]$DestCtx
  )

  $currentDate = (Get-Date).ToUniversalTime().ToString('yyyy\/MM\/dd\/HH:mm')
  $srcStorageAccountName = $SrcCtx.StorageAccount.Credentials.AccountName
  $destinationPath = $srcStorageAccountName + "/" + $currentDate

  $srcTables = Get-AzureStorageTable -Context $srcCtx

  if($srcTables)
  {
    Backup-Tables -DestinationPath $destinationPath -SrcCtx $SrcCtx `
        -DestCtx $DestCtx -SrcStorageTables $srcTables
  }

  $maxReturn = 250
  $token = $null

  do
  {
    $srcContainers = Get-AzureStorageContainer -MaxCount $maxReturn `
        -ContinuationToken $token -Context $SrcCtx

    if($srcContainers)
    {
      $token = $srcContainers[$srcContainers.Count - 1].ContinuationToken
      Backup-Blobs -DestinationPath $destinationPath -SrcCtx $SrcCtx `
          -DestCtx $DestCtx -SrcStorageContainers $srcContainers
    }
  }
  while ($null -ne $token)
}

function Get-StorageAccountContext
{
  param(
    [Parameter(Mandatory)]
    [string]$StorageAccountName,
    [Parameter(Mandatory)]
    [string]$StorageAccountResourceGroup
  )

  $storageAccountKey = (Get-AzureRmStorageAccountKey -ResourceGroupName $StorageAccountResourceGroup `
      -AccountName $StorageAccountName).Value[0]
  return New-AzureStorageContext -StorageAccountName $StorageAccountName `
      -StorageAccountKey $storageAccountKey
}

Export-ModuleMember -Function Get-StorageAccountContext
Export-ModuleMember -Function Backup-StorageAccount
2. Create a script to launch the backup
Backup-Storage-Accounts.ps1
param(
  [Parameter(Mandatory)]
  [string]$EnvironmentCode
)

# $destStorageAccountName, $destResourceGroup, and $srcStorageAccounts are expected
# to come from your per-environment configuration, keyed on $EnvironmentCode.

$destCtx = Get-StorageAccountContext -StorageAccountName $destStorageAccountName `
    -StorageAccountResourceGroup $destResourceGroup

foreach($srcStorageAccount in $srcStorageAccounts)
{
  $srcStorageAccountName = $srcStorageAccount.storageAccountName
  $srcResourceGroup = $srcStorageAccount.resourceGroup
  $srcCtx = Get-StorageAccountContext -StorageAccountName $srcStorageAccountName `
      -StorageAccountResourceGroup $srcResourceGroup
  Backup-StorageAccount -SrcCtx $srcCtx -DestCtx $destCtx
}
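The script references `$destStorageAccountName`, `$destResourceGroup`, and `$srcStorageAccounts` without defining them; they have to come from your environment configuration. One way to supply them (purely illustrative, not part of the original repository) is a JSON settings file per `$(EnvironmentCode)`, using the same `storageAccountName`/`resourceGroup` property names the script already reads:

```json
{
  "destStorageAccountName": "mybackupaccountdev",
  "destResourceGroup": "rg-backups-dev",
  "srcStorageAccounts": [
    { "storageAccountName": "appstoragedev", "resourceGroup": "rg-app-dev" },
    { "storageAccountName": "logstoragedev", "resourceGroup": "rg-logs-dev" }
  ]
}
```

A file like `settings.dev.json` could then be loaded in the script with `ConvertFrom-Json`, keeping one file per environment (dev, QA, prod) in the repository.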
3. Create a task group in Azure DevOps
4. Add the Azure PowerShell task so the script runs in an Azure context, authenticated to the storage accounts you want to back up (make sure Azure DevOps has access to your Azure subscription)
5. Here's the YAML of that task.
Note that I used variables to make it reusable across environments:
  • $(subscription) -> So you can target different Azure subscriptions
  • $(RepositoryPath) -> Git repository path that contains the scripts
  • $(EnvironmentCode) -> Variable that can be used in your PowerShell script to make custom logic depending on the environment
  steps:
  - task: AzurePowerShell@3
    displayName: 'Run Backup Azure PowerShell Script'
    inputs:
      azureSubscription: '$(subscription)'
      ScriptPath: '$(RepositoryPath)\src\Backup\Scripts\Backup-Storage-Accounts.ps1'
      ScriptArguments: '-EnvironmentCode $(EnvironmentCode)'
      FailOnStandardError: true
      azurePowerShellVersion: LatestVersion
6. Create a new Release build pipeline with all your environments as stages
7. Add an artifact and select Azure Repos
8. Set a schedule to kick off the backups on all environments
9. Configure each environment to use your Task Group
10. Configure your variables for each environment
11. You can now see all the backup runs and get notifications for free when one fails (the default behavior of Azure DevOps when a build fails)
Pros
  • Easy to set up for multiple environments with the release pipeline
  • Audit trail of all the changes in the release pipeline
  • Overview of all the backup runs
  • Built-in notifications on failure
Cons
  • You need an Azure DevOps account (but you can subscribe to a free plan here)

Conclusion

Sometimes the solution is more straightforward than it looks. Leverage the tools you already know, think outside the box, and you'll find that your absurd ideas might not be that crazy after all. An essential skill for a developer is the ability to solve business problems in new, creative ways using technology.

Written by miguel-bernard | Miguel is passionate about teaching, developers' communities and everything related to .Net.
Published by HackerNoon on 2020/05/29