Azure Site Recovery

Overview

I’ve been a systems engineer/administrator for almost 25 years.  Most of that time has been spent in the Microsoft ecosystem.  I’ve dabbled with Linux, Apple, and even Novell.  It seems I always return home to Microsoft.

My current employer is heavily invested in the Microsoft ecosystem.  We are mostly on Windows 10 version 1809 for our desktops and laptops.  We are moving to Windows Server 2019 as time allows.  Our collaboration suite is Office 365.  Our infrastructure is moving more and more into the Microsoft Azure cloud.  We are even investigating Microsoft Teams as an option for our enterprise phone system.

One of the systems I have been most impressed with is Microsoft Azure Site Recovery (ASR).  We have made great use of this product over the last two years or so, and I’d like to tell you about our experiences.

ASR as a Cloud Migration Tool

I first investigated ASR as a method of moving our on-premises servers to the Microsoft IaaS cloud.  The need was for a low-impact lift and shift of server resources.  There are several third-party solutions in the wild, and I’m sure many are very good.  Using native tools has always served me well, so I chose to implement ASR.

This article will not get into the technical weeds.  I’m just going to talk about our approach in broad terms.  If you are looking for technical details I would point you to Microsoft’s documentation site.

Of course, to start out you need a Microsoft Azure tenant.  My current organization is an Enterprise Agreement customer and we had already provisioned our Microsoft Azure tenant.  Within that tenant we have two subscriptions: one for production and the other for dev/test.  For WAN connectivity we started with a site-to-site VPN.  This was fine for basic network connectivity, but we quickly moved to an MPLS ExpressRoute connection.  The workloads we wanted to move were in our datacenter in an underground colocation facility.

Our on-premises environment used VMware as the hypervisor.  ASR will work with VMware or Hyper-V.  There were several server workloads we wanted to move to Azure.  Our end goal was to eliminate the colocation facility.

I began by building a server to be the on-premises migration server.  This server is the communication conduit between the on-premises hypervisor and the Azure IaaS cloud.  It takes care of the synchronization of your live servers into the cloud.  This server is registered with a Recovery Services vault in Azure, and the replicated data lands in a storage account.  The vault is where all of ASR is managed.

Once the infrastructure was in place, I was able to start “protecting” on-premises servers.  The process is fairly simple.  In the Azure portal, you navigate to the vault and then pick the server you want to protect.  ASR kicks off and starts syncing the data in the background.  I only did one server at a time for fear of impacting the WAN too greatly.  The on-premises sync server does some throttling to prevent saturating that WAN connection, but I leaned toward the side of caution anyway.  Within a couple of weeks I had all of the on-premises workloads fully in sync to Azure.
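
If you want a rough feel for how long an initial sync might tie up the WAN before you start protecting a server, a back-of-the-envelope calculation helps.  The sketch below is only an illustration: the function name and the `share_of_link` parameter are my own inventions, not ASR settings, and it ignores compression, ongoing change on the source server, and protocol overhead.

```python
def estimate_sync_hours(data_gb, link_mbps, share_of_link=0.5):
    """Rough hours to copy data_gb over a WAN link.

    share_of_link is the fraction of the link you are willing to let
    replication use (an assumption for planning, not an ASR setting).
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < share_of_link <= 1:
        raise ValueError("sizes and rates must be positive, share in (0, 1]")
    megabits = data_gb * 8 * 1000          # decimal GB -> megabits
    seconds = megabits / (link_mbps * share_of_link)
    return seconds / 3600


# Example: a 500 GB server over a 100 Mbps link, using half the link
hours = estimate_sync_hours(500, 100, share_of_link=0.5)
print(f"about {hours:.1f} hours")
```

Numbers like these are what pushed me toward doing one server at a time rather than queuing everything up at once.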

Now, for many organizations, this might be the stopping point.  It may be that you simply want to use Azure as your emergency failover site in case of some disaster to your on-premises systems.  Azure Site Recovery works quite well for that.  I tested this scenario shortly after getting everything synced to the cloud.  I ran a test failover and brought up one of the servers in Azure.  Once DNS has synced, everything just works.  (Your servers will be on a different subnet in Azure.)  Our goal, however, was to make that failover permanent.

I started scheduling maintenance windows for the server failover process.  Again, this is dead simple.  In the Azure portal, you select the server you want to move and click the failover button.  After you confirm the process, Azure takes over.  After a final sync of any changes to the on-premises server, it is powered off.  Then the server in Azure is fired up.  Once it is online and has picked up an IP address in the Azure cloud, it is just a matter of waiting for DNS to update and sync to all of your domain controllers.  Server names all remain the same.  We didn’t experience any problems in moving over.
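
The "waiting for DNS" step can be watched from a script instead of by hand.  This is a minimal sketch, not something ASR provides: `wait_for_dns` is a hypothetical helper, and the hostname and address in the usage comment are made up.  It only checks whatever DNS server the machine running it uses, so it does not prove that every domain controller has the new record.

```python
import socket
import time


def wait_for_dns(hostname, expected_ip, resolve=socket.gethostbyname,
                 timeout_s=900, interval_s=30, sleep=time.sleep):
    """Poll until hostname resolves to expected_ip or the timeout passes.

    resolve and sleep are injectable so the loop can be exercised
    without touching the network.
    """
    waited = 0
    while True:
        try:
            if resolve(hostname) == expected_ip:
                return True
        except OSError:
            # Lookup failed (socket.gaierror is an OSError); keep waiting.
            pass
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s


# Usage (hypothetical server name and Azure-side address):
#   if wait_for_dns("app01.corp.example.com", "10.1.0.5"):
#       print("DNS now points at the Azure copy")
```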

**One thing to note:  You need to make sure your on-premises servers are configured to use DHCP for IP address assignments.  I created DHCP reservations for each of my servers in advance.  If you don’t have the servers configured this way, they will boot using the old static IP address and you will not be able to get into the server easily to make changes.**
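
One way to catch a static address before the maintenance window is to scan the output of `ipconfig /all` from each server.  The sketch below assumes the English-language output format, where each adapter section starts with an unindented line ending in a colon and contains a "DHCP Enabled" line; the function name is my own, and how you collect the text from each server is left to you.

```python
import re


def adapters_without_dhcp(ipconfig_output):
    """Return adapter names whose 'DHCP Enabled' line says No."""
    static = []
    current = None
    for line in ipconfig_output.splitlines():
        # Adapter headers are unindented and end with a colon,
        # e.g. "Ethernet adapter Ethernet0:"
        if line and not line[0].isspace() and line.rstrip().endswith(":"):
            current = line.rstrip().rstrip(":")
            continue
        match = re.match(r"\s+DHCP Enabled[.\s]*:\s*(\w+)", line)
        if match and current and match.group(1).lower() == "no":
            static.append(current)
    return static


sample = """Windows IP Configuration

Ethernet adapter Ethernet0:

   DHCP Enabled. . . . . . . . . . . : No
   IPv4 Address. . . . . . . . . . . : 10.0.0.5(Preferred)

Ethernet adapter Ethernet1:

   DHCP Enabled. . . . . . . . . . . : Yes
"""
print(adapters_without_dhcp(sample))
```

Any adapter this reports is one to switch to DHCP (with a reservation) before failing the server over.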

ASR for High Availability

Microsoft has built a highly resilient environment in Azure.  Their datacenters are amazing and I would encourage a tour if you ever have the opportunity.  Even with Microsoft, it is still possible that some event could cause an entire datacenter to go dark.  Maybe a major natural disaster damages or destroys the facility.  Possibly a major Internet outage prevents connectivity to that region.  With this in mind, we decided to be extra cautious.  We implemented Azure Site Recovery to sync our important production workloads from our main presence in the North Central region to the South Central region.

This configuration was brand new when we implemented it.  Microsoft took the same sort of process used to provide high availability to on-premises environments and adapted it to function fully in Azure.  This solution does not require a sync server because it can leverage services within both Azure regions natively.

Today, we have all of our critical workloads fully protected and synced to the South Central region.  In our initial testing, we were able to fully fail all of those workloads over and have services restored and running in South Central in about fifteen minutes.  The main limitation here is again how long it takes DNS to fully update and replicate in your environment.

**Another side note:  If you happen to add a new volume to one of your IaaS servers that is protected by ASR in the cloud, that new volume will not automatically be protected.  You need to go into the replication configuration for that server resource and add the new volume to the replication plan.**
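
A simple periodic check can flag volumes that slipped through.  The sketch below only compares two lists of disk names; gathering those lists (from the portal, an export, or whatever tooling you use) is outside its scope, and the disk names shown are invented for the example.

```python
def unprotected_disks(attached, replicated):
    """Return attached disk names that are missing from the replicated list.

    Names are compared case-insensitively; the result is sorted.
    """
    replicated_set = {name.lower() for name in replicated}
    return sorted(d for d in attached if d.lower() not in replicated_set)


# Hypothetical inventory for one server
attached = ["app01-os", "app01-data1", "app01-data2"]
replicated = ["app01-os", "APP01-DATA1"]
print(unprotected_disks(attached, replicated))
```

Anything it returns is a volume to add to that server's replication configuration.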

Wrap Up

Would I do this all again?  You betcha.  Using ASR for migration to Azure and as a high availability solution has given me better sleep and peace of mind.  Combine that with Azure Backup for those same servers and you end up with a very reliable solution.
