Business Continuity and Disaster Recovery (BCDR) for your WSO2 Platform with Microsoft Azure Site Recovery
In my previous article — Reference Architecture — Deploying WSO2 API Manager on Microsoft Azure, I discussed the different Azure workloads which can be used to deploy WSO2 platforms on the Azure cloud.
In this article, I’m going to discuss one strategy for implementing Business Continuity and Disaster Recovery (BCDR) for your WSO2 platform using Microsoft Azure Site Recovery.
BCDR means keeping the business running during outages.
There are many strategies for implementing BCDR for your WSO2 platform, ranging from very simple to complex. They also vary depending on where the platform is deployed: bare metal, VMs, the cloud, and so on. Below are some of the commonly used BCDR strategies in practice:
- Backup/ Restore: this can be as simple as a file system backup/ restore or as complex as a VM backup/ restore, depending on the deployment. This approach comes in handy when an individual file or a set of files needs to be restored or replaced after corruption. It focuses on providing BCDR for smaller workload-related items such as files, database files, configuration files, etc.
- High Availability (HA) deployment: this can be an Active/ Active or Active/ Passive deployment. It can be a local deployment where all nodes, including any secondary nodes, sit next to each other in the same datacenter in the same region, or the nodes can be spread across different datacenters in different regions.
- Geo-redundancy: this is about setting up a Disaster Recovery (DR) site in standby mode in a different region. It can be another on-premises site or a cloud-based site that can be failed over to during datacenter-wide catastrophic failures. This is the topic of this article.
Below is a typical deployment diagram for deploying WSO2 API Manager and Enterprise Integrator within a datacenter. This can be on-premises on bare metal, on-premises on VMs, or in the cloud.
What if one of the VM clusters goes down, or your entire datacenter? Consider any catastrophic failure that could happen.
Ideally, we would have a secondary datacenter as a DR site to provide BCDR. Let’s look at how we can leverage Azure Site Recovery to achieve this, whether you are on-premises or in the cloud.
Azure Site Recovery
Azure Site Recovery (ASR) is Microsoft’s native disaster recovery as a service (DRaaS) offering. It offers ease of deployment, cost effectiveness, and dependability: you deploy replication, failover, and recovery processes through Site Recovery to help keep your applications running during planned and unplanned outages. ASR was named an industry leader by Gartner in 2019 for its completeness of vision and ability to execute.
Azure Site Recovery helps ensure business continuity by keeping business apps and workloads running during outages. It replicates workloads running on physical and virtual machines (VMs) from a primary site to a secondary location of your choice. When an outage occurs at your primary site, you fail over to the secondary location and access apps from there. After the primary location comes back online, you can fail back to it.
Azure Site Recovery — Value Proposition
- Simple to deploy and manage: Set up Azure Site Recovery by replicating an on-premises server/ VM, an AWS VM, or an Azure VM to Azure (for Azure VMs, to a different Azure region) directly from the Azure portal. As a fully integrated offering, Azure Site Recovery is automatically updated with new Azure features as they’re released. Minimize recovery issues by sequencing the order of multi-tier applications running on multiple virtual machines. Ensure compliance by testing your disaster recovery plan without impacting production workloads or end users. And keep applications available during outages with automatic recovery from on-premises to Azure or from Azure to another Azure region.
- Reduce infrastructure costs: Reduce the cost of deploying, monitoring, patching, and maintaining on-premises disaster recovery infrastructure by eliminating the need to build or maintain a costly secondary datacenter. Plus, you pay only for the compute resources you need to support your applications in Azure.
- Minimize downtime with dependable recovery: Easily comply with industry regulations such as ISO 27001 by enabling Site Recovery between separate Azure regions. Scale coverage to as many business-critical applications as you need, backed by Azure’s service availability and support. Restore your most recent data quickly with Site Recovery.
The below deployment diagram depicts how we can implement BCDR for your WSO2 Platform using Azure Site Recovery.
Azure Site Recovery — Deployment Scenarios
Let’s look at different scenarios/ ways in which we can leverage Azure Site Recovery for implementing BCDR for your WSO2 deployment:
- Replication of physical/ bare-metal WSO2 servers from on-premises and third-party service providers to Azure
- Replication of Windows and Linux based WSO2 server VMs hosted in VMware and Hyper-V to Azure
- Replication of Windows-based WSO2 server VMs hosted in AWS to Azure
- Replication of Windows and Linux based WSO2 server VMs in Azure Stack to Azure
Several factors govern a BCDR strategy, including Recovery Time Objective (RTO) and Recovery Point Objective (RPO) goals, storage (IOPS and storage account), capacity planning, network bandwidth, network reconfiguration, and daily change rate. Microsoft’s Azure Site Recovery deployment planning resources can be used to conduct initial planning exercises.
Also, don't forget to check the Support Matrix to understand the prerequisites and limitations of Azure Site Recovery.
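To make planning factors such as daily change rate and network bandwidth more concrete, here is a minimal back-of-the-envelope sketch. It is not part of any Azure tooling; the function name, the decimal unit conversion, and the example figures are assumptions for illustration only, and real sizing should come from the official planning resources.

```python
def required_bandwidth_mbps(daily_change_gb: float, window_hours: float = 24.0) -> float:
    """Average bandwidth (megabits per second) needed to replicate one day's
    worth of changed data within `window_hours`.

    Assumes decimal units (1 GB = 8,000 megabits) and ignores compression,
    protocol overhead, and bursts, so treat the result as a lower bound.
    """
    if daily_change_gb < 0 or window_hours <= 0:
        raise ValueError("daily_change_gb must be >= 0 and window_hours > 0")
    megabits = daily_change_gb * 8 * 1000
    return megabits / (window_hours * 3600)


if __name__ == "__main__":
    # Hypothetical figure: WSO2 nodes that together change 108 GB per day.
    print(f"{required_bandwidth_mbps(108):.1f} Mbps sustained")
```

If changes arrive in bursts rather than evenly across the day, the peak bandwidth needed to hold a given RPO will be higher than this average.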
Azure Site Recovery supports several source environments, such as VMware (with or without vCenter), Hyper-V VMs (with or without SCVMM), physical servers, and Azure VMs. It also supports replicating machines hosted with other cloud service providers like AWS, or with third-party hosting services, using the same process that is used for protecting physical servers.
Replication is managed through an Azure Recovery Services vault, which houses the replication settings. Initial replication can take quite some time. Once it’s complete, Azure Site Recovery replicates only the changed data in incremental chunks, at an interval of your choice defined in a replication policy.
Once replication is in place, you can validate it by running a Test Failover. A Test Failover can be run either through a recovery plan (to orchestrate the failover of multiple machines) or manually for each VM through the Azure portal. Azure Site Recovery also supports Planned and Unplanned failover options, which involve shifting production traffic to the replicated/ secondary site.
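As a rough illustration of why initial replication can take a long time, the sketch below estimates the transfer time for a full copy over a given link. The function name and the example numbers are hypothetical, and the estimate assumes the link is fully available to replication, which is rarely true in practice.

```python
def initial_replication_hours(total_data_gb: float, bandwidth_mbps: float) -> float:
    """Estimated hours to copy `total_data_gb` over a link of `bandwidth_mbps`.

    Assumes decimal units (1 GB = 8,000 megabits) and a link dedicated to
    replication, with no compression or protocol overhead.
    """
    if total_data_gb < 0 or bandwidth_mbps <= 0:
        raise ValueError("total_data_gb must be >= 0 and bandwidth_mbps > 0")
    megabits = total_data_gb * 8 * 1000
    return megabits / bandwidth_mbps / 3600


if __name__ == "__main__":
    # Hypothetical figures: 450 GB of disks replicated over a 100 Mbps link.
    print(f"{initial_replication_hours(450, 100):.1f} hours")
```

The same arithmetic applied to the changed data per interval gives a feel for whether incremental replication can keep up with the chosen interval.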
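The idea of a recovery plan that orchestrates the failover of multiple machines in order can be sketched as a simple model. This is plain Python and not the Azure Site Recovery API; the `RecoveryGroup` class, the `failover_sequence` function, and the machine names are all hypothetical, chosen to mirror a multi-tier WSO2 deployment where databases come up before the servers that depend on them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecoveryGroup:
    """A set of machines that fail over together (hypothetical model)."""
    order: int           # lower numbers fail over first
    machines: List[str]  # machine names within the group


def failover_sequence(groups: List[RecoveryGroup]) -> List[str]:
    """Return machine names in the order they would be failed over.

    Groups are processed by ascending `order`; machines inside a group are
    listed alphabetically only to make the output deterministic.
    """
    sequence: List[str] = []
    for group in sorted(groups, key=lambda g: g.order):
        sequence.extend(sorted(group.machines))
    return sequence


if __name__ == "__main__":
    # Hypothetical tiers: database first, then API Manager, then Enterprise Integrator.
    plan = [
        RecoveryGroup(order=2, machines=["apim-node-2", "apim-node-1"]),
        RecoveryGroup(order=1, machines=["db-node-1"]),
        RecoveryGroup(order=3, machines=["ei-node-1"]),
    ]
    for name in failover_sequence(plan):
        print(name)
```

Running a Test Failover against such an ordered plan is what lets you confirm that dependent tiers start only after the tiers they rely on.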
When your WSO2 platform becomes a key piece of your overall solution, you have to think about how to keep your business running without disruption during catastrophic failures. This is what Business Continuity and Disaster Recovery (BCDR) is about. Implementing the right BCDR solution is non-negotiable if you want to ensure business continuity and protect your workloads from unplanned events. There are many strategies for implementing BCDR. Setting up a geo-redundant/ secondary datacenter provides a way to fail over from your primary datacenter during catastrophic failures. However, setting up another physical datacenter can be very expensive, time-consuming, and difficult to manage/ maintain, and it requires a lot of initial planning and investment.
This is where Azure Site Recovery comes to the rescue. Azure Site Recovery can be cost-effective, easy to set up, and easy to manage/ maintain. You can get started with it at any time and set up a DR site within hours. The right Azure Site Recovery architecture can save you a lot of money while providing first-class BCDR for your WSO2 platform.