Monday, 28 August 2017

Disaster Recovery vs Failover

Hi All,

Today I want to talk about protecting your ERP from disasters.

There are two ways to protect your application from disaster, which differ in approach and recovery time:
  • Disaster Recovery (DR) - a set of policies, procedures and tools that enable the recovery of infrastructure and systems following a natural or human-induced disaster. DR accepts that a disaster can happen and focuses on recovering from it as quickly as possible.
  • Failover Protection - automatic switching to a redundant server, system, hardware component or network upon failure. Failover assumes that the availability of the system should not be affected by any disaster.
Let's see how we can implement each of them in Acumatica.
In the explanations below I'm leaving out network disasters, as that is a separate topic not related to Acumatica.

Disaster Recovery
First of all, DR assumes that there is a time frame for recovery. That time frame varies and depends on the client's requirements, but at the same time it should be reasonable, so that you have enough time to execute the restoration steps.
All recovery steps should be documented and agreed with the client.

In the worst case, the restoration process for Acumatica looks similar to this:
  • Find a spare server that can be used instead of the broken one.
  • Set up Microsoft Windows on the new server.
  • Install all prerequisites: IIS, .NET, SQL Server / MySQL.
  • Install encryption certificates on the new server.
  • Restore the production database from backup on the database server.
  • Install the Acumatica Configuration Wizard.
  • Install a new Acumatica instance and connect it to the existing (restored) database.
  • Launch Acumatica and publish customizations, if any.
  • Ensure everything is working fine locally.
  • Go to the Acumatica Portal, find your license and deactivate the old server. This is required because the Acumatica license is linked to hardware and must be updated if you change anything.
  • Apply the license to the newly installed instance.
  • Update the firewall and other settings to replace the old server with the new one in the network.
  • Test Acumatica from an external network.

That restoration process can be significantly simplified:
  1. After the initial setup you can make a backup of the server disk (for example, with Acronis True Image). In that case you do not need to install anything after restoring the disk; only the database is left to restore.
  2. You can keep Acumatica on a virtual machine. In that case restoring Acumatica means just restoring the VM and the DB.
  3. A spare server can be pre-configured with Acumatica in trial mode. In that case you can simply switch over to it.
  4. Acumatica and the database can be on different servers, so in case of disaster you may need to restore only the DB server or only the application server.
To limit data loss during a disaster, you also have several options:
  • Daily full backups
  • Hourly (or even more often) differential backups.
  • Real-time transaction log shipping to keep a spare database up to date.
  • Database replication with different modes.
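
To show how full, differential and log backups fit together at restore time, here is a minimal Python sketch. It is a simplified model, not an Acumatica or SQL Server tool: the `Backup` class and the `restore_chain` function are hypothetical names, and it assumes the usual rule that you restore the latest full backup, then the latest differential taken after it, then every log backup taken after that.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Backup:
    kind: str            # "full", "diff" or "log" (hypothetical labels)
    taken_at: datetime


def restore_chain(backups: List[Backup], failure_at: datetime) -> List[Backup]:
    """Return the backups to restore, in order, to get as close as possible to failure_at."""
    usable = sorted(
        (b for b in backups if b.taken_at <= failure_at),
        key=lambda b: b.taken_at,
    )
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available before the failure")

    # Start from the most recent full backup.
    full = fulls[-1]
    chain = [full]
    base = full

    # Only the latest differential after that full backup is needed.
    diffs = [b for b in usable if b.kind == "diff" and b.taken_at > full.taken_at]
    if diffs:
        base = diffs[-1]
        chain.append(base)

    # Then every log backup taken after the last restored full/differential.
    chain += [b for b in usable if b.kind == "log" and b.taken_at > base.taken_at]
    return chain
```

The time between the last backup in the chain and the moment of failure is the data you lose, which is why more frequent differential or log backups reduce the loss.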

If you have a spare server ready, the restoration time will depend almost entirely on the size of the database, as restoring it is the longest step.
You also need to pay attention to switching the license to the new production instance after recovery. It is not complicated at all, but it can take up vital time during a disaster.
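
As a rough illustration of that dependency, the sketch below estimates the recovery time from the database size. The function name, the restore throughput and the fixed time for the license switch are all assumptions for the example; measure real values on your own hardware.

```python
def estimate_restore_minutes(db_size_gb: float,
                             throughput_mb_per_s: float,
                             fixed_minutes: float = 0.0) -> float:
    """Estimate recovery time: database restore plus fixed steps (e.g. license switch)."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    restore_seconds = db_size_gb * 1024 / throughput_mb_per_s
    return restore_seconds / 60 + fixed_minutes


# Hypothetical numbers: a 120 GB database restored at 100 MB/s,
# plus 10 minutes for switching the license and checking the instance.
print(round(estimate_restore_minutes(120, 100, fixed_minutes=10), 2))
```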

Failover Protection
Failover protection is real-time monitoring and automatic switching of resources, keeping the application accessible all the time.
In Acumatica this is achievable by configuring a high-availability cluster.

With a cluster you have two or more servers (with Acumatica instances) working in parallel. If any server goes down, the other servers can still handle incoming requests. In that case the system is protected from hardware issues on a single server and does not require immediate disaster recovery.
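
The idea can be sketched in a few lines of Python. This is only a toy model of what a load balancer in front of the cluster does, not Acumatica code: the `Cluster` class, its methods and the node names are hypothetical, and a real setup would mark nodes up or down based on actual health checks.

```python
class Cluster:
    """Round-robin routing that skips nodes marked as down."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._next = 0

    def mark_down(self, node):
        self.healthy.discard(node)

    def mark_up(self, node):
        if node in self.nodes:
            self.healthy.add(node)

    def route(self):
        # Try each node at most once, starting from the round-robin position.
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy node available")


cluster = Cluster(["app1", "app2"])
cluster.mark_down("app1")   # simulate a hardware failure
print(cluster.route())      # requests keep going to the surviving node
```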

With a cluster, you need to remember that the database may also require clustering, as the database is a shared resource within the cluster model.

You also need to know that each Acumatica instance in the cluster requires a separate license. But if you have a license of size M or larger (M, L, XL, XXL, ...), you can split one license into two smaller ones. For example, you can split one M license into two S licenses, and an XL license can be split into two L licenses.
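
As a small illustration of the split rule, here is a Python sketch. The tier order and the rule that a license always splits into two licenses of the next smaller size are my assumptions, generalized from the two examples above; check the actual options with Acumatica before planning a cluster.

```python
# Assumed tier order, smallest to largest (not an official list).
SIZES = ["S", "M", "L", "XL", "XXL"]


def split_license(size: str) -> list:
    """Return the two smaller licenses that one license of the given size splits into."""
    index = SIZES.index(size)  # raises ValueError for an unknown size
    if index == 0:
        raise ValueError("the smallest license cannot be split")
    return [SIZES[index - 1]] * 2


print(split_license("M"))   # ['S', 'S']
print(split_license("XL"))  # ['L', 'L']
```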

Summary
In my experience, many clients are fine with some delay during the disaster recovery procedure and do not really need a high-availability cluster. The DR approach can also save a lot of money, as a cluster will always be more expensive.

Also remember that an Acumatica SaaS deployment ensures 99.95% up-time and takes full responsibility for recovering and restoring the system in case of disaster.
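
To put 99.95% into perspective, the sketch below converts an up-time percentage into allowed downtime. The function name is hypothetical and the measurement period is an assumption; how up-time is actually measured is defined by the service agreement.

```python
def allowed_downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Maximum downtime, in hours, that still meets the given up-time percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100 - uptime_percent) / 100 * period_hours


per_year = allowed_downtime_hours(99.95, 365 * 24)   # about 4.38 hours
per_month = allowed_downtime_hours(99.95, 30 * 24)   # about 0.36 hours (~21.6 minutes)
print(round(per_year, 2), round(per_month * 60, 1))
```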

Have a stable up-time!
