Microsoft Azure has released a preview of account failover for customers with geo-redundant storage (GRS) accounts. Customers have long needed a way to decide for themselves when write access to a storage account must be restored, once the replication state of the secondary is understood. This feature puts the account owner in control of when a storage account fails over from the primary region to the secondary region.

In particular, if the primary region for a geo-redundant storage account becomes unavailable, it is now possible to initiate an account failover. When a failover is performed, all data in the storage account fails over to the secondary region, which becomes the new primary region. The DNS records for all storage endpoints are updated to point to the new primary region. As soon as the failover is complete, clients can resume writing data using the endpoints in the new primary region.

The diagram below shows a typical failover workflow. A client writes data to geo-redundant storage (GRS or RA-GRS) in the primary region, and the data is asynchronously replicated to the secondary region. If write requests keep failing for some period of time, the failover can be triggered.

Failover workflow
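The "writes fail for some time" condition in the workflow can be made explicit in client code. The following is a minimal sketch, not part of the Azure tooling: the class name, the threshold, and the policy itself are hypothetical and only illustrate one way to decide when a failover is worth considering.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


class WriteFailureTracker:
    """Tracks how long writes to the primary endpoint have been failing.

    Hypothetical helper: the threshold is an application-level choice,
    not a value defined by Azure Storage.
    """

    def __init__(self, threshold: timedelta) -> None:
        self.threshold = threshold
        self.failing_since: Optional[datetime] = None

    def record(self, succeeded: bool, at: datetime) -> None:
        """Record the outcome of one write request."""
        if succeeded:
            # Any successful write ends the current failure streak.
            self.failing_since = None
        elif self.failing_since is None:
            # First failure of a new streak: remember when it started.
            self.failing_since = at

    def should_consider_failover(self, now: datetime) -> bool:
        """True once writes have failed continuously for the threshold."""
        return (
            self.failing_since is not None
            and now - self.failing_since >= self.threshold
        )


if __name__ == "__main__":
    start = datetime(2019, 2, 1, 12, 0, tzinfo=timezone.utc)
    tracker = WriteFailureTracker(threshold=timedelta(minutes=10))
    tracker.record(succeeded=False, at=start)
    print(tracker.should_consider_failover(start + timedelta(minutes=10)))
```

A signal like this is only a prompt for a human or an automated runbook to look closer; the decision to fail over should also take the replication state of the secondary into account.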

Upon completion of the failover, write operations resume using the new primary service endpoints. After the failover, the storage account is configured as locally redundant storage (LRS). The account therefore needs to be reconfigured as geo-redundant storage (RA-GRS or GRS) to resume replication to the new secondary region. Converting a locally redundant (LRS) account to RA-GRS or GRS incurs a cost.

How to get started with an account failover

All new and existing Azure Resource Manager storage accounts that are configured for RA-GRS or GRS support the preview of account failover. Supported account types are general-purpose v1 (GPv1), general-purpose v2 (GPv2), and Blob storage accounts in the West US 2 and West Central US regions.

Account failover can be initiated from the Azure portal, Azure PowerShell, Azure CLI, or the Azure Storage Resource Provider API. The screenshot below shows one-step initiation of an account failover from within the Azure portal.

Failover from Azure portal
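Besides the portal, a failover can also be started from code through the Storage Resource Provider. The sketch below assumes a recent version of the azure-mgmt-storage Python package, where the storage accounts operations expose begin_failover as a long-running operation, together with azure-identity for authentication. The function names fail_over_account and make_client are hypothetical, and the method name may differ in older SDK releases.

```python
def fail_over_account(client, resource_group: str, account_name: str) -> None:
    """Start an account failover and block until the operation finishes.

    `client` is assumed to behave like
    azure.mgmt.storage.StorageManagementClient.
    """
    # begin_failover starts a long-running operation and returns a poller.
    poller = client.storage_accounts.begin_failover(resource_group, account_name)
    # result() waits for completion and raises if the operation failed.
    poller.result()


def make_client(subscription_id: str):
    """Build a real management client.

    Assumes the azure-identity and azure-mgmt-storage packages are
    installed; they are imported here so the sketch loads without them.
    """
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.storage import StorageManagementClient

    return StorageManagementClient(DefaultAzureCredential(), subscription_id)
```

A call would then look like fail_over_account(make_client("<subscription-id>"), "<resource-group>", "<account-name>"), where all three values are placeholders for your own environment.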

 

Because the feature is currently in preview, account failover should not be used with production workloads: no production SLA is available until it becomes generally available.

Important note: account failover usually involves some data loss because of geo-replication latency. Replication to the secondary region is asynchronous, so the secondary endpoint is typically behind the primary endpoint, and any data that has not yet been replicated to the secondary region is lost after the failover.

It is recommended to check the Last Sync Time property before initiating a failover to evaluate how far the secondary endpoint is behind the primary. Learn more about the account failover features and implications in the documentation “What to do if an Azure Storage outage occurs”.
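The Last Sync Time check can be reduced to simple date arithmetic. The following is a minimal sketch under stated assumptions: the function names and the 15-minute limit are hypothetical, and the Last Sync Time value is assumed to have been read already (for example from the account's geo-replication statistics) as a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def replication_lag(last_sync_time: datetime, now: datetime) -> timedelta:
    """Return how far the secondary is behind the primary.

    Writes made to the primary after last_sync_time may not have reached
    the secondary and could be lost in a failover.
    """
    # Clamp at zero in case of clock skew between the two timestamps.
    return max(now - last_sync_time, timedelta(0))


def lag_is_acceptable(
    last_sync_time: datetime, now: datetime, max_lag: timedelta
) -> bool:
    """True if the potential data-loss window is within max_lag.

    max_lag is an application-level tolerance (hypothetical policy).
    """
    return replication_lag(last_sync_time, now) <= max_lag


if __name__ == "__main__":
    now = datetime(2019, 2, 1, 12, 0, tzinfo=timezone.utc)
    last_sync = datetime(2019, 2, 1, 11, 48, tzinfo=timezone.utc)
    print(replication_lag(last_sync, now))  # 0:12:00
    print(lag_is_acceptable(last_sync, now, timedelta(minutes=15)))  # True
```

In this example, writes from the last 12 minutes before the failover would be at risk, which an operator can weigh against how long the primary region is expected to stay unavailable.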
