Microsoft Azure is an enterprise-class cloud computing infrastructure that provides robust IT resources for modern business applications. Although the Azure SLA promises a high quality of service, Azure administrators still play their part in ensuring the optimal performance of their applications. So as an Azure administrator, how would you evaluate the performance of your Microsoft Azure infrastructure? Below we cover the five Azure performance metrics that will help you evaluate and measure the performance of your Azure infrastructure.

What Azure Performance Metrics Should Be Monitored?

#1- Azure Availability and Response Rate

Perhaps the most critical metric in a cloud-hosted environment is its availability and response rate. A high availability rate means that your business applications and systems on Azure stay up around the clock. A key part of an Azure administrator's job is to routinely monitor and measure the availability and service uptime of Azure resources, both as a whole and at a granular level.

Besides availability, the Azure administrator must also measure and evaluate the time that the Azure infrastructure and its individual resources take to respond to system and user requests. Identifying and optimizing the less responsive resources will help deliver a quicker and more responsive Azure experience.
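
As a rough illustration, both measurements can be derived from periodic health probes. The sketch below is only one way to do it and makes several assumptions: the function names are hypothetical, the probe results and latencies are collected by you (nothing here calls an Azure API), and the 95th percentile is used as the "slow request" indicator.

```python
import math
from statistics import mean

def availability_percent(probe_results):
    """Percentage of health probes that succeeded (True = resource responded)."""
    if not probe_results:
        raise ValueError("need at least one probe result")
    return 100.0 * sum(1 for ok in probe_results if ok) / len(probe_results)

def response_summary(latencies_ms):
    """Average and nearest-rank 95th-percentile response time in milliseconds."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.95 * len(ordered))  # nearest-rank percentile, 1-based
    return {"avg_ms": mean(ordered), "p95_ms": ordered[rank - 1]}
```

Running the same two functions per resource as well as across all resources gives both the overall and the granular view described above.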

#2- Network Capacity and Utilization Metrics

Networks are the communication backbone of cloud-based distributed computing environments. They transfer instructions, queries, and messages across clusters of servers and a multitude of cloud services. Similarly, in an Azure cloud infrastructure, network performance plays a pivotal role in interconnecting the Azure cloud platform and delivering its services to global end users. The key factors that determine network performance in an Azure cloud infrastructure are:

  • Network Availability – the overall percentage of time that network resources and services were available
  • Network Throughput – the available bandwidth and the amount of data moving in and out of the network
  • Network Responsiveness – how quickly the network responds to network requests
  • Network Utilization – the utilization level of the Azure backend network: is it moderately, under- or over-utilized?
  • Network Capacity – how far the current network resources can scale before they reach their threshold limit
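
Utilization in the list above is simply throughput relative to capacity. A minimal sketch follows, assuming you already have both figures in the same unit; the function names and the 30% and 80% cut-offs are illustrative assumptions, not Azure recommendations.

```python
def network_utilization(throughput_mbps, capacity_mbps):
    """Observed throughput as a percentage of the available capacity."""
    if capacity_mbps <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * throughput_mbps / capacity_mbps

def classify_utilization(percent, low=30.0, high=80.0):
    """Label a utilization percentage; the thresholds are assumed defaults."""
    if percent < low:
        return "under"
    if percent > high:
        return "over"
    return "moderate"
```

For example, 450 Mbps of traffic on a 1000 Mbps link works out to 45% utilization, which these assumed thresholds label as moderate.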

#3- Storage Capacity and Scalability

The Azure-hosted storage infrastructure encompasses a broad range of storage and data management services. These range from raw storage capacity (Azure Storage) to web-scalable databases (SQL Azure). As an Azure administrator, you must measure:

  • Storage Availability – the percentage of time that data storage and processing services are available
  • Storage Utilization – the minimum, maximum and average amount of storage and database services used
  • Storage Scalability – the ability of the Azure storage and database infrastructure to scale with growing workloads
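
The minimum, maximum and average mentioned above can be computed from periodic usage samples. The sketch below assumes the samples are in gigabytes and that you know the provisioned quota; the function and key names are hypothetical.

```python
def storage_utilization(samples_gb, quota_gb):
    """Summarize storage usage samples (in GB) against a provisioned quota."""
    if not samples_gb:
        raise ValueError("need at least one usage sample")
    if quota_gb <= 0:
        raise ValueError("quota must be positive")
    peak = max(samples_gb)
    return {
        "min_gb": min(samples_gb),
        "max_gb": peak,
        "avg_gb": sum(samples_gb) / len(samples_gb),
        "peak_percent": 100.0 * peak / quota_gb,  # how close the peak came to the quota
    }
```

A peak percentage that keeps climbing from one reporting period to the next is a simple signal that the storage tier may need to scale.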

#4- Processing Capacity and Scalability

Servers and raw computing instances are where the bulk of the processing is performed. Their processing power and capacity are among the most essential elements in the performance of Azure-hosted applications and services. To ensure consistent and reliable processing resources, Azure administrators must measure:

  • Server Availability – the availability of Windows and virtual machine servers over a given period of time. The higher the percentage, the better
  • Server Utilization – the extent to which the maximum resources of the server are utilized in a given timeframe. This includes processing power, memory, and I/O operations
  • Server Responsiveness – how quickly the servers respond to internal or external requests
  • Server Capacity – the current level of utilization and the remaining available capacity
  • Server Scalability – the extent to which a server or cluster of servers can be scaled to take on additional workload
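
One simple way to turn utilization figures into a remaining-capacity number is to look at the most heavily used resource, since that is the one that will run out first. This is a sketch under that assumption; the function name and the resource keys are hypothetical, and the percentages are expected to come from your own monitoring data.

```python
def server_headroom(utilization):
    """Return the most utilized resource and its remaining capacity in percent.

    `utilization` maps a resource name (e.g. "cpu") to its usage percentage.
    """
    if not utilization:
        raise ValueError("need at least one resource reading")
    bottleneck = max(utilization, key=utilization.get)
    return bottleneck, 100.0 - utilization[bottleneck]
```

With readings of 70% CPU, 85% memory and 40% I/O, memory is the bottleneck and the server has 15% headroom left.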

#5- Azure Recovery

No matter how reliable and optimized your Azure infrastructure is, system downtime and performance bottlenecks are inevitable. Even the best cloud infrastructure can face downtime. Since no one can guarantee 100% availability all the time, Azure recovery metrics describe how efficiently the infrastructure reinstates and resumes system-wide operations after a failure. Some of the key performance metrics for Azure recovery are:

  • Switchover Time – the time it takes to switch over from a failed node or server to an alternate instance
  • System Recovery Time – the minimum expected time for an Azure resource to recover and for operations to move back from the makeshift resource to the original resource
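
Both recovery metrics can be derived from three timestamps in an incident log. The sketch below is one interpretation rather than an official definition: it assumes recovery time is measured from the moment of failure until operations are back on the original resource, and the function and parameter names are hypothetical.

```python
from datetime import datetime

def recovery_metrics(failed_at, standby_active_at, original_restored_at):
    """Switchover and recovery durations in seconds for a single incident."""
    if not (failed_at <= standby_active_at <= original_restored_at):
        raise ValueError("timestamps must be in chronological order")
    return {
        # failure -> alternate instance serving traffic
        "switchover_s": (standby_active_at - failed_at).total_seconds(),
        # failure -> operations resumed on the original resource (assumed definition)
        "recovery_s": (original_restored_at - failed_at).total_seconds(),
    }
```

Tracking these two numbers per incident makes it possible to see whether failover and restore procedures are getting faster or slower over time.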

Microsoft Azure Performance Monitoring Software

The Azure Portal provides basic monitoring for Azure Web and Worker Roles. Users who require advanced monitoring, auto-scaling or self-healing features for their cloud role instances should learn more about CloudMonix. Along with advanced features designed to keep Cloud Services stable, CloudMonix also provides powerful dashboards, historical reporting, integrations with popular ITSM and other IT tools, and much more.

Bonus Tip: see the detailed comparison of CloudMonix vs the native Azure monitoring features.
