Azure Service Bus queues are often used to decouple parts of connected systems and to offload heavy-duty processing to background tasks. Loose coupling improves the overall reliability and performance of the system. However, the abstraction that Service Bus introduces makes it harder to notice when message processing fails or doesn’t keep up with demand.

CloudMonix provides a number of ways to verify that messages are processed successfully and in a timely manner.

In this article we’ll explain what metrics should be monitored in systems using Azure Service Bus queues and how they can be configured in CloudMonix:

1. Run CloudMonix Setup Wizard to connect to your Azure environment

If you aren’t using CloudMonix yet, sign up for a free account, then authorize CloudMonix to view your Azure subscription with Azure Service Bus. Learn more about the setup process here.

2. Ensure messages are successfully delivered

When message processing services repeatedly fail to process a message, Azure Service Bus can move that message to a dead-letter queue. This ensures that unprocessable messages do not get stuck.
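
As a rough illustration of the behavior described above, the following toy model retries a failing message a fixed number of times and then sets it aside. It is only a sketch: the `drain` function, the `handler` callback, and the limit of 3 attempts are hypothetical names and values chosen for this example, and no Azure SDK is involved.

```python
from collections import deque

MAX_DELIVERY_COUNT = 3  # hypothetical limit used only in this sketch


def drain(messages, handler, max_delivery_count=MAX_DELIVERY_COUNT):
    """Process messages in order; dead-letter those that keep failing."""
    pending = deque((msg, 0) for msg in messages)  # (message, failed attempts)
    processed = []
    dead_letters = []
    while pending:
        msg, attempts = pending.popleft()
        try:
            handler(msg)
            processed.append(msg)
        except Exception:
            attempts += 1
            if attempts >= max_delivery_count:
                # Give up: the message goes to the dead-letter list.
                dead_letters.append(msg)
            else:
                # Put the message back so it can be retried later.
                pending.append((msg, attempts))
    return processed, dead_letters


def handler(msg):
    # Hypothetical processor that can never handle the "bad" message.
    if msg == "bad":
        raise ValueError("cannot process message")


processed, dead_letters = drain(["a", "bad", "b"], handler)
print(processed, dead_letters)
```

In this model the healthy messages are processed normally, while the failing one ends up in `dead_letters` instead of blocking the queue.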

It is important to monitor dead-letter queues to find out whether any messages can’t be processed. By default, CloudMonix offers metrics and alerts that monitor dead-letters. The default alert triggers if it detects dead-letters in any of the queues in the monitored Service Bus namespace.

To familiarize yourself with how dead-letter queues are monitored:

  • In the configuration dialog for Azure Service Bus, navigate to the “Metrics” tab.
  • Verify that the “Queues” metric exists and is enabled.
  • Navigate to the “Alerts” tab.
  • Ensure that the “Deadletters Detected” alert exists and is enabled.

The “Deadletters Detected” alert is triggered when CloudMonix detects newly dead-lettered messages or messages that are already present in a dead-letter queue. It sends a notification when any of the dead-letter queues in the monitored namespace contains messages.

It may also be important to receive individual notifications when specific queues have dead-letter messages. With CloudMonix, users can also track and alert on dead-letters for a specific queue:

  • In the configuration dialog for Azure Service Bus, navigate to the “Metrics” tab.
  • Define a new metric of type “AzureServiceBusDeadLetterMessageCount”. Select the queue that should be monitored from the “Queue” drop-down (screenshot below).
  • Navigate to the “Alerts” tab.
  • Define a new alert that checks the value of the new metric. The alert will be raised when any dead-letters are detected, i.e. when the value of the new metric is greater than zero (screenshot below).
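
The two alert styles described above (namespace-wide and per-queue) come down to a "greater than zero" check on dead-letter counts. The sketch below shows that logic in plain Python. The function names and the sample counts are hypothetical, and this is not how CloudMonix itself is configured or implemented.

```python
def queues_with_dead_letters(dead_letter_counts):
    """Return the names of all queues whose dead-letter count is above zero."""
    return sorted(
        name for name, count in dead_letter_counts.items() if count > 0
    )


def namespace_alert(dead_letter_counts):
    """Namespace-wide rule: alert if any queue has dead-letters."""
    return len(queues_with_dead_letters(dead_letter_counts)) > 0


def queue_alert(dead_letter_counts, queue_name):
    """Per-queue rule: alert only if the named queue has dead-letters."""
    return dead_letter_counts.get(queue_name, 0) > 0


# Hypothetical dead-letter counts per queue in one namespace.
counts = {"orders": 0, "payments": 3, "emails": 0}

print(queues_with_dead_letters(counts))
print(namespace_alert(counts))
print(queue_alert(counts, "orders"))
```

The namespace-wide rule fires as soon as one queue has dead-letters, while the per-queue rule lets you route notifications for an individual queue separately.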


3. Ensure that messages are being picked up

It is also important to validate that message processors are working and are actually picking up messages.  One way to do this is to alert when the message ID of the next-to-be-processed message does not change over time, which indicates the same message is stuck in the queue for a long period of time.

To alert when the message ID of the oldest message does not change over time:

  • In the ASB configuration dialog, navigate to the “Metrics” tab.
  • Define a new metric that will track the ID of the current oldest message. The metric should be of type “AzureServiceBusOldestMessageId”; select the queue to track (screenshot below).
  • Define a new metric that will track the ID of the previous oldest message. The metric should be of type “AggregateMetric”, use the current oldest message ID metric as its source, and apply the “PreviousValue” aggregation method over a time window. Important: the aggregation period should be over 5 minutes (screenshot below).
  • Navigate to the “Alerts” tab.
  • Define a new alert that checks whether the IDs of the current and previous oldest messages are identical. Specify a meaningful sustained period so that processors have enough time to process messages successfully (screenshot below).
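
The idea behind this check can be sketched outside CloudMonix as follows: sample the ID of the oldest message at regular intervals and raise an alert only if that ID stays the same for the whole sustained period. The `is_stuck` function, the sample format, and the 15-minute period are assumptions made for this illustration.

```python
def is_stuck(samples, sustained_minutes):
    """Report whether the oldest message ID has stayed the same too long.

    samples: list of (minute, oldest_message_id) tuples ordered by time.
             The ID is None when the queue was empty at that moment.
    """
    if not samples:
        return False
    latest_time, latest_id = samples[-1]
    if latest_id is None:
        return False  # an empty queue cannot have a stuck message
    # Walk backwards to find when the current ID was first seen
    # in an unbroken run of identical samples.
    run_start = latest_time
    for time, message_id in reversed(samples):
        if message_id != latest_id:
            break
        run_start = time
    return latest_time - run_start >= sustained_minutes


# Hypothetical samples taken every 5 minutes.
healthy = [(0, "m1"), (5, "m2"), (10, "m3"), (15, "m4")]
stuck = [(0, "m1"), (5, "m1"), (10, "m1"), (15, "m1")]

print(is_stuck(healthy, sustained_minutes=15))
print(is_stuck(stuck, sustained_minutes=15))
```

Requiring the ID to remain unchanged across the full period avoids alerting on a message that is simply still being processed.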

4. Ensure that messages are being processed in a timely manner

Sometimes messages are being picked up, but the system may still be impacted if the services that pull messages from queues or topic subscriptions can’t keep up with the incoming message rate. It’s a good idea to monitor queue lengths and oldest message age, and to alert when they exceed the expected thresholds.

The easiest way to monitor queue lengths, assuming the acceptable length is the same for all queues, is to use the “Queues” metric provided in the default template. Alternatively, you can track the length of an individual queue using the “AzureServiceBusActiveMessageCount” metric type. This is more practical when different queues have different acceptable length thresholds.

To monitor the lengths of all queues:

  • In the ASB configuration dialog, navigate to the “Metrics” tab.
  • Verify that the “Queues” metric exists and is enabled.
  • Navigate to the “Alerts” tab.
  • Define a new alert that checks whether the number of “ActiveMessages” in any queue exceeds the expected threshold, e.g. 100. Specify a meaningful sustained period that will ensure temporary glitches won’t trigger the alert, e.g. 5 or 10 mins (screenshot below).
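
To show why a sustained period filters out temporary glitches, here is a minimal sketch of a threshold check that only fires when every recent sample has been above the limit for long enough. The `sustained_breach` function and the sample data are hypothetical. The threshold of 100 and the 5-minute period mirror the example values above.

```python
def sustained_breach(samples, threshold, sustained_minutes):
    """Report whether the value has stayed above the threshold long enough.

    samples: list of (minute, value) tuples ordered by time.
    """
    if not samples or samples[-1][1] <= threshold:
        return False  # the latest reading is within limits
    latest_time = samples[-1][0]
    # Walk backwards to find where the current run of breaches began.
    breach_start = latest_time
    for time, value in reversed(samples):
        if value <= threshold:
            break
        breach_start = time
    return latest_time - breach_start >= sustained_minutes


# Hypothetical active message counts sampled once per minute.
spike = [(0, 20), (1, 250), (2, 30), (3, 40)]
backlog = [(0, 20)] + [(minute, 150) for minute in range(1, 8)]

print(sustained_breach(spike, threshold=100, sustained_minutes=5))
print(sustained_breach(backlog, threshold=100, sustained_minutes=5))
```

A single spike is ignored because the queue length dropped back below the threshold, while a backlog that persists for the whole period raises the alert.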


To monitor the age of the oldest message in a particular queue:

  • In the ASB configuration dialog, navigate to the “Metrics” tab.
  • Define a new metric of type “AzureServiceBusOldestMessageAgeInMinutes” and specify the queue to monitor (screenshot below).
  • If necessary, specify metrics for other queues in the same way.
  • Navigate to the “Alerts” tab.
  • Define a new alert that checks whether the previously defined metric exceeds the expected threshold, e.g. 10 min. Specify a meaningful sustained period that will ensure temporary glitches won’t trigger the alert, e.g. 10 mins (screenshot below).
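
For intuition, the age of the oldest message can be thought of as the time elapsed since the earliest enqueue timestamp still in the queue. The sketch below computes it that way and compares it with a 10-minute threshold. How CloudMonix derives this metric internally is not covered here, so the function name and the timestamps are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone


def oldest_message_age_minutes(enqueued_times, now):
    """Return the age in minutes of the oldest message, or 0.0 if empty."""
    if not enqueued_times:
        return 0.0
    return (now - min(enqueued_times)).total_seconds() / 60


# Hypothetical enqueue timestamps of messages still waiting in a queue.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
enqueued = [
    now - timedelta(minutes=3),
    now - timedelta(minutes=14),
    now - timedelta(seconds=30),
]

THRESHOLD_MINUTES = 10  # example threshold from the steps above

age = oldest_message_age_minutes(enqueued, now)
print(age, age > THRESHOLD_MINUTES)
```

Unlike queue length, this value keeps growing while a message waits, so it can reveal slow processing even when only a few messages are queued.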

In this article, we explained how to ensure that messages in Azure Service Bus queues are being processed successfully and in a timely manner.

Apart from monitoring and automatic notifications, CloudMonix also offers powerful automation. For example, you can auto-scale Cloud Services and Web Apps based on ASB queue lengths. Learn more here.