Customers love us for our out-of-the-box integrations and built-in dashboards for services like ActiveMQ. Underneath our no-fuss solutions, SignalFx runs the most powerful monitoring service for modern applications.
One of our customers recently ran into an ActiveMQ problem that they couldn’t pin down without SignalFx. In their version of ActiveMQ, messages sometimes get “stuck” in the queue, and message consumers won’t pick them up even when they have available capacity. Messages affected by this bug are never delivered.
Every occurrence was a critical issue, because every “stuck” message was a client request going unfulfilled. Other monitoring tools couldn’t detect the condition when it happened, because they provided no visibility into messages that never make it out of the queue. There was no way for them to tell whether messages were stuck in the queue, or for how long. The issue became a problem for their own customers and business.
They solved this problem by writing a tool that inspects each enqueued message, calculates the average and maximum age of the messages in each queue, and then reports those metrics to SignalFx using our Java client library. They’ve let us share this project with the wider community. You can find it on GitHub here.
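To make the age calculation concrete, here is a minimal sketch in Java of how average and maximum message age could be derived for one queue. It is not the customer’s code: the class name MessageAgeStats and the method fromEnqueueTimes are hypothetical, and it assumes you already have each message’s enqueue time in epoch milliseconds (for a JMS message, Message.getJMSTimestamp() is one possible source). Reporting the results to SignalFx is left out.

```java
import java.util.List;

// Hypothetical helper, for illustration only: summarizes how long the
// messages currently sitting in one queue have been waiting.
final class MessageAgeStats {
    final int count;
    final double averageAgeMillis;
    final long maxAgeMillis;

    private MessageAgeStats(int count, double averageAgeMillis, long maxAgeMillis) {
        this.count = count;
        this.averageAgeMillis = averageAgeMillis;
        this.maxAgeMillis = maxAgeMillis;
    }

    // enqueueTimesMillis: assumed enqueue timestamps (epoch millis), one per message.
    // nowMillis: the time of measurement, passed in so the result is reproducible.
    static MessageAgeStats fromEnqueueTimes(List<Long> enqueueTimesMillis, long nowMillis) {
        long totalAge = 0L;
        long maxAge = 0L;
        for (long enqueuedAt : enqueueTimesMillis) {
            // Clamp at zero in case of clock skew between producer and monitor.
            long age = Math.max(0L, nowMillis - enqueuedAt);
            totalAge += age;
            maxAge = Math.max(maxAge, age);
        }
        int n = enqueueTimesMillis.size();
        // An empty queue reports zero for both metrics.
        double average = (n == 0) ? 0.0 : (double) totalAge / n;
        return new MessageAgeStats(n, average, maxAge);
    }
}
```

Taking the measurement time as a parameter, rather than reading the clock inside the method, keeps the calculation easy to test. A real collector would run this per queue on a schedule and publish the two values as gauges.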
We’ve built a dashboard that displays these metrics, making it easy to see whether any queues hold messages that have been waiting… and waiting… and waiting…
Using these metrics from inside each message queue, we can create intelligent detectors that alert when there’s a message stuck in the queue and unable to be delivered. For example, you can create a detector that fires when the oldest message in the queue has been getting older for at least 5 minutes. To build this, we use the analytics function “Rate of change,” which tells us how quickly a metric is changing.
In this example, rate of change tells us how much older the oldest message in each queue is getting each time we measure it. When this function is greater than 0, it tells us that a message is sitting in the queue, aging. If this continues for a long time, it could indicate that one or more messages are stuck and not being picked up.
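The detector logic can be illustrated outside of SignalFx with a small Java sketch. This is not how SignalFx detectors are implemented or configured; StuckMessageDetector and observe are hypothetical names, and the code simply mimics the rule “rate of change of the oldest message’s age has been greater than 0 for at least 5 minutes” over a stream of samples.

```java
import java.time.Duration;

// Illustrative only: a local stand-in for "rate of change > 0 for N minutes".
final class StuckMessageDetector {
    private final long requiredMillis;
    private boolean hasPrevious = false;
    private long previousTimeMillis;
    private long previousAgeMillis;
    private long risingSinceMillis = -1L; // -1 means the age is not currently rising

    StuckMessageDetector(Duration requiredDuration) {
        this.requiredMillis = requiredDuration.toMillis();
    }

    // Feed one sample of the oldest-message age for a queue.
    // Returns true when the age has been rising for the required duration.
    boolean observe(long sampleTimeMillis, long oldestAgeMillis) {
        // Positive rate of change: this sample is older than the previous one.
        boolean rising = hasPrevious && oldestAgeMillis > previousAgeMillis;
        if (rising) {
            if (risingSinceMillis < 0) {
                // The rise started at the previous sample.
                risingSinceMillis = previousTimeMillis;
            }
        } else {
            // The oldest message was delivered (or the queue emptied): reset.
            risingSinceMillis = -1L;
        }
        hasPrevious = true;
        previousTimeMillis = sampleTimeMillis;
        previousAgeMillis = oldestAgeMillis;
        return rising && (sampleTimeMillis - risingSinceMillis) >= requiredMillis;
    }
}
```

With new StuckMessageDetector(Duration.ofMinutes(5)) and one sample per minute, the detector stays quiet while messages are being consumed, because the oldest age keeps dropping back whenever the head of the queue is delivered. It fires only once the same message has kept aging across five minutes of consecutive samples.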
Now our customer has detailed information about the messages in their ActiveMQ queues and can monitor the conditions that really matter to their business.
When you need visibility into metrics that matter, get it done with SignalFx. Start a free trial to begin monitoring with SignalFx today!