Increase Your Data Flexibility with Explicit Bucket Histograms in Splunk Observability Cloud

By Teneil Lawrence

As your tech landscape expands, so does the need for visibility across your ecosystem. With each new service you develop to meet customer needs come more applications, more servers and more performance data that your teams generate, store and analyze to maintain visibility and reliability. In fact, Statista predicts that organizations’ data volumes will exceed 180 zettabytes by 2025. You could quickly find yourself generating millions of metric time series (MTS) in Splunk Observability Cloud just to monitor the latency of web requests to your network of servers.

However, more data isn’t always better. Collecting and processing performance data across your entire ecosystem with a unique metric for each statistic is not always practical or financially feasible, especially for cloud-forward enterprises. Additionally, pre-calculating an aggregate statistic, such as a percentile, and sending it as a gauge metric to your observability platform limits your flexibility later, when you are building dashboards and troubleshooting.

New in Splunk Observability Cloud: Histogram Data

Enter histograms. As defined by the OpenTelemetry Project, a histogram metric data point conveys a population of recorded measurements in a compressed format. Histograms provide customers with a more flexible way to use performance data in their charts and detectors without increasing costs or obscuring meaningful data. With the recent launch of Explicit Bucket Histograms in Splunk Observability Cloud, users now get native support for histograms as a metric type. They can seamlessly ingest, store and query histograms in the platform to efficiently capture distributions of measurements and perform statistical calculations like percentiles.

Explicit Bucket Histograms in Splunk Observability Cloud

Explicit bucket histograms sort the measurements for a metric into intervals known as buckets, using boundaries that are defined during instrumentation. The buckets do not have to be equally sized, so boundaries can be placed where they matter most. Each bucket tracks the number of observations that fall within its boundaries, and, following the OpenTelemetry data model, the histogram as a whole also records the count, sum, minimum and maximum of all observations, enabling statistical calculations such as percentiles. Instead of a single metric, customers can see the distribution of data points across the buckets to easily identify trends or patterns in their data.
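To make the idea concrete, here is a minimal sketch of an explicit bucket histogram in plain Python. It is an illustration only, not how Splunk Observability Cloud or the OpenTelemetry SDKs implement histograms; the class name, the boundary values and the sample latencies are all hypothetical. It assumes the OpenTelemetry convention that each boundary is an inclusive upper bound, with one extra overflow bucket for anything above the last boundary.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class ExplicitBucketHistogram:
    """Toy histogram (hypothetical): N boundaries define N + 1 buckets."""

    boundaries: list          # sorted upper bounds, chosen at instrumentation time
    bucket_counts: list = field(default_factory=list)
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def __post_init__(self):
        # One bucket per boundary, plus an overflow bucket for larger values.
        self.bucket_counts = [0] * (len(self.boundaries) + 1)

    def record(self, value):
        # bisect_left treats each boundary as an inclusive upper bound,
        # so a value equal to a boundary lands in the bucket that ends there.
        self.bucket_counts[bisect_left(self.boundaries, value)] += 1
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)


# Hypothetical request latencies in milliseconds.
latency_ms = ExplicitBucketHistogram(boundaries=[10, 50, 100, 500])
for sample in [4, 12, 48, 50, 75, 120, 900]:
    latency_ms.record(sample)

print(latency_ms.bucket_counts)  # [1, 3, 1, 1, 1]
print(latency_ms.count, latency_ms.total, latency_ms.minimum, latency_ms.maximum)
```

Seven raw measurements collapse into five bucket counts plus a handful of summary values, and that compact shape stays the same whether the histogram has seen seven requests or seven million.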

Explicit bucket histograms are useful for performance data, such as request latency or response time. For example, a user tracking latency for server requests using explicit bucket histograms in Splunk Observability Cloud will be able to:

See the total number of requests within a period of time and the minimum, maximum and average latency
Calculate the percentage of requests that fell below a particular boundary and determine whether SLOs or SLAs are being met
Identify patterns across a distribution and accurately apply aggregations to histograms
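
As a rough sketch of the second point, the share of requests at or below a bucket boundary can be read straight from the bucket counts. The boundaries, counts and the 95% target below are made-up values for illustration, and the helper function is hypothetical rather than part of any Splunk or OpenTelemetry API.

```python
def fraction_at_or_below(boundaries, bucket_counts, threshold):
    """Fraction of observations at or below `threshold`.

    `threshold` must be one of the bucket boundaries; a histogram cannot
    answer this exactly for values that fall inside a bucket.
    """
    idx = boundaries.index(threshold)  # ValueError if not a boundary
    total = sum(bucket_counts)
    if total == 0:
        return 0.0
    return sum(bucket_counts[: idx + 1]) / total


# Hypothetical latency histogram: boundaries in ms, plus an overflow bucket.
boundaries = [10, 50, 100, 500]
bucket_counts = [120, 640, 200, 35, 5]

fraction = fraction_at_or_below(boundaries, bucket_counts, 100)
slo_target = 0.95  # assumed objective: 95% of requests within 100 ms
slo_met = fraction >= slo_target

print(f"{fraction:.1%} of requests took 100 ms or less; SLO met: {slo_met}")
```

Because the answer is only exact at a boundary, it helps to place boundaries at the latency thresholds your SLOs actually use.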

Smarter Data Management, More Efficient Troubleshooting

With native support for histograms in Splunk Observability Cloud, engineering teams get the flexibility to build the visualizations and detectors they need to maintain service reliability and efficiently troubleshoot issues while controlling costs.

Reduced MTS, Lower Costs

Without histogram data support, engineering teams may need to run special infrastructure to pre-aggregate percentiles before sending in their performance data. This could prove costly at scale and require additional toil and reinstrumentation if teams later needed different percentiles for visibility or troubleshooting. Natively ingesting histograms means users can save on these instrumentation costs. Furthermore, by using a histogram to represent a population of multiple time series metrics, users can reduce their MTS volume and lower costs.

Greater Agility and Accuracy

Pre-computing gauge metrics to represent required percentiles often relies on guesswork to determine which percentiles might be most valuable, and it limits the types of calculations that can be performed after these metrics are ingested. Pre-aggregated percentiles cannot be meaningfully combined: averaging the 99th percentiles of two services, for example, does not yield the 99th percentile of their combined traffic. Performing additional calculations on them can obscure meaningful insights and lead to inaccurate conclusions.

Histograms provide service owners and SREs with greater flexibility when creating charts and detectors in Splunk Observability Cloud. When defining the query for a new chart or detector, users can request any percentile, request a percentile across multiple services and request a percentile over a specific period of time. Because the underlying distribution is preserved, they can aggregate their data for charts and detectors in whatever way helps them understand performance data and troubleshoot performance issues.
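
The difference between combining histograms and combining pre-computed percentiles can be sketched in a few lines of Python. Everything here is an assumption for illustration: the two services, their bucket counts and the helper functions are hypothetical, and the percentile is reported only as a coarse upper bound (the boundary of the bucket that contains it), which is simpler than whatever estimation a real backend performs.

```python
def merge(counts_a, counts_b):
    """Merge two histograms with identical boundaries by adding bucket counts."""
    if len(counts_a) != len(counts_b):
        raise ValueError("histograms must share the same boundaries")
    return [a + b for a, b in zip(counts_a, counts_b)]


def percentile_upper_bound(boundaries, bucket_counts, pct):
    """Upper boundary of the bucket containing the pct-th percentile (pct: 0-100)."""
    total = sum(bucket_counts)
    running = 0
    for i, bucket_count in enumerate(bucket_counts):
        running += bucket_count
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= pct * total:
            return boundaries[i] if i < len(boundaries) else float("inf")
    return float("inf")


boundaries = [10, 50, 100, 500]          # ms; the last bucket is overflow
service_a = [900, 90, 9, 1, 0]           # hypothetical: fast, high traffic
service_b = [0, 10, 40, 50, 0]           # hypothetical: slow, low traffic

p99_a = percentile_upper_bound(boundaries, service_a, 99)   # 50
p99_b = percentile_upper_bound(boundaries, service_b, 99)   # 500

merged = merge(service_a, service_b)
p99_merged = percentile_upper_bound(boundaries, merged, 99)  # 500
naive_average = (p99_a + p99_b) / 2                          # 275.0

print(f"p99 from merged histogram: <= {p99_merged} ms")
print(f"average of per-service p99s: {naive_average} ms")
```

In this made-up data set the averaged figure (275 ms) understates the tail that the merged histogram reveals, which is the kind of inaccurate conclusion that pre-aggregated percentiles can lead to.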

Getting Your Histogram Data Into Splunk Observability Cloud

There are two ways Splunk Observability Cloud users can start ingesting their histogram data today: with the Prometheus receiver in the OpenTelemetry Collector or with OpenTelemetry libraries. The Prometheus receiver scrapes Prometheus histograms and sends them to Splunk Observability Cloud. Many existing infrastructure components, like Kubernetes and Istio, already make histograms available for scraping. Otherwise, users can instrument their code with the OpenTelemetry libraries, which are available for all major programming languages, to send in histograms.

Learn more about explicit bucket histograms in Splunk Observability Cloud, and sign up for a free trial to get started today!

Teneil Lawrence

Teneil Lawrence is a Senior Product Marketing Manager for Splunk Infrastructure Monitoring and related solutions. With more than nine years of experience in growth and B2B product marketing, Teneil is passionate about being the voice of the customer to bridge the gap between customer needs and product strategy. After work hours, Teneil enjoys binging all forms of creative content, cooking and eating, and volunteering.
