Why is understanding small buckets important? Bucket health is worth monitoring because unhealthy bucket growth, especially a disproportionate number of small buckets relative to large ones, can slow or pause searches: each search has to open more index (TSIDX) files and perform more disk I/O. At its worst, this can make search and indexing services unavailable to users. With fewer resources available, indexing queues can become blocked or full, resulting in data latency that impacts alerting and other time-critical searches.
Splunk Enterprise stores indexed data in buckets, which are directories containing both the raw data and index files into that data. An index typically consists of many buckets, organized by the age of the data. To learn more about buckets, see the Splunk documentation. New bucket creation is a normal part of Splunk internal operations: as the volume of indexed data grows, so does the number of buckets. New buckets can also be created by routine system tasks such as indexer cluster restarts, instance shutdowns, and least recently used (LRU) cache eviction.
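If you want a quick look at the buckets behind a given index, the dbinspect command lists each bucket along with its state and size on disk. The following is a minimal sketch; the index name main is only an example, and you should scope the search to the index and time range you are investigating:

| dbinspect index=main
| table bucketId, state, startEpoch, endEpoch, sizeOnDiskMB
| sort - sizeOnDiskMB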
Small buckets, or buckets that were rolled prematurely before reaching their maximum configured size, directly impact search performance: the more buckets a search needs to read, the more resources it requires to complete. A telltale sign of unhealthy bucket growth is therefore the presence of many small buckets.
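One way to quantify this is to group buckets into size bands and count them per index. The following dbinspect sketch uses illustrative thresholds (10 MB and 100 MB), not official guidance; adjust them to your own bucket size configuration:

| dbinspect index=*
| eval size_band=case(sizeOnDiskMB<10, "under 10 MB", sizeOnDiskMB<100, "10-100 MB", true(), "100 MB or more")
| stats count by index, size_band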
In most cases, the presence of very small buckets is indicative of data issues, particularly timestamp mismatches. When events coming into an index fall outside the allowed time span of a bucket, Splunk Enterprise creates a new bucket, which can cause buckets to roll prematurely.
When timestamps vary widely, buckets capture fewer events before they are rolled. This is because Splunk limits the number of hot buckets that can be open at any point in time, and timestamp mismatches cause more hot buckets to be created and rolled.
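A simple way to spot timestamp mismatches is to compare each event's indexed time (_indextime) with its event time (_time). This is a rough sketch, assuming an index named main and a 24-hour window; large positive lags suggest old timestamps and negative lags suggest future timestamps:

index=main earliest=-24h
| eval lag_seconds=_indextime - _time
| stats min(lag_seconds) AS min_lag, avg(lag_seconds) AS avg_lag, max(lag_seconds) AS max_lag by sourcetype
| sort - max_lag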
To determine whether small buckets are impacting your deployment, the key questions to ask are where small buckets are being created and why they are rolling prematurely.
Here are some searches you can run to better understand the distribution and presence of small buckets in your deployment:
On each Cluster Manager, to understand whether buckets are distributed evenly across indexers (recommended time range: 7 days):
| rest splunk_server=local /services/cluster/master/peers
| rename label AS peer_name
| stats sum(bucket_count) AS bucket_count by peer_name
| sort - bucket_count
On each Search Head, to understand why buckets are being rolled and whether they are rolling prematurely (recommended time range: 1 day and 7 days):
index=_internal source=*/splunkd.log* hotbucketroller
| stats count by caller
| sort - count
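To see which indexes account for most of those rolls, you can extend the same search; this assumes the idx field is auto-extracted from the HotBucketRoller log lines in the same way caller is:

index=_internal source=*/splunkd.log* hotbucketroller
| stats count by idx, caller
| sort - count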
If these exploratory searches reveal too many small buckets in your deployment, investigate your data ingestion rules to prevent the problem from recurring. As always, reach out to the Splunk community on Splunk Answers and join an upcoming user group to ask any additional questions about running a high-performing Splunk Enterprise deployment.