In the last installment of this blog series, we discussed the basic setup of Splunk Workload Management and how to allocate resources for ingestion and search workloads. In this post, I'll describe workload pool settings for a complex deployment and how to achieve more predictable search execution.
Most production Splunk deployments are complex. Users search data in one or more indexer clusters through many search heads or search head clusters. As a result, configuring workload pools on the search tier (search heads) and the indexer tier (indexers) needs to be planned well. When workload management is enabled, a search is placed in a workload pool on the search tier based on the workload rules defined on that search head. On the indexer tier, the search is placed in the pool with the same name as the one it was assigned on the search head tier. If no pool with that name exists on the indexer tier, the search is placed in the default search pool. On indexers, workload pools are defined in the workload_pools.conf file, which is pushed to all indexers by the Cluster Master.
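As a rough illustration, the stanzas below sketch what such a workload_pools.conf might contain. The pool names and numbers are hypothetical, and the exact attribute names differ between Splunk versions, so treat this as a sketch and check the workload_pools.conf specification for your release rather than copying it as-is.

    # workload_pools.conf -- illustrative sketch only; attribute names and
    # required stanzas vary by Splunk version, so consult the spec file first
    [workload_pool:standard_perf]
    cpu_weight = 70
    mem_weight = 70

    [workload_pool:high_perf]
    cpu_weight = 30
    mem_weight = 30

On an indexer cluster, a file like this would typically live in a configuration bundle on the Cluster Master and be pushed to the peers with a bundle push, so that every indexer ends up with an identical set of pools.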
The example below shows an indexer cluster shared by a search head cluster and a standalone search head. The search head cluster runs user searches and has two search pools: high priority and standard. The standalone search head is used exclusively by our Splunk Enterprise Security (ES) application.
Example Deployment Scenario
Based on the configuration of workload pools, searches from the Splunk ES application will be placed in the SearchES pool on the search head and in the pool of the same name on the indexers. Similarly, high priority searches assigned to the HPSearch pool on the search head cluster will run in the same pool on the indexers. By changing the resource allocation of the SearchES pool on the indexer cluster, you can balance the performance requirements of Splunk ES against those of the user searches run on the search head cluster. You may further divide the SearchES pool into subpools to allocate resources for data model acceleration (DMA) and other ES searches. Keep an eye out for more on this topic in a future blog!
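To make the placement concrete, here is a sketch of how the routing rules for this scenario could be expressed on the search head tier in workload_rules.conf. The predicates and the sec_ops role are assumptions for illustration (the ES app ID and the predicate fields available to you may differ by version), and the referenced pools (SearchES, HPSearch, StdSearch) would need to be defined in workload_pools.conf on both tiers so that searches land in the matching pool on the indexers.

    # workload_rules.conf on the standalone ES search head (illustrative sketch)
    [workload_rule:route_es_searches]
    predicate = app=SplunkEnterpriseSecuritySuite
    workload_pool = SearchES

    # workload_rules.conf on the search head cluster (illustrative sketch;
    # the sec_ops role is a hypothetical example)
    [workload_rule:route_high_priority]
    predicate = role=sec_ops
    workload_pool = HPSearch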
Predictable execution of high priority searches is a key requirement for many customers. High priority searches could be scheduled searches from a security operations team, data model acceleration for the Splunk Enterprise Security application, and so on. Search execution depends on many factors, and resource availability is one of the key ones. In a resource contention scenario, other (non-priority) workloads may starve the high priority searches, impacting both their execution time and their predictability.
Creating a ‘reserved’ pool with sufficient resources and placing high priority searches in that pool allows for more predictable execution even when the system is busy. Of course, the key here is to create a resource pool that is large enough relative to the number of searches that will be placed in it. Overcrowding the ‘reserved’ pool with too many searches will defeat the purpose.
The example below illustrates the impact of workload management on high priority searches. The system consists of 3 indexers and a search head with the resource allocation shown below. The HPSearch pool is configured with a 15% CPU allocation in this example, but in practice you would typically give this pool more resources (even as high as 70-80%). If unused, its CPU cycles are shared with workloads allocated to other pools.
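For reference, the pool split used in this test could look something like the following on the indexer tier. Only the 15% HPSearch figure comes from the test description; the remaining numbers are assumptions chosen so the sketch adds up.

    # workload_pools.conf for the test setup (illustrative sketch; only the
    # 15% HPSearch allocation is taken from the test, the rest is assumed)
    [workload_pool:HPSearch]
    cpu_weight = 15
    mem_weight = 15

    [workload_pool:StdSearch]
    cpu_weight = 85
    mem_weight = 85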
The results show the execution time (in seconds) for searches S1 through S9. The blue graph shows the execution time without workload management enabled. The red graph shows the execution time with workload management enabled, with S2 placed in the HPSearch pool and all other searches placed in the StdSearch pool. The comparison shows a ~41% reduction in search execution time for S2 when placed in the HPSearch pool, with only a minor increase in the execution time of the other searches. It is important to understand that Workload Management does not increase your overall system resources; it allocates resources to higher priority workloads as defined by the admin.
Workload Management provides a powerful and flexible tool to focus Splunk resources on your most important workloads. In complex deployments especially, Workload Management can be used to ensure resource isolation and allocation across workloads, teams, or applications.
Keep an eye out for our next blog in which I will describe the new Workload Management capabilities launching in an upcoming product release.