The HTTP Event Collector (HEC) is the perfect way to send data to Splunk, at scale, without a forwarder. If you’re a developer looking to push logs into Splunk over HTTP, or you have an IoT use case, then the HEC is for you. We cover multiple deployment scenarios in our docs. I want to focus on a single piece of a distributed deployment built for high availability, throughput and scale: the load balancer.
You can use any load balancer in front of the HEC, but this article focuses on using Nginx to distribute the load. I’m also going to focus on HTTPS, as I’m assuming you care about the security of your data in flight.
You’re going to need to build or install a version of Nginx with the HTTP SSL module enabled. The module is not built by default when you compile from source, so pass the following option to configure:
./configure --with-http_ssl_module
If you install from source and don’t change the prefix then you’ll have everything installed in /usr/local/nginx. The rest of the article will assume this is the install path for Nginx.
Once you’ve got Nginx installed you’re going to need to configure a few key items. First is the SSL certificate. If you’re using the default certificate that ships with Splunk then you’ll need to copy $SPLUNK_HOME/etc/auth/server.pem and place that on your load balancer. I’d highly encourage you to generate your own SSL certificate and use this in place of the default certificate. You can also secure the HTTP Event Collector with your own signed certificate in global settings of the [http] stanza of inputs.conf.
The following configuration assumes you’ve copied server.pem to /usr/local/nginx/conf.
server {
    # Enable SSL on the default HEC port, 8088
    listen 8088 ssl;

    # Configure the default Splunk certificate.
    # The private key is included in server.pem, so use it in both settings.
    ssl_certificate     server.pem;
    ssl_certificate_key server.pem;

    location / {
        # HEC supports HTTP keepalive, so let's use it.
        # Nginx proxies with HTTP/1.0 by default; upstream keepalive needs HTTP/1.1.
        proxy_http_version 1.1;

        # Clear the Connection header in case the client sends it;
        # it could be "close", which would end the keepalive connection.
        proxy_set_header Connection "";

        # Proxy requests to the HEC upstream group
        proxy_pass https://hec;
    }
}
Next we’ll configure the upstream servers. This is the group of servers that are running the HTTP Event Collector and auto load balancing data to your indexers. Please note that you must use a heavy forwarder as the HEC does not run on a Universal Forwarder.
upstream hec {
    # Keep up to 32 idle connections to the upstream servers
    # open in each worker process
    keepalive 32;

    # Splunk instances running the HTTP Event Collector.
    # splunk1 and splunk2 are example hostnames; use your own.
    server splunk1:8088;
    server splunk2:8088;
}
Now let’s put it all together in a working nginx.conf:
# Tune this depending on your resources
# See the Nginx docs
worker_processes auto;

events {
    # Tune this depending on your resources
    # See the Nginx docs
    worker_connections 1024;
}

http {
    upstream hec {
        # Keep up to 32 idle connections to the upstream servers
        # open in each worker process
        keepalive 32;

        # Splunk instances running the HTTP Event Collector.
        # splunk1 and splunk2 are example hostnames; use your own.
        server splunk1:8088;
        server splunk2:8088;
    }

    server {
        # Enable SSL on the default HEC port, 8088
        listen 8088 ssl;

        # Configure the default Splunk certificate.
        # The private key is included in server.pem, so use it in both settings.
        ssl_certificate     server.pem;
        ssl_certificate_key server.pem;

        location / {
            # HEC supports HTTP keepalive, so let's use it.
            # Nginx proxies with HTTP/1.0 by default; upstream keepalive needs HTTP/1.1.
            proxy_http_version 1.1;

            # Clear the Connection header in case the client sends it;
            # it could be "close", which would end the keepalive connection.
            proxy_set_header Connection "";

            # Proxy requests to the HEC upstream group
            proxy_pass https://hec;
        }
    }
}
When you start Nginx you will be prompted to enter the PEM passphrase for the SSL certificate. The passphrase for the default Splunk SSL certificate is "password".
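If an interactive prompt is impractical, for example when Nginx is started by an init system, Nginx 1.7.3 and later can read the passphrase from a file through the ssl_password_file directive. Here is a minimal sketch of the server block with that directive added. The file name hec.pass is a hypothetical choice for illustration; the file is assumed to contain the passphrase on a single line.

```nginx
server {
    listen 8088 ssl;

    ssl_certificate     server.pem;
    ssl_certificate_key server.pem;

    # Hypothetical path: a file holding the key passphrase, one per line.
    # It stores the passphrase in clear text, so restrict its permissions
    # to the user that starts Nginx.
    ssl_password_file   /usr/local/nginx/conf/hec.pass;

    location / {
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_pass https://hec;
    }
}
```

With your own certificate you could instead use a key that has no passphrase, in which case neither the prompt nor this directive is needed.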
To test your setup you can
There are a bunch of settings you may want to tweak including HTTPS Server Optimization, load balancing method, session persistence, weighted load balancing and health checks.
I’ll leave those settings for you to research and implement as I’m not an expert on them all and everyone’s deployment will differ in complexity and underlying resources.
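As a rough starting point for that research, here is a sketch of how a few of these options look in the open source version of Nginx. Both blocks belong inside the http block of nginx.conf. The weights, failure thresholds and cache sizes are illustrative assumptions, not recommendations. As far as I know, active health checks (the health_check directive) are only available in the commercial NGINX Plus, so this sketch relies on passive checks through max_fails and fail_timeout.

```nginx
upstream hec {
    # Load balancing method: prefer the server with the fewest active
    # connections (the default method is round-robin).
    # A method must be declared before the keepalive directive.
    least_conn;

    # Session persistence alternative: replace least_conn with ip_hash
    # to send each client IP address to the same server.
    # ip_hash;

    # Weighted load balancing plus passive health checks.
    # weight=2 biases selection toward splunk1 (assumed here to be the
    # larger host). A server is skipped for 30s after 3 failed attempts
    # within a 30s window.
    server splunk1:8088 weight=2 max_fails=3 fail_timeout=30s;
    server splunk2:8088 weight=1 max_fails=3 fail_timeout=30s;

    keepalive 32;
}

server {
    listen 8088 ssl;

    ssl_certificate     server.pem;
    ssl_certificate_key server.pem;

    # HTTPS server optimization: share a TLS session cache across worker
    # processes so returning clients can skip the full handshake.
    ssl_session_cache   shared:SSL:10m;
    ssl_session_timeout 10m;

    location / {
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_pass https://hec;
    }
}
```

Running nginx -t after each change is a cheap way to catch syntax errors before reloading.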
Hopefully this gives you the foundation for a reliable load balancer for your distributed HTTP Event Collector deployment.
----------------------------------------------------
Thanks!
Scott Haskell