Platform

December 11, 2023

4 Minute Read

Splunk Edge Processor and Federated Search: Do I Need It?

By Joseph Kandatilparambil

In today's data-driven landscape, organizations are confronted with an overwhelming volume of data, which is often accompanied by budgetary constraints. To address these challenges, a thoughtful data tiering strategy is crucial. This can be done by developing the practice of:

Understanding and ranking datasets, from the most critical down to the least used.
Storing these datasets across platforms that have the right balance of cost and performance.

Once that strategy is in place, powerful data management and federated search capabilities become imperative: they give you the flexibility to access datasets across different platforms and to correlate them whenever the use case at hand calls for it.

Simplifying Data Management and Reach With Splunk

Our goal at Splunk is to make data management and accessibility easy and flexible for our customers, so you can gain value from your voluminous data more efficiently. To that end, we’ve made a couple of big announcements this year:

Launch of Splunk Edge Processor
General availability of Federated Search for Amazon S3

Splunk Edge Processor is a service offering deployed at the edge, with a data control plane accessible from Splunk Cloud Platform. It is designed to help customers transform data more efficiently close to the data source, control data placement, and gain better visibility into data in motion. With Edge Processor, customers can filter, transform, and route data from the edge into Splunk indexes or Amazon S3 buckets.

Federated Search for Amazon S3, on the other hand, is a new capability that allows customers to search data from their Amazon S3 buckets directly from Splunk Cloud Platform without the need to ingest it into Splunk.

In this blog, we will dive into how Splunk Edge Processor and Federated Search for Amazon S3 can help build and implement data strategies to efficiently maximize the value derived from your data.

Edge Processor Streamlines Data Management

When addressing data transformation, Splunk Edge Processor is designed to extract only the critical data, employing data reduction techniques to streamline data ingestion into Splunk indexes.

Capturing and cleaning data close to the source is crucial, especially for sensitive datasets that cannot leave the organization's network boundaries. This way, organizations can ensure that only essential, clean data gets ingested into Splunk. Any extraneous data? You can store that in external storage such as Amazon S3.

Now, let's look at how you can implement these policies on edge processors.

In addition to the two major announcements, Splunk also announced SPL2, an updated version of Splunk’s search language. SPL2 caters to users with diverse query language backgrounds by blending SPL and SQL syntax for familiarity. Its features include built-in functions, the ability to create custom functions and custom data types, support for comments, and many more, all aimed at concise and powerful data queries.

Now imagine this: Edge Processor pipelines are written in SPL2! That means the tasks you implement using SPL2 can be part of your Edge Processor pipelines, including:

Filtering verbose logs
Extracting critical portions of the logs
Masking sensitive fields
Applying transformation commands

All this to say: you can now build data pipelines specific to your organization’s needs.
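
To make the tasks listed above concrete, here is a minimal sketch written in Python rather than SPL2. It is not Edge Processor code and does not use any Splunk API; it only mimics the filter, extract, mask and route steps on plain strings. The patterns, the `status` field and the destination labels (`splunk_index`, `s3_archive`) are assumptions made purely for illustration.

```python
import re
from typing import Iterable, Iterator

# Hypothetical patterns, chosen only for this illustration.
DEBUG_LEVEL = re.compile(r"\bDEBUG\b")
CARD_NUMBER = re.compile(r"\b\d{4}-\d{4}-\d{4}-(\d{4})\b")
STATUS = re.compile(r"\bstatus=(\d{3})\b")


def run_pipeline(events: Iterable[str]) -> Iterator[dict]:
    """Filter, mask, extract and route raw log lines."""
    for raw in events:
        # Filter: drop verbose DEBUG lines entirely.
        if DEBUG_LEVEL.search(raw):
            continue
        # Mask: keep only the last four digits of card-like numbers.
        masked = CARD_NUMBER.sub(r"XXXX-XXXX-XXXX-\1", raw)
        # Extract: pull the HTTP-style status code into its own field.
        match = STATUS.search(masked)
        status = int(match.group(1)) if match else None
        # Route: server errors go to the index, everything else to the archive.
        is_error = status is not None and status >= 500
        destination = "splunk_index" if is_error else "s3_archive"
        yield {"_raw": masked, "status": status, "destination": destination}
```

Feeding this function a handful of sample lines shows the idea: noisy lines disappear, sensitive values are masked before they leave the network, and only the critical events head for the more expensive tier.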

Today, Splunk Edge Processor can receive data from many different sources, such as Universal Forwarders, HTTP Event Collector, syslog and many more, and route it to destinations including Splunk Cloud Platform, Splunk Enterprise and Amazon S3. Check out the full list of supported sources and destinations.

Splunk Federated Search For Amazon S3 Transforms Data Exploration

In recent years, Amazon S3 has become one of the most popular storage platforms because of its ease of use and storage capabilities. It serves many different use cases: web applications writing data to S3, analytical data stores, compliance and long-term retention, and many more.

Now with Splunk Federated Search for Amazon S3 you can make these data sets available to Splunk — which means you can use Splunk’s powerful search language to explore them and correlate these data sets with data in Splunk. Yes, this includes data that an Edge Processor sends to Amazon S3.

And an added benefit that Edge Processor provides: data written by Edge Processor is partitioned by time and stored in JSON format in Amazon S3. This enables Splunk Federated Search for S3 to work with the dataset efficiently.
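
Why time partitioning helps can be illustrated with a small Python sketch. The key layout below (`year=/month=/day=/hour=`) and the `edge-archive` prefix are hypothetical and are not the documented Edge Processor layout. The point is only that when events are grouped by time, a search over a time range needs to read just the partitions that overlap that range.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(epoch_seconds: float, prefix: str = "edge-archive") -> str:
    """Build an hourly partition prefix (hypothetical layout) from a timestamp."""
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return f"{prefix}/year={ts:%Y}/month={ts:%m}/day={ts:%d}/hour={ts:%H}"


def group_by_partition(events: list) -> dict:
    """Group events into newline-delimited JSON payloads, one per partition."""
    batches = defaultdict(list)
    for event in events:
        batches[partition_key(event["_time"])].append(json.dumps(event, sort_keys=True))
    return {key: "\n".join(lines) for key, lines in batches.items()}


def partitions_for_range(start: int, end: int) -> list:
    """List the hourly partitions a search over [start, end] would need to read."""
    keys = []
    hour = start - start % 3600  # round down to the start of the hour
    while hour <= end:
        keys.append(partition_key(hour))
        hour += 3600
    return keys
```

With this layout, a two-hour search touches at most three hourly prefixes, no matter how many months of data sit in the bucket.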

Federated Search for Amazon S3 works by integrating Splunk with AWS Glue Data Catalog, which provides the schema and metadata that Splunk Cloud Platform needs to interpret compatible datasets in Amazon S3. This allows Splunk to search various data formats such as JSON, CSV, Parquet and ORC, as well as compressed files like bzip, gzip, and many more.

This integration enhances the search capabilities for Splunk users, providing a comprehensive and streamlined data exploration experience.
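
The role a catalog schema plays can be sketched in a few lines of Python. This is a simplified stand-in, not the AWS Glue API or the way Splunk reads S3: `TABLE_SCHEMA` is a hypothetical table definition, and the function applies it to a gzip-compressed, newline-delimited JSON payload so that each record comes back with known columns and types.

```python
import gzip
import json

# Hypothetical, simplified stand-in for a catalog table definition:
# column name -> Python type used to interpret the stored value.
TABLE_SCHEMA = {"_time": float, "host": str, "status": int}


def read_with_schema(compressed: bytes, schema: dict) -> list:
    """Decompress newline-delimited JSON and project it onto the schema."""
    rows = []
    for line in gzip.decompress(compressed).decode("utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        # Keep only the schema's columns; missing values become None.
        rows.append({
            name: cast(record[name]) if record.get(name) is not None else None
            for name, cast in schema.items()
        })
    return rows
```

Without the schema, the payload is just bytes in a bucket; with it, a search layer knows which fields exist and how to compare them.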

What Next?

Now that we have learned how Edge Processor and Federated Search for Amazon S3 work together to simplify data management and reach, let's see this in action. Here’s a video of how Buttercup Enterprises, a fictional gaming company, is looking into using Splunk’s Edge Processor and Federated Search for Amazon S3 to solve its data engineering problems.

There are many ways customers could use Federated Search and Edge Processor. This blog is intended as a primer on these two features and as a starting point for ideas on how to apply them to a specific challenge in your organization.

Get Edge Processor Today

If you are a current Splunk Cloud Platform customer hosted in the US, EMEA (Dublin, Frankfurt, Germany), UK (London), or APAC (Tokyo, Japan and Singapore) Splunk Cloud regions, you can get access to Edge Processor today. Contact your Splunk sales representative, or send an email to EdgeProcessor@splunk.com with your company name, Splunk Cloud stack name, and Splunk Cloud region. If you are a Splunk Cloud Platform customer hosted in another Splunk Cloud region, you can also contact your Splunk sales representative or send an email to get on the list to be enabled once Edge Processor is available in your region.

For more about Edge Processor, including release plans to support additional sources, destinations, and new functionality, see the release notes and documentation.

This blog was co-authored by Joseph Kandatilparambil, Principal Technical Marketing Engineer and Raja Tamilarasan, Senior Sales Engineer at Splunk.

Joseph Kandatilparambil

Joseph Kandatilparambil is a Technical Marketing Engineer at Splunk focusing on enhancing customer experiences in domains like cloud solutions, data analytics, data lakes and data storage. His work involves modernizing analytics and developing hybrid cloud strategies. Outside of work Joseph enjoys hiking, rock climbing, kite-surfing and cooking.
