CloudWatch Logs offers a great way of collecting all of your performance and operational logs from your AWS environment into one location. Because it is a flexible platform, logs from many sources can be collected into multiple log groups, each potentially holding different sources and therefore different log formats. For example, VPC Flow Logs, CloudTrail and RDS logs all have different structures. This post explores how any log files from CloudWatch can be ingested into Splunk regardless of format, and where the example given can be extended or varied for other use cases.
Splunk offers many ways of ingesting AWS data: via the AWS Add-On, serverless with Lambda or Kinesis Firehose, or even automated and serverless with Project Trumpet. However, Kinesis Firehose is the preferred option for CloudWatch Logs, as it allows log collection at scale and with the flexibility of collecting from multiple AWS accounts.
One of Firehose's capabilities is the option of calling out to a Lambda function to transform or process the log content. This lets you open up the message package delivered from CloudWatch Logs and re-format it into a Splunk event. It also adds the flexibility to set event information such as source, sourcetype, host and index based on the log content.
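To make the mechanics concrete, here is a minimal sketch (not taken from any of the functions discussed in this post) of the contract a Firehose transformation Lambda must honour: each incoming record carries base64-encoded data, and every recordId must be returned with a result of Ok, Dropped or ProcessingFailed.

```python
import base64

def lambda_handler(event, context):
    """Minimal sketch of the Firehose data-transformation contract."""
    output = []
    for record in event['records']:
        payload = base64.b64decode(record['data'])
        # ...inspect/re-format payload here (e.g. into a Splunk HEC event)...
        output.append({
            'recordId': record['recordId'],   # must echo every recordId back
            'result': 'Ok',                   # or 'Dropped' / 'ProcessingFailed'
            'data': base64.b64encode(payload).decode('utf-8'),
        })
    return {'records': output}
```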
We’ve already seen one version of this transformation in action when Splunk and AWS released the Firehose integration. The Lambda function used in that example extracts VPC Flow Logs so they can be sent to Splunk, and is available as an AWS Lambda blueprint: kinesis-firehose-cloudwatch-logs-processor or kinesis-firehose-cloudwatch-logs-processor-python. This blog takes things a step further, providing the basis for a common log collection method into Splunk that can be used for ANY of your CloudWatch logs.
As a recap, the architecture for ingesting logs with Firehose is shown below:
Most of the steps needed to set up Firehose and Splunk are covered in this earlier blog. You will also need to refer to the setup process described here, noting the steps that differ from those listed in that blog, and adding a new Lambda function.
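As one illustrative step (the full walkthrough is in the linked blogs), a log group is attached to the Firehose with a subscription filter. This boto3 sketch uses hypothetical names and ARNs that you would replace with your own:

```python
import boto3

logs = boto3.client('logs')

# Hypothetical names/ARNs -- substitute your own log group, Firehose
# delivery stream, and an IAM role that CloudWatch Logs can assume.
logs.put_subscription_filter(
    logGroupName='/aws/rds/mydatabase1/audit',
    filterName='to-splunk-firehose',
    filterPattern='',   # an empty pattern forwards every log event
    destinationArn='arn:aws:firehose:us-east-1:123456789012:deliverystream/splunk-firehose',
    roleArn='arn:aws:iam::123456789012:role/cwl-to-firehose',
)
```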
In the previous blog, the Lambda function template extracts individual log events from the log stream and sends them unchanged to Firehose. As VPC Flow Logs are simple (not JSON formatted), the content is easily sent to Splunk as raw events. Some key things to note with the standard template are illustrated in the sketch below.
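As a rough reconstruction (not the blueprint's exact code), this is the core of what that standard template does for each Firehose record. It assumes the standard CloudWatch Logs subscription payload: gzipped, base64-encoded JSON containing a logEvents array.

```python
import base64
import gzip
import json

def transform(record):
    """Sketch of the standard blueprint's behaviour for one Firehose record:
    unzip the CloudWatch Logs payload and emit each raw log message."""
    payload = json.loads(gzip.decompress(base64.b64decode(record['data'])))
    if payload['messageType'] != 'DATA_MESSAGE':
        # Control messages carry no log data, so they are dropped
        return {'recordId': record['recordId'], 'result': 'Dropped'}
    # Join the individual log lines, unchanged, separated by newlines
    data = '\n'.join(e['message'] for e in payload['logEvents']) + '\n'
    return {
        'recordId': record['recordId'],
        'result': 'Ok',
        'data': base64.b64encode(data.encode('utf-8')).decode('utf-8'),
    }
```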
With the new Lambda function, you can take the log from CloudWatch and wrap it up as a Splunk HEC event in JSON format. The key benefit is that event details such as the source, sourcetype, host and index can be set per event, based on the log content. The example function template unpacks each CloudWatch Logs record, wraps every log event as a HEC event, and returns the re-encoded records to Firehose; a sketch of this is shown below. You can find further details of how to format the HEC event here.
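Putting that together, a simplified sketch of such a transformation might look like the following. The sourcetype and index values are placeholders, and the host/source assignments are just one reasonable choice:

```python
import base64
import gzip
import json

def transform_to_hec(record):
    """Sketch: wrap each CloudWatch log event as a Splunk HEC event."""
    payload = json.loads(gzip.decompress(base64.b64decode(record['data'])))
    if payload['messageType'] != 'DATA_MESSAGE':
        return {'recordId': record['recordId'], 'result': 'Dropped'}
    hec_events = []
    for e in payload['logEvents']:
        hec_events.append(json.dumps({
            'time': e['timestamp'] / 1000.0,        # CloudWatch timestamps are in ms
            'host': payload['logStream'],           # one reasonable choice
            'source': payload['logGroup'],
            'sourcetype': 'aws:cloudwatchlogs',     # placeholder
            'index': 'main',                        # placeholder
            'event': e['message'],
        }))
    data = '\n'.join(hec_events)
    return {
        'recordId': record['recordId'],
        'result': 'Ok',
        'data': base64.b64encode(data.encode('utf-8')).decode('utf-8'),
    }
```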
The example transforming function shows how to format the events sent to Firehose into Splunk HEC JSON format, setting some of the event details based on the log information. Further changes are possible to make the function more flexible or to fit your requirements. Simple changes let you be creative with how you set the Splunk index, host, source and sourcetype. Here are two examples:
1) If you had RDS instances sending their logs into CloudWatch, you could use the log group name so that a single Firehose serves multiple RDS instances and log types. For example, suppose there were two RDS instances with their logs going into the log groups:
/aws/rds/mydatabase1/audit
/aws/rds/mydatabase1/error
/aws/rds/mydatabase2/audit
/aws/rds/mydatabase2/error
As the audit and error logs need different sourcetypes, it would be easy to set the sourcetype value based on whether the log is an audit or error log (see the sketch below). In this case, you could also add log groups from other database logs by simply adding subscription filters to the same Firehose, without having to change anything on the Splunk side.
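A sketch of how that routing could look inside the transformation function; the sourcetype names here are assumptions, chosen only for illustration:

```python
def sourcetype_for(log_group):
    """Derive a sourcetype from an RDS log group name (illustrative only)."""
    if log_group.endswith('/audit'):
        return 'mysql:audit'     # assumed sourcetype name
    if log_group.endswith('/error'):
        return 'mysql:error'     # assumed sourcetype name
    return 'aws:cloudwatchlogs'  # fallback for anything else
```

Inside the transformation, sourcetype_for(payload['logGroup']) would replace the hard-coded placeholder shown earlier.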
2) Another example is where you are collecting logs from multiple AWS accounts but, for security reasons, wish to store these logs in separate Splunk indexes. As the transform function can set the index value, it could assign a different index name per account.
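For instance, the CloudWatch Logs payload carries the sending account's ID in its owner field, so a small lookup table is enough (the account IDs and index names below are hypothetical):

```python
# Hypothetical account IDs and index names -- replace with your own.
INDEX_BY_ACCOUNT = {
    '111111111111': 'aws_prod',
    '222222222222': 'aws_dev',
}

def index_for(payload):
    """payload['owner'] is the AWS account ID included in every
    CloudWatch Logs subscription message."""
    return INDEX_BY_ACCOUNT.get(payload['owner'], 'aws_default')
```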
Happy Splunking!