Recently, I ran into a Splunk use case where someone wanted to onboard GitHub data directly into Splunk Cloud. The data they were after was audit data about the repository itself. To be clear, this wasn’t a matter of ingesting a .csv file hosted in a GitHub repository; they needed to answer questions like “How can I create a line chart of the number of pushes throughout the week?” or “How can I see how many open issues I have in my repository from inside Splunk?”
The main challenge with this specific use case (and what makes it so interesting, in my opinion) was that the customer was running Splunk Cloud and had no interest in running any kind of on-prem architecture. They were all-in on cloud. So what does one do in a situation like this? And is it sensible to want to be 100% in the cloud? I think so.
Many people may want to tackle this problem with something like the GitHub Add-on for Splunk and a Splunk Heavy Forwarder. While that’s certainly not an incorrect way to collect this data, it does require infrastructure to run Splunk on. Of course, EC2 is an option when it comes to cloud technology. In my opinion, though, the same thing can be achieved with a more lightweight serverless function, and I’m here to show you a quick way to do it.
For starters: enter the AWS Webhook to Splunk HTTP Event Collector Serverless Function. This is a fairly basic blueprint of a function I created that you can spin up today with the click of a button. The goal here is to deploy a lightweight AWS Lambda function that acts as a sort of translator between Webhooks and the Splunk HTTP Event Collector.
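To make the "translator" idea concrete, here is a minimal sketch of what such a function could look like. It is not the code of the published function; the names `build_hec_request` and `lambda_handler` are illustrative, and it assumes the Splunk connection details arrive as query string parameters (`url`, `port`, `http_method`, `token`) on an API Gateway proxy event, matching the Webhook URL format shown later in this post.

```python
import json
import urllib.request


def build_hec_request(event):
    """Turn an API Gateway proxy event into a POST request for Splunk HEC.

    Assumption: the query string carries url, port, http_method and token.
    """
    params = event.get("queryStringParameters") or {}
    scheme = params.get("http_method", "https")
    host = params["url"]
    port = params.get("port", "8088")
    token = params["token"]

    # HEC accepts JSON events on /services/collector/event.
    hec_url = f"{scheme}://{host}:{port}/services/collector/event"

    # HEC expects the payload under an "event" key, so wrap the webhook body.
    webhook_body = json.loads(event.get("body") or "{}")
    payload = json.dumps({"event": webhook_body}).encode("utf-8")

    return urllib.request.Request(
        hec_url,
        data=payload,
        headers={
            # HEC authenticates with "Splunk <token>" in the Authorization header.
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def lambda_handler(event, context):
    """Lambda entry point: forward the webhook to HEC and relay the response."""
    request = build_hec_request(event)
    with urllib.request.urlopen(request, timeout=10) as response:
        return {
            "statusCode": response.status,
            "body": response.read().decode("utf-8"),
        }
```

Splitting the request building from the network call keeps the translation step easy to check locally without a live Splunk instance.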
As I mentioned, deploying this serverless function is extremely simple. You can start by locating the serverless function in the AWS Serverless Application Repository. Feel free to name the function anything you’d like. In my case, I wanted to collect GitHub data from my corona_virus repository, so I titled it accordingly, as you’ll see below.
Setting up your own private endpoint is as easy as clicking a button.
In order to use GitHub Webhooks to send your data to Splunk, you’ll need to create a Splunk HTTP Event Collector token. All of the information on setting one of these up is located in our Splunk Docs page. Setup should generally take less than 5 minutes. Once you are done with this step, you should have all of the following information:
In the example GIF below, you’ll notice that once my serverless function is completely deployed, I can click the “Test app” button to get a URL for my endpoint, which will be used as part of my Webhook URL. This URL can be used in combination with the information from step 3 (and is documented in the serverless repository README).
https://<your_api_gateway_url>/Prod/webhook-to-hec?url=your.server.com&port=8088&http_method=https&token=223342-23242-232324
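If you would rather assemble that URL programmatically than by hand, a small helper along these lines could work. This is only a sketch: `build_webhook_url` is a hypothetical name, the API Gateway hostname in the usage example is a placeholder, and the path and parameter names are taken from the URL format above.

```python
from urllib.parse import urlencode


def build_webhook_url(api_gateway_url, splunk_host, token, port=8088, http_method="https"):
    """Assemble the Webhook URL from the API Gateway endpoint and HEC details."""
    # urlencode escapes any characters that are not safe in a query string.
    query = urlencode(
        {
            "url": splunk_host,
            "port": port,
            "http_method": http_method,
            "token": token,
        }
    )
    return f"{api_gateway_url.rstrip('/')}/Prod/webhook-to-hec?{query}"


if __name__ == "__main__":
    # Placeholder endpoint and token, for illustration only.
    print(
        build_webhook_url(
            "https://example.execute-api.us-east-1.amazonaws.com",
            "your.server.com",
            "223342-23242-232324",
        )
    )
```

Because the HEC token travels in the query string, it is worth treating the full Webhook URL like a credential.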
Last but not least, let’s connect GitHub to the newly created serverless function. Open your GitHub repository of choice and go to Settings > Webhooks. From there you can select “Add Webhook” and enter all of the settings below:
Finally, click Add Webhook to confirm. The example GIF below shows what this process looks like. A green check mark next to the URL when you’re done means the process was successful.
And that’s how easy it can be to get data from one enterprise cloud solution into another. I guess I better go solve some of these open issues now.
If you have any questions or comments, please don’t hesitate to reach out! Please also feel free to deploy this code, and then modify it to your own liking. This is meant to get users up and running for a specific use case, but it will hopefully also be adapted to many more different use cases in the future.
----------------------------------------------------
Thanks!
Ryan O'Connor