If you're reading this, you're probably wondering how to get data from various Microsoft Azure services into Splunk. With the growing list of Azure services and various data access methods, it can be a little cloudy (pun intended) as to what data is available and how to get all that data into Splunk.
In this blog post, I'm going to go over how Microsoft makes Azure data available, how to access the data, and the out-of-the-box Splunk add-ons that can consume this data. So let's dive right in.
There are 3 main ways Microsoft makes Azure data available.
Storage accounts were the standard back in the day when Azure was introduced. Basically, Microsoft will dump data from a service into a separate storage location (called a storage account). For example, if you want Virtual Machine event logs, Azure will dump those into a storage account you specify. Since storage accounts are a separate service from a VM, the data about the VM will live on even after you delete the VM. Storage accounts have their own security and retention mechanisms, but we won't get too much into the weeds here. Just know that a source service can be configured to dump data into a separate storage account for retrieval.
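To make "data dumped into a storage account" a bit more concrete, here is a minimal sketch of how a collector might interpret blob names. It assumes the hourly layout that Azure Monitor diagnostic settings use when archiving resource logs to blob storage (`insights-logs-<category>/resourceId=<resource ID>/y=/m=/d=/h=/m=/PT1H.json`); other services, including classic VM diagnostics, may use different layouts. The names `parse_log_blob_path` and `LogBlob` are hypothetical helpers, not part of any add-on.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

# Assumed layout: <container>/<blob name>, where the container is
# "insights-logs-<category>" and the blob name encodes resource and hour.
_BLOB_PATH = re.compile(
    r"^insights-logs-(?P<category>[^/]+)"
    r"/resourceId=(?P<resource_id>/.+?)"
    r"/y=(?P<year>\d{4})/m=(?P<month>\d{2})/d=(?P<day>\d{2})"
    r"/h=(?P<hour>\d{2})/m=(?P<minute>\d{2})/PT1H\.json$"
)


class LogBlob(NamedTuple):
    category: str         # log category taken from the container name
    resource_id: str      # Azure resource ID of the service that produced the logs
    hour_start: datetime  # start of the hour covered by the blob (assumed UTC)


def parse_log_blob_path(path: str) -> Optional[LogBlob]:
    """Return the parts of a log blob path, or None if it does not match."""
    m = _BLOB_PATH.match(path)
    if m is None:
        return None
    hour_start = datetime(
        int(m["year"]), int(m["month"]), int(m["day"]),
        int(m["hour"]), int(m["minute"]), tzinfo=timezone.utc,
    )
    return LogBlob(m["category"], m["resource_id"], hour_start)


# Illustrative path only; the subscription, group, and resource names are made up.
example_path = (
    "insights-logs-networksecuritygroupevent/resourceId=/SUBSCRIPTIONS/0000"
    "/RESOURCEGROUPS/RG1/PROVIDERS/MICROSOFT.NETWORK/NETWORKSECURITYGROUPS/NSG1"
    "/y=2024/m=05/d=17/h=09/m=00/PT1H.json"
)
blob = parse_log_blob_path(example_path)
```

Because the source resource is encoded in the path rather than tied to a live service, a collector can keep reading these blobs after the VM or other resource is gone, which is the point made above.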
Talking about standards, Event Hubs are the new standard for most Azure services. I like to think of an Event Hub as a scalable, relatively short-term message bus. What I mean by this is that Azure can dump data onto an Event Hub (via a service called Azure Monitor), much like it dumps data into a storage account. However, data that goes onto an Event Hub is meant to be retrieved by something else. In fact, Event Hubs have a pretty short retention time for events (typically 24 hours to 7 days). Event Hubs can also scale up or down depending on the load necessary for receiving or delivering data. Hint: if the terms Pub/Sub, Kafka, producer, and consumer mean anything to you, think in those terms. If not, forget that last sentence or just Google (or Bing) those terms if you want to dive a little deeper.
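On the consumer side, one detail worth knowing is that Azure Monitor usually batches several log entries into a single Event Hub message, wrapped in a JSON object with a `records` array. The sketch below shows how a consumer might split such a message into individual events. It uses only the standard library and does not connect to an Event Hub; `split_records` is a hypothetical helper, and the sample field values are made up for illustration.

```python
import json
from typing import Iterator, Union


def split_records(body: Union[str, bytes]) -> Iterator[dict]:
    """Yield one event per entry in an Azure Monitor style message body.

    Assumption: the body is JSON shaped like {"records": [ {...}, {...} ]}.
    Anything else is passed through as a single event.
    """
    payload = json.loads(body)
    if isinstance(payload, dict) and isinstance(payload.get("records"), list):
        yield from payload["records"]
    else:
        yield payload


# Illustrative message body; real messages carry more fields.
sample_body = json.dumps({
    "records": [
        {"time": "2024-05-17T09:00:00Z", "category": "Administrative"},
        {"time": "2024-05-17T09:00:05Z", "category": "Security"},
    ]
})
events = list(split_records(sample_body))
```

Splitting the batch before indexing means each log entry becomes its own event with its own timestamp, which is generally what you want when searching in Splunk.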
The third major way Microsoft makes Azure data available is REST APIs, and there are a lot of them. In the context of Splunk, you're typically looking for the "List" operations. For example, the Azure REST API reference lists all the operations available for Azure VMs. The Microsoft Azure Add-on for Splunk (more about that add-on in a bit) uses the "List All" operation to, well, get a list of all the VMs you have in Azure. You can use this information as entities in Splunk IT Service Intelligence (ITSI), use it in Splunk Enterprise Security, or correlate it with other data sources in Splunk.
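As a rough illustration of what a "List All" call involves, here is a minimal sketch that builds the request URL for the VM "List All" operation and follows the `nextLink` field to page through results. This is not how the add-on is implemented; it is a sketch under assumptions. The function names are hypothetical, the `api-version` value is one example of a supported version, and obtaining the bearer token (for instance via an Azure AD app registration) is left out.

```python
import json
import urllib.request
from typing import Callable, Iterator

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2023-03-01"  # assumption: substitute a version your tenant supports


def list_all_url(subscription_id: str, api_version: str = API_VERSION) -> str:
    """Build the URL for the Virtual Machines 'List All' operation."""
    return (
        f"{ARM_ENDPOINT}/subscriptions/{subscription_id}"
        f"/providers/Microsoft.Compute/virtualMachines?api-version={api_version}"
    )


def http_get_json(url: str, token: str) -> dict:
    """Perform an authenticated GET and decode the JSON response."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_vms(subscription_id: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every VM, following 'nextLink' until the last page."""
    url = list_all_url(subscription_id)
    while url:
        page = fetch(url)
        yield from page.get("value", [])
        url = page.get("nextLink")  # absent on the last page, which ends the loop


# Real usage would look something like this (token acquisition not shown):
# vms = list(iter_vms("<subscription-id>", lambda u: http_get_json(u, "<token>")))
```

Passing the fetch function in as a parameter keeps the paging logic separate from authentication and networking, so it can be exercised with canned responses before pointing it at a real subscription.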
Now that you know the 3 main ways Microsoft makes Azure data available, let’s talk a bit about what data is available. There is no way I could create a comprehensive static list of all the data sources, so I'll stick to some popular Splunk-centric sources.
So now that you know how Microsoft makes Azure data available and some different types of data available, how do you go about getting that data in Splunk? The simple answer is add-ons. The two main add-ons used are the Splunk Add-on for Microsoft Cloud Services, and the Microsoft Azure Add-on for Splunk.
Did you notice the [Storage account], [Event Hub], and [REST] tags above? Those tags are going to help us decide which add-on to use. Here we go.
Did you notice a pattern there? The Splunk Add-on for Microsoft Cloud Services integrates with Event Hubs, storage accounts, and the activity log. The Microsoft Azure Add-on for Splunk integrates with various REST APIs. Notice that the Splunk Add-on for Microsoft Cloud Services can get the activity log via the REST API or Event Hub. It's the same data either way.
They say a picture is worth a thousand words, so this Sankey diagram will help visualize all those words…