Security

January 17, 2019

4 Minute Read

| datamodel Endpoint

By Splunk

Very non-scientific research recently revealed that discussing the nuances of the Splunk Common Information Model (CIM) is more effective than Ambien at curing Splunkers' insomnia. In fact, we once caused marked drowsiness across the entire state of Rhode Island when we accidentally broadcast an internal podcast on the subject over the local NPR affiliate.

That said, with the recent release of CIM 4.12, we packed in a number of exciting changes. My favorite is definitely the addition of an “Endpoint” data model. So, be you a threat hunter, an Endpoint Detection and Response (EDR) fan, or just curious about why we made any changes at all, this is the blog post for you. Make yourself a strong espresso, and onward, dear reader!

Why It Matters

Before getting into the technical details, I want to take a moment to discuss why we did it and why it matters. The main driver was, of course, our customers. By providing a data model that works for modern endpoint solutions, we can provide more leverage for better detection and to enable hunt capabilities in customer environments. The second driver is our unquenchable thirst for innovation. As we see technologies become more widely adopted by security teams big and small, we provide tools to augment those technologies. Data models are one of many arrows in the Splunk quiver, and in this particular case it just made sense to do as we’ve done with other technology domains.

What We Did

As far as what we actually did around endpoint data, the idea was to take two data models that have historically been used for “endpoint-like” data (the Application State and Change Analysis models) and combine them, then add some of the extra goodness that EDR solutions provide: parent/child process relationships, process hashes, integrity levels, etc. By combining these datasets together, we were able make the process of mapping raw events from EDR solutions to a Splunk data model more straightforward, and ultimately provide a way to abstract away the specific technology from the data and use-cases.

For those of you who have read the excellent documentation (thanks, Loria!), you'll also notice there are actually five different datasets in this model, all powered by separate base searches. Long story short, we discovered in our testing that accelerating five separate base searches is more performant than accelerating just one massive model. The practical implications are that you will want to get familiar with tstats append=t' (requisite David Veuve reference: "How to Scale: From _raw to tstats [and beyond!])

Example - BOTS Dataset

To hit home the potential impact of these changes, let’s borrow a page from James Brodsky’s .conf17 session, "Splunking the Endpoint Part III: Hands-On with BOTS Data!".

One of the searches in the detailed guide (“APT STEP 8 – Unusually long command line executions with custom data model!”), leverages a modified “Application State” data model:

| tstats values(all_application_state.process) as command

FROM datamodel="Application_State" where (host=venus OR

host=wrk-*) AND (all_application_state.commandline != *chrome*

AND all_application_state.commandline != *firefox*) groupby host

| mvexpand command

| eval cmdlen=len(command)

| eventstats stdev(cmdlen) as stdev,avg(cmdlen) as avg by host

| stats max(cmdlen) as maxlen, values(stdev) as stdevperhost,

values(avg) as avgperhost by host,command

| where maxlen>4*(stdevperhost)+avgperhost

With the new Endpoint model, it will look something like the search below. Note that we’re populating the “process” field with the entire command line. For all you Splunk admins, this is a props.conf change you’ll want to make with your sourcetypes, given it’s now a calculated field in the data model!).

| tstats values(Processes.process) as command

FROM datamodel="Endpoint"."Processes"  WHERE (Endpoint.Proceses.process != *chrome*

AND Endpoint.Processes.process != *firefox*) groupby host

| mvexpand command

| eval cmdlen=len(command)

| eventstats stdev(cmdlen) as stdev,avg(cmdlen) as avg by host

| stats max(cmdlen) as maxlen, values(stdev) as stdevperhost,

values(avg) as avgperhost by host,command

| where maxlen>4*(stdevperhost)+avgperhost

While this is a fairly trivial example, the point is that endpoint telemetry is now a first-class dataset in Splunk. For those of you who use the wonderful and highly-useful TA-Sysmon, the illustrious Dave Herrald and Bhavin Patel have added support for the new Endpoint model in v8.1.0.

Call to Action/What’s Next

Now that we have this great new Endpoint model, what’s next? We have some updates planned in the future to help solve some use cases that have already popped up since the release. However, while we did add some backwards compatibility with the tags and event types for Application_State and Change_Analysis*, you should update those props and transforms to ensure you’re taking full advantage of all the new fields. Speaking of which, I would highly suggest a readup of how the Splunk Security Research Team is leveraging this new capability in their blog post, "Get More Flexibility and Accelerated Searches with the New Endpoint Data Model."

*Quick note on the topic of backwards compatibility, you will notice in the documentation that Change Analysis and Application State are marked as "deprecated." CIM 4.12 absolutely still ships with these, but ES has acceleration turned off by default, in lieu of accelerating the new Endpoint and Change models.

Thank You

Hopefully you’ll find the new Endpoint data model valuable and useful in service to all your endpoint detection and threat-hunting needs. For the list of other changes in CIM 4.12, you can find them in the release notes.

Lastly, some thank yous to the larger Splunker community for all their help during this "process":

A major shout out to the Splunk Security Research Team for their deep expertise and content (Splunk ESCU FTW): @trogdorsey, @hackpsy, and @rvaldez
Another major shout out to the Security Specialists team for always being a voice for our customers and providing some killer use-cases (BOTS FTW): @meansec, @daveherrald, @stonerpsu, @davidveuve, @JesseTrucks
A super extra major shout out to our customers who are constantly providing the feedback that we crave to make our products better.
And of course anyone else whose Twitter handles I can’t recall—it certainly takes a village.

----------------------------------------------------
Thanks!
Kyle Champlin

Splunk

The world’s leading organizations trust Splunk to help keep their digital systems secure and reliable. Our software solutions and services help to prevent major issues, absorb shocks and accelerate transformation. Learn what Splunk does and why customers choose Splunk.

Security 6 Min Read

Imposters at the Gate: Spotting Remote Employment Fraud Before It Crosses the Wire

Remote Employment Fraud actors don’t steal credentials—they’re issued them. This blog explores early detection and why security can’t face this threat alone.

Security 7 Min Read

Overcome Cybersecurity Challenges to Improve Digital Resilience

Discover how embracing automation, unifying security operations and tackling security as a data problem helps organizations overcome the challenges posed to cybersecurity effectiveness and digital resilience.

Security 5 Min Read

7 High-Risk Events to Monitor Under GDPR: Lessons Learned from the ICO’s BA Penalty Notice

British Airways made the headlines when they were hacked, customer details stolen and were issued a Penalty Notice by the UK ICO. Matthias Maier took a closer look at the document and recapitulated the key takeaways any IT security person can learn from.

About Splunk

The world’s leading organizations rely on Splunk, a Cisco company, to continuously strengthen digital resilience with our unified security and observability platform, powered by industry-leading AI.

Our customers trust Splunk’s award-winning security and observability solutions to secure and improve the reliability of their complex digital environments, at any scale.

Learn more about Splunk

Subscribe to our blog

Get the latest articles from Splunk straight to your inbox.

Connect with Splunk on X

Follow @Splunk

Connect with Splunk on Instagram