Enhance Security Resilience Through Splunk User Behavior Analytics VPN Models

By Xiao Lin

The COVID-19 pandemic spurred a significant increase in remote access, and a substantial portion of the workforce transitioned to remote work. Employees now rely heavily on their employer's virtual private network (VPN) to connect to company IT systems, and this shift to working from home (WFH) is expected to continue for the foreseeable future. Employees also commonly work from varied locations such as coffee shops, hotels, and airports, in addition to their own homes and company offices. This diversity makes VPN activity harder to monitor. At the same time, the higher volume of VPN use means that more security threat events occur close to VPN sessions, which offers an opportunity to use VPN connection data to detect potential threats. As a result, the security of VPN logins has become a significant concern, leading Security Operations Centers (SOCs) to search for more effective strategies to prevent data breaches and fortify defenses against threats that exploit VPN security vulnerabilities.

A variety of analytics have been developed to detect anomalies in VPN connections. Most of them rely on isolated indicators to separate anomalies or potential threats from typical baselines. These indicators include the number of successful VPN logins, the number of login failures, session duration, data transfer volume during a session, and the rarity of the geo-location from which a VPN client initiates a connection. The effectiveness of these techniques depends on assumptions about normal behavior and on setting an appropriate threshold for each individual indicator. They are inherently limited because they do not consider the interplay of correlated events or the broader context of a VPN connection. Consequently, these analytics tend to produce a high number of false positives.

Splunk User Behavior Analytics (UBA) offers a machine learning model known as "Abnormal VPN Session Associated with Rare Location" to tackle this challenge. The model accounts for a variety of behaviors, including the multiple VPN geo-locations that remote work produces. Instead of relying solely on geo-location, it can aggregate multiple post-login behaviors with common VPN indicators, such as login count and session length, to identify abnormal VPN sessions. Rather than assuming a fixed pattern, the model uses machine learning to uncover the most infrequent user behaviors, which allows it to detect previously unknown threats (unknown unknowns). The model can also use work-from-home (WFH) status to create peer groups, which improves analytical accuracy and detection resolution. This approach offers several advantages over other models and can significantly reduce false positives in VPN detection.

Methodology

After logging into a VPN, individuals often follow consistent patterns of activity. For instance, a developer may synchronize their code with the repository, check their email, and engage in daily conversations on Slack. Given this, it is natural to treat deviations from these typical behaviors as anomalies. The logic of the Abnormal VPN Session Associated with Rare Location detection model is illustrated in the diagram below.

Data events are first ingested from various sources and stored within Splunk UBA in corresponding databases, often referred to as cubes. By default, the model uses the preceding 30 days of historical events to train a baseline model for detection. This baseline consists of two key elements: the frequent item patterns that were identified and a baseline score assigned to each pattern. On the current day, the model extracts that day's event patterns, cross-references them with the historical patterns, computes a proprietary factor, and compares this factor against a predefined threshold. If the factor exceeds the threshold, the pattern is categorized as positive, and the associated event is logged as an anomaly within the context of VPN sessions.
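The baseline-then-compare flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the factor Splunk UBA computes is proprietary and not described here, so the sketch substitutes a simple rarity score (one minus the pattern's historical frequency). The function names, the example items, and the 0.99 threshold are all hypothetical.

```python
from collections import Counter


def train_baseline(history):
    """Return the relative frequency of each pattern in the baseline window.

    `history` is a list of patterns; each pattern is a frozenset of items.
    """
    counts = Counter(history)
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


def rarity_factor(pattern, baseline):
    # Illustrative stand-in for the proprietary factor: 1 minus the
    # historical frequency, so a never-seen pattern scores 1.0.
    return 1.0 - baseline.get(pattern, 0.0)


def detect(todays_patterns, baseline, threshold=0.99):
    # Flag every pattern whose factor exceeds the (hypothetical) threshold.
    return [p for p in todays_patterns if rarity_factor(p, baseline) > threshold]


# Hypothetical patterns built from made-up field values.
usual = frozenset({"src_country=US", "service=git"})
occasional = frozenset({"src_country=US", "service=email"})
unseen = frozenset({"src_country=FR", "service=file_share"})

# Baseline window: 98 occurrences of the usual pattern, 2 of the occasional one.
baseline = train_baseline([usual] * 98 + [occasional] * 2)
anomalies = detect([usual, occasional, unseen], baseline)
```

With these made-up numbers, only the pattern that never appeared in the baseline window is flagged; the occasional pattern stays under the threshold.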

Frequent pattern mining is a data mining technique used to reveal recurring patterns or associations within a dataset. It analyzes large datasets to find items, or sets of items, that frequently appear together. In this model, the items considered "frequent" vary with the data sources, and users can customize both the data sources and the features selected from them. In its default configuration, the model uses data fields such as source and destination country and city, event class, bytes transferred, event code in Windows events, and service name in file operation events. A typical pattern extracted by this model is shown in the following example:

It is important to highlight that this methodology detects anomalous work sessions connected to VPN login events, rather than relying on a single VPN login indicator. Consequently, the model can draw on various data sources, as depicted in the diagram above. These data sources are used to create the features or indicators that describe VPN-related events. No particular data source is mandatory, but it is advisable to have at least one, such as CiscoSA, that can generate VPN-related events to initiate the analytical process outlined in this model.
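To make the frequent pattern mining step described earlier more concrete, here is a minimal sketch in Python. It uses brute-force enumeration of item combinations with a minimum-support cutoff; this is a generic textbook approach chosen for brevity, not necessarily the algorithm Splunk UBA uses. The field names, values, `min_support`, and `max_size` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: support} for itemsets present in at least
    `min_support` (a fraction between 0 and 1) of the transactions."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))  # de-duplicate and fix a stable order
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    n = len(transactions)
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


# Each session is the set of items observed after one VPN login (made-up data).
sessions = [
    ["src_country=US", "event_class=VPN", "service=git"],
    ["src_country=US", "event_class=VPN", "service=git"],
    ["src_country=US", "event_class=VPN", "service=email"],
    ["src_country=FR", "event_class=VPN", "service=git"],
]

patterns = frequent_itemsets(sessions, min_support=0.75)
```

Here `patterns` keeps single items and pairs that occur in at least three of the four sessions, such as the pair of `event_class=VPN` with `src_country=US`. Brute-force enumeration grows quickly with the number of items per session, so larger datasets would typically call for a dedicated algorithm such as Apriori or FP-growth.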

The model can extract common occurrences on a global scale (referred to as "WLD"), where it analyzes all events collectively and assesses their frequency. It can also identify prevalent occurrences within specific peer groups (denoted as "PG") by examining events that originate from the same peer group to uncover common patterns. Like other detection models, the abnormal VPN sessions model can use peer groups built from static data sources such as organizational units, HR records, or Active Directory data. Alternatively, peer groups can be assembled dynamically based on the behavior of users and devices. The peer grouping process is illustrated in the flowchart below. After frequent patterns are identified, the patterns within each peer group are evaluated separately against a single anomaly threshold that is shared across all groups.
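The difference between the global ("WLD") and peer-group ("PG") views can be illustrated with a short Python sketch. It simply computes how often each pattern occurs across all events and within each peer group; the group names, pattern labels, and function name are hypothetical, and the real model's scoring is not reproduced here.

```python
from collections import Counter, defaultdict


def scoped_frequencies(events):
    """Compute pattern frequencies globally and per peer group.

    `events` is an iterable of (peer_group, pattern) pairs.
    Returns (wld, pg): wld maps pattern -> frequency over all events,
    pg maps peer_group -> {pattern -> frequency within that group}.
    """
    world = Counter()
    per_group = defaultdict(Counter)
    for group, pattern in events:
        world[pattern] += 1
        per_group[group][pattern] += 1

    total = sum(world.values())
    wld = {p: c / total for p, c in world.items()}
    pg = {
        group: {p: c / sum(cnt.values()) for p, c in cnt.items()}
        for group, cnt in per_group.items()
    }
    return wld, pg


# Made-up data: eight engineering sessions and two sales sessions.
events = [("engineering", "git_sync")] * 8 + [("sales", "crm_access")] * 2
wld, pg = scoped_frequencies(events)
```

In this toy data, `crm_access` accounts for only 20% of events globally but for all events within the sales group, which is why evaluating patterns per peer group can avoid flagging behavior that is normal for a small team.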

For the abnormal VPN session model in particular, entities are always organized into two distinct categories based on VPN traffic, separated by a predefined parameter named "highTrafficVolumeEntityThreshold." This is a dynamic grouping approach. The result of this peer grouping process cannot be queried in the same way as the results of other approaches, as shown in the figure below.
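Independent of the figure, the two-way split itself can be sketched in a few lines of Python. Only the parameter name "highTrafficVolumeEntityThreshold" comes from the text above; the unit (total bytes per entity), the inclusive comparison, the group labels, and the example values are assumptions made for illustration.

```python
def group_by_traffic(bytes_by_entity, high_traffic_volume_entity_threshold):
    """Split entities into two peer groups by total VPN traffic.

    Assumption: traffic is measured in bytes and an entity at or above
    the threshold counts as high traffic.
    """
    groups = {"high_traffic": [], "low_traffic": []}
    for entity, total_bytes in sorted(bytes_by_entity.items()):
        if total_bytes >= high_traffic_volume_entity_threshold:
            groups["high_traffic"].append(entity)
        else:
            groups["low_traffic"].append(entity)
    return groups


# Hypothetical per-entity VPN traffic totals, in bytes.
traffic = {"alice": 5_000_000_000, "bob": 20_000_000, "carol": 750_000_000}
peer_groups = group_by_traffic(traffic, high_traffic_volume_entity_threshold=500_000_000)
```

With this made-up threshold, `alice` and `carol` land in the high-traffic group and `bob` in the low-traffic group.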

Experiments

We conducted an experiment on a Splunk internal dataset of real events to detect an implanted anomaly using the Abnormal VPN Session Associated with Rare Location model. In this experiment, we modeled the attacker as a regular employee who appears in the HR data.

Approximately 2 million events, including 3,000 VPN login seed events, were ingested. Most of the VPN connections originated from San Jose, USA, which is considered a normal geolocation. However, there was also an abnormal login from another country. After the model ran, the system detected this anomaly and reported it on the anomaly page. Clicking the anomaly opens a summary page that displays more information:

The report compiles a record of unusual events and data linked to this login, along with a visual representation of the VPN connection pathway.

Learn More

Splunk UBA provides a series of models that detect anomalies associated with VPN login activity by monitoring a variety of indicators, such as the length of the login and the location of the login device, as shown in the table below.

If you would like to learn more about VPN-related anomaly models and detections, refer to the Splunk UBA documentation.

Feedback

Any feedback or requests? Feel free to file an issue in Splunk Ideas and we’ll follow up. Alternatively, join us on the Slack channel. Follow these instructions if you need an invitation to our Splunk user groups on Slack.

Acknowledgments

Special thanks to the Splunk Product Marketing Team, as well as the Splunk Machine Learning for Security (SMLS) team members for their contribution to this post and corresponding detections: Ania Kacewicz (Senior Security Data Scientist), Cui Lin (Principal Security Data Scientist), Glory Avina (Senior Manager for SMLS), and Rod Soto (Senior Principal Threat Researcher).

