OpenSSH, an application installed by default on nearly every Unix-like and Linux system, has recently come under scrutiny due to a critical vulnerability discovered by Qualys.
Designated as CVE-2024-6387 and aptly named "regreSSHion," this flaw exposes Linux environments to remote unauthenticated code execution. The implications of this vulnerability are far-reaching, potentially affecting countless servers and infrastructure components across the globe.
In this blog post, the Splunk Threat Research Team will dissect the technical intricacies of CVE-2024-6387, explore its potential impact on affected systems, and provide detection opportunities and mitigation strategies.
Key points about CVE-2024-6387:
CVE-2024-6387 stems from a signal handler race condition in OpenSSH, affecting versions from 8.5p1 up to, but not including, 9.8p1 on glibc-based Linux systems. The flaw, a regression of an older vulnerability (CVE-2006-5051), allows remote attackers to execute arbitrary code as root, leading to full system compromise.
The nature of the signal handler race condition: The vulnerability occurs in OpenSSH's server (sshd) when handling the SIGALRM signal. In the affected versions, the SIGALRM handler calls functions that are not async-signal-safe, such as syslog(). This creates a race condition where the signal handler can interrupt critical sections of code, potentially leaving the system in an inconsistent state. Attackers can exploit this inconsistency to manipulate memory and execute arbitrary code.
Relation to the older CVE-2006-5051: This vulnerability is a regression of CVE-2006-5051, which was originally patched in 2006. The old vulnerability was also a signal handler race condition in OpenSSH. The protection added then was inadvertently removed when the logging infrastructure was reworked in October 2020, a change that first shipped in OpenSSH 8.5p1.
This regression highlights the challenges of maintaining security in complex software systems over time.
Specific conditions for exploitation: Exploiting this vulnerability requires several conditions to be met:
The complexity of this exploit means that while the vulnerability is severe, successful attacks require a high level of skill and persistence. Nevertheless, the potential for remote code execution with root privileges makes this a critical issue for all affected systems.
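To make the signal handler race described above concrete, here is a minimal C sketch of the safe pattern for a SIGALRM handler: it only records that the timer fired and leaves logging and cleanup to normal program flow. This is an illustration under stated assumptions, not OpenSSH's actual code or its fix. The names `grace_expired`, `on_alarm`, `install_grace_timer`, and `grace_timer_expired` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical flag: written by the handler, read by the main loop.
 * sig_atomic_t is the type C guarantees can be written safely from a handler. */
static volatile sig_atomic_t grace_expired = 0;

/* Safe pattern: the handler only sets a flag.
 * Calling syslog(), malloc(), or free() here would NOT be async-signal-safe,
 * because the signal may interrupt those same functions mid-update. */
static void on_alarm(int signo) {
    (void)signo;
    grace_expired = 1;
}

/* Install the handler and arm the timer.
 * Passing 0 seconds installs the handler without scheduling an alarm.
 * Returns 0 on success, -1 on failure. */
int install_grace_timer(unsigned int seconds) {
    struct sigaction sa = {0};
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, NULL) != 0) {
        return -1;
    }
    grace_expired = 0;
    alarm(seconds);
    return 0;
}

/* Polled from normal (non-signal) context, where logging is safe. */
int grace_timer_expired(void) {
    return grace_expired != 0;
}
```

A main loop would poll `grace_timer_expired()` and do its logging and connection teardown there, outside signal context, so the handler can never interrupt a half-finished heap operation with another heap operation.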
The "regreSSHion" proof of concept exploit that was released on GitHub leverages a complex race condition, requiring precise timing and potentially thousands of attempts to succeed. Let's break down the key aspects:
Timing and Duration. The exploit's success hinges on hitting a very narrow time window. According to the Qualys research:
This timing is crucial because the exploit must interrupt specific operations at just the right moment to manipulate the system's memory state.
OS and Architecture. The PoC primarily targets 32-bit (x86) systems, though work on a 64-bit (amd64) version was in progress. Key points:
Key Functions in the proof of concept. Let's look at two crucial functions in the proof of concept exploit.
a) prepare_heap():
This function is similar to setting up a complicated domino pattern. It arranges the computer's memory (the heap) in a very specific way, creating the perfect conditions for the exploit to work. Here's what it does:
The goal is to create a predictable layout in the computer's memory that the exploit can take advantage of later.
void prepare_heap(int sock) {
    // Packet a: Allocate and free tcache chunks
    for (int i = 0; i < 10; i++) {
        unsigned char tcache_chunk[64];
        memset(tcache_chunk, 'A', sizeof(tcache_chunk));
        send_packet(sock, 5, tcache_chunk, sizeof(tcache_chunk));
    }
    // Packet b: Create 27 pairs of large (~8KB) and small (320B) holes
    for (int i = 0; i < 27; i++) {
        unsigned char large_hole[8192];
        memset(large_hole, 'B', sizeof(large_hole));
        send_packet(sock, 5, large_hole, sizeof(large_hole));
        unsigned char small_hole[320];
        memset(small_hole, 'C', sizeof(small_hole));
        send_packet(sock, 5, small_hole, sizeof(small_hole));
    }
    // ... more heap manipulation ...
}
b) attempt_race_condition():
This function is where the actual exploit attempt happens. It's like trying to knock over our domino pattern at exactly the right moment to create a specific effect. Here's how it works:
The exploit works by trying to interrupt the server's normal operation at a very specific moment. If timed correctly, it can manipulate the server's memory in a way that allows the attacker to run their own code with high-level (root) permissions.
int attempt_race_condition(int sock, double parsing_time, uint64_t glibc_base) {
    unsigned char final_packet[MAX_PACKET_SIZE];
    create_public_key_packet(final_packet, sizeof(final_packet), glibc_base);
    // Send all but the last byte
    if (send(sock, final_packet, sizeof(final_packet) - 1, 0) < 0) {
        perror("send final packet");
        return 0;
    }
    // Precise timing for last byte
    struct timespec start, current;
    clock_gettime(CLOCK_MONOTONIC, &start);
    while (1) {
        clock_gettime(CLOCK_MONOTONIC, &current);
        double elapsed = (current.tv_sec - start.tv_sec)
                       + (current.tv_nsec - start.tv_nsec) / 1e9;
        if (elapsed >= (LOGIN_GRACE_TIME - parsing_time - 0.001)) { // 1ms before SIGALRM
            if (send(sock, &final_packet[sizeof(final_packet) - 1], 1, 0) < 0) {
                perror("send last byte");
                return 0;
            }
            break;
        }
    }
    // ... check for successful exploitation ...
}
This process is extremely precise and requires many attempts to succeed. It's like trying to thread a needle while riding a rollercoaster: it might take thousands of tries to get it right. But if successful, it gives the attacker complete control over the server.
The complexity and precision required make this exploit challenging to pull off in real-world conditions. However, the potential impact if successful is severe, as it could allow an attacker to gain full control of a system without needing any login credentials. This is why it's crucial for system administrators to update their OpenSSH installations and implement other security measures to protect against such attacks.
The Splunk Threat Research team has developed a collection of Linux analytic stories that align with activity related to post-exploitation behaviors, providing defenders with powerful detection capabilities for Linux environments. Four key stories stand out in this collection:
These stories encompass a wide range of detection techniques, from anomaly detection to specific TTPs (Tactics, Techniques, and Procedures), all aligned with the MITRE ATT&CK framework. By implementing these analytic stories, organizations can significantly enhance their ability to detect, investigate, and respond to advanced threats targeting their Linux infrastructure across various stages of post-exploitation activity.
To conduct an inventory search and verify whether the OpenSSH version installed on an Ubuntu host is vulnerable to CVE-2024-6387, we can create a Splunk search query. This query analyzes package audit logs to identify the OpenSSH versions installed across our production network.
By leveraging the package audit logs, we can systematically track and audit all instances of OpenSSH installations. This approach ensures that we have a comprehensive overview of the software versions in use. Consequently, we can quickly identify any versions that are susceptible to known vulnerabilities.
Below is the simple search we created for this inventory:
index=unix source=package NAME="*openssh*"
| rex field=VERSION "^1:(?<ssh_version>\d+\.\d+)"
| eval ssh_version_number = tonumber(ssh_version)
| eval vulnerable_ssh_version = if(ssh_version_number >= 8.5 AND ssh_version_number < 9.8, "Vulnerable SSH Version", "SSH Version not Vulnerable")
| stats count by NAME VENDOR ssh_version ssh_version_number VERSION vulnerable_ssh_version
(Splunk query identifying OpenSSH packages, Splunk 2024)
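The same range check can be sketched outside Splunk, for example in a small inventory script. The C function below is an illustrative sketch, not part of the Splunk content. The name `openssh_version_in_range` is hypothetical, and unlike the `rex` above, which expects a leading `1:`, it treats the epoch prefix as optional. It compares major and minor as integers rather than as a decimal number. Like the search, it looks only at the upstream version string, so treat a match as a triage signal: distribution packages can carry backported fixes without changing that version.

```c
#include <stdio.h>
#include <string.h>

/* Classify a package version string such as "1:8.9p1-3ubuntu0.10".
 * Returns 1 if the upstream version is >= 8.5 and < 9.8,
 *         0 if it is outside that range,
 *        -1 if no "major.minor" pair could be parsed. */
int openssh_version_in_range(const char *pkg_version) {
    /* Skip an optional epoch prefix ("1:"). */
    const char *p = strchr(pkg_version, ':');
    p = (p != NULL) ? p + 1 : pkg_version;

    int major = 0, minor = 0;
    if (sscanf(p, "%d.%d", &major, &minor) != 2 || major < 0 || minor < 0) {
        return -1;
    }

    /* Compare as (major, minor) integer pairs, not as floating point. */
    int at_least_8_5 = (major > 8) || (major == 8 && minor >= 5);
    int below_9_8 = (major < 9) || (major == 9 && minor < 8);
    return (at_least_8_5 && below_9_8) ? 1 : 0;
}
```

Comparing integer pairs avoids the pitfall of decimal comparison, where a hypothetical minor version of 10 would be read as 0.1 and sort below 5.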
Look for increased rates of “Timeout before authentication” log messages from sshd on hosts running unpatched versions. Alternatively, if network monitoring is in place, look for high volumes of new connections, as the attack requires repeated fresh login attempts.
sourcetype=journald OR sourcetype=linux:auth OR TERM(sshd) OR TERM(ssh)
TERM(Timeout) TERM(authentication) "Timeout before authentication"
| timechart count by host
(Splunk query identifying Timeout Before Authentication, Splunk 2024)
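For hosts that are not yet forwarding logs to Splunk, the same signal can be approximated locally. The sketch below counts occurrences of the message in a buffer of sshd log text and compares the count against a threshold. It is a minimal illustration, not a tuned detection: the function names and the threshold are hypothetical, and a sensible threshold depends on each host's normal baseline.

```c
#include <stddef.h>
#include <string.h>

/* Count how many times sshd's pre-authentication timeout message
 * appears in a block of log text. */
size_t count_auth_timeouts(const char *log_text) {
    static const char needle[] = "Timeout before authentication";
    size_t count = 0;
    const char *p = log_text;

    while ((p = strstr(p, needle)) != NULL) {
        count++;
        p += sizeof(needle) - 1; /* continue searching after this match */
    }
    return count;
}

/* Returns 1 when the number of timeouts in the given window of log text
 * is above the caller-supplied threshold (an assumed, site-specific value). */
int auth_timeouts_exceed(const char *log_text, size_t threshold) {
    return count_auth_timeouts(log_text) > threshold ? 1 : 0;
}
```

Feeding the function one fixed time window of log text at a time (for example, the last five minutes) gives a rough equivalent of the `timechart` above for a single host.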
While patching is the most effective solution, there are several mitigation strategies organizations can employ to reduce their risk exposure to the CVE-2024-6387 vulnerability:
Patch Management
Access Control
Host-based Intrusion Prevention
SSH Configuration Hardening
Adjust the following settings in sshd_config to limit the effectiveness of attackers:
Network Segmentation
Multi-Factor Authentication (MFA): Implement MFA for SSH access as an additional layer of security. This doesn't prevent the initial exploit but can limit an attacker's ability to leverage compromised credentials.
Monitoring and Alerting
Alternative Access Methods: For critical systems, consider implementing alternative secure remote access methods that don't rely on SSH, such as VPN solutions with strong authentication.
Snort: Network defenders can also leverage signature-based detection. Cisco Talos has released a Snort signature (SID: 63659) to detect exploitation attempts of CVE-2024-6387. This provides an additional layer of protection for organizations using Snort in their security infrastructure.
Remember, these mitigations should be part of a defense-in-depth strategy. No single measure is foolproof, and the most effective protection comes from combining multiple security layers. Regularly review and update your security measures to address new threats and vulnerabilities as they emerge.
You can find the latest content about security analytic stories on GitHub and in the Splunk ES Content Update app. Splunk Security Essentials also has all these detections now available via push update. For a full list of security content, check out the release notes on Splunk Docs.
Any feedback or requests? Feel free to put in an issue on GitHub and we’ll follow up. Alternatively, join us on the Slack channel #security-research. Follow these instructions if you need an invitation to our Splunk user groups on Slack.
We would like to thank Michael Haag and Teoderick Contreras for authoring this post and the entire Splunk Threat Research Team for their contributions: Lou Stella, Bhavin Patel, Rod Soto, Eric McGinnis, Jose Hernandez and Patrick Bareiss. In addition, we thank James Hodgkinson of SURGe and Jonathan Heckinger, as well as Michael Gentile and Keith Lehrschall of Cisco Talos.