Insane Vulnerability In OpenSSH Discovered
Based on The PrimeTime's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
OpenSSH sshd can become exploitable when SIGALRM runs asynchronously after the login grace time and invokes async-signal-unsafe code, enabling heap inconsistency.
Briefing
OpenSSH’s sshd has a remote-code-execution path tied to a signal-handler race condition. If a client fails to authenticate within LoginGraceTime (120 seconds by default), sshd’s SIGALRM handler runs asynchronously and calls functions that are not async-signal-safe, such as syslog(). When the signal interrupts heap or logging code mid-operation and the handler then re-enters that code, memory is left in an inconsistent state that an attacker can steer into arbitrary code execution, including an unauthenticated remote root shell on affected Linux systems.
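To make the failure mode concrete, here is a toy Python model of non-reentrant bookkeeping being interrupted between two steps. It is only a sketch: `ToyFreeList` and its fields are hypothetical and have nothing to do with glibc's real allocator, and the "signal" is injected as a plain callback because Python does not run handlers truly asynchronously.

```python
class ToyFreeList:
    """Toy allocator bookkeeping: a list of free chunks plus a counter.

    The two fields must always agree. This is a hypothetical stand-in for
    real heap metadata, not a model of glibc malloc.
    """

    def __init__(self):
        self.free_chunks = []
        self.free_count = 0

    def release(self, chunk, interrupt=None):
        self.free_chunks.append(chunk)   # step 1: chunk is linked in
        if interrupt is not None:
            interrupt()                  # simulated signal lands between the steps
        self.free_count += 1             # step 2: counter catches up

    def is_consistent(self):
        return self.free_count == len(self.free_chunks)


heap = ToyFreeList()
observed = []


def handler():
    # A handler that touches the same structure sees it half-updated.
    observed.append(heap.is_consistent())


heap.release("chunk-A", interrupt=handler)
print(observed)              # what the handler saw mid-update
print(heap.is_consistent())  # state after release() finished
```

The handler observes an inconsistent structure even though the structure is fine before and after `release()`. An async-signal-safe handler avoids this by never touching such shared state.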
Researchers traced the issue to a regression of an older OpenSSH signal-handler bug, CVE-2006-5051. The regression was introduced in 2020 (OpenSSH 8.5p1) by a change to the logging infrastructure that accidentally removed an #ifdef (DO_LOG_SAFE_IN_SIGHAND) which had kept the SIGALRM handler on a signal-safe logging path. As a result, versions from 8.5p1 up to, but not including, 9.8p1 are vulnerable in the default configuration. Versions from 4.4p1 up to, but not including, 8.5p1 are not vulnerable, because the original fix for CVE-2006-5051 was still in place. Versions older than 4.4p1 are vulnerable unless they were patched for CVE-2006-5051 and CVE-2008-4109.
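The published version ranges for this bug can be written down as a small lookup: vulnerable before 4.4p1 unless patched, not vulnerable from 4.4p1 up to 8.5p1, vulnerable from 8.5p1 up to 9.8p1, fixed from 9.8p1. The sketch below assumes an upstream version string. The function names are made up for illustration, and distribution packages often backport fixes without changing the version number, so treat the result as a hint rather than proof.

```python
import re


def parse_version(text: str) -> tuple[int, int]:
    """Extract (major, minor) from '8.5p1' or a banner containing 'OpenSSH_9.2p1'."""
    match = re.search(r"OpenSSH_(\d+)\.(\d+)", text) or re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no OpenSSH version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def classify(text: str) -> str:
    """Map an upstream OpenSSH version to its status for this bug."""
    version = parse_version(text)
    if version < (4, 4):
        return "vulnerable unless patched for CVE-2006-5051 and CVE-2008-4109"
    if version < (8, 5):
        return "not vulnerable"
    if version < (9, 8):
        return "vulnerable"
    return "fixed"


for sample in ("4.3p2", "7.4p1", "8.5p1", "SSH-2.0-OpenSSH_9.2p1 Debian-2+deb12u2", "9.8p1"):
    print(sample, "->", classify(sample))
```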
Exploitation hinges on timing and heap manipulation. The core strategy is to make SIGALRM fire while sshd is inside specific malloc()/free() code paths, especially those reached during public-key parsing. Against older, easier-to-exploit OpenSSH builds, the work focused on interrupting a free() call inside the parsing logic, leaving the heap in a state that can be exploited during a subsequent free() inside the SIGALRM handler. Reported success rates are low but real: experiments averaged about 10,000 attempts per won race, which translated to roughly a week to obtain a remote root shell under that build's default settings. The two parameters that set the pace are MaxStartups, which caps concurrent unauthenticated connections, and LoginGraceTime, which is how long each attempt has to wait for the timeout.
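The "about a week" figure follows from simple arithmetic once the two sshd limits are fixed. The sketch below assumes every attempt has to wait out a full grace period and that the attacker keeps every allowed connection slot busy. The specific values (10 connections per 600 seconds for the old build, 100 connections per 120 seconds for a modern one) are assumptions for illustration and are not stated in the text above.

```python
def expected_seconds(attempts: int, max_startups: int, login_grace_time: int) -> float:
    """Time to make `attempts` tries when `max_startups` tries run in
    parallel and each batch takes one full `login_grace_time` (seconds)."""
    batches = attempts / max_startups
    return batches * login_grace_time


# Assumed settings for an old build: 10 parallel connections, 600 s grace time.
old_build = expected_seconds(10_000, max_startups=10, login_grace_time=600)
print(f"{old_build / 86_400:.1f} days")   # on the order of a week

# Assumed settings for a modern build: 100 parallel connections, 120 s grace time.
modern = expected_seconds(10_000, max_startups=100, login_grace_time=120)
print(f"{modern / 3_600:.1f} hours")
```

The same number of attempts costs days or hours depending only on how many connections sshd accepts in parallel and how long each one may linger.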
The research then shows the attack is not limited to one code path or one OpenSSH generation. For older Debian and Ubuntu builds still affected by the original bug, the researchers used different parsing targets (e.g., DSA public-key parsing) and improved their odds by splitting one large race window into many smaller ones, turning a single interrupt opportunity into dozens of free() opportunities within the same grace-time interval. They also adapted to newer mitigations and different libc behavior by chaining heap exploitation techniques (including “unlink”-style manipulation and “House of Mind”-style arena corruption) to gain control over function pointers.
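Why more windows help can be seen with a deliberately simple model: assume the signal lands at a uniformly random instant within some timing jitter, and that the exploitable windows are disjoint and equally wide. All numbers below are hypothetical and chosen only to show the scaling.

```python
def hit_probability(window_us: float, windows: int, jitter_us: float) -> float:
    """Chance that a signal landing uniformly within `jitter_us` falls
    inside any of `windows` disjoint windows of width `window_us`."""
    return min(1.0, windows * window_us / jitter_us)


def expected_attempts(probability: float) -> float:
    """Mean number of independent tries before the first hit."""
    return 1.0 / probability


single = hit_probability(window_us=1, windows=1, jitter_us=10_000)
many = hit_probability(window_us=1, windows=24, jitter_us=10_000)
print(expected_attempts(single), expected_attempts(many))
```

Under these assumptions the expected number of attempts falls in proportion to the number of windows, which is the effect the researchers were after.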
On modern Linux, the SIGALRM handler’s behavior becomes the linchpin. The team found that glibc’s malloc-family functions can be reached from within the SIGALRM path via syslog(), and that the relevant libc code paths can be interrupted to create exploitable heap inconsistencies. For i386, they describe a method that overwrites the single-byte _vtable_offset field of a glibc FILE structure (the one allocated while glibc reads the time-zone file), redirecting execution through attacker-controlled function pointers while the SIGALRM handler runs.
Mitigations are practical but nuanced. A June 6, 2024 fix removes the unsafe work from the asynchronous signal handler: the commit that added penalties for problematic client behavior also moved the async-signal-unsafe code into sshd’s synchronous listener process. If updating isn’t possible, setting LoginGraceTime to 0 in sshd_config prevents the remote-code-execution race, but it exposes sshd to denial of service, because idle unauthenticated connections can then occupy every MaxStartups slot. The write-up also notes that some platforms (notably OpenBSD) avoid the issue because their SIGALRM handler calls syslog_r(), an async-signal-safe variant of syslog().
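A quick way to check whether the fallback mitigation is in place is to read the effective LoginGraceTime from an sshd_config file. The sketch below is a simplified reader, not a full parser: it assumes the first occurrence of a keyword wins, stops at the first Match block, and ignores Include directives and the `keyword=value` form. The helper names are hypothetical.

```python
import re

_UNITS = {"": 1, "s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_time(value: str) -> int:
    """Convert an sshd_config time such as '120', '2m' or '1h30m' to seconds."""
    text = value.strip().lower()
    parts = re.findall(r"(\d+)([smhdw]?)", text)
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unsupported time format: {value!r}")
    return sum(int(number) * _UNITS[unit] for number, unit in parts)


def effective_login_grace_time(config_text: str, default: int = 120) -> int:
    """Return LoginGraceTime in seconds, or the default if it is not set."""
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        fields = line.split(None, 1)
        if fields[0].lower() == "match":
            break                                   # simplification: global section only
        if len(fields) == 2 and fields[0].lower() == "logingracetime":
            return parse_time(fields[1])            # first occurrence wins
    return default


sample = "# hardening\nPort 22\nLoginGraceTime 0\nLoginGraceTime 2m\n"
seconds = effective_login_grace_time(sample)
print("mitigated" if seconds == 0 else f"grace time is {seconds}s")
```

A value of 0 disables the timeout and with it the SIGALRM race, at the cost described above: unauthenticated connections are never dropped for being slow.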
Overall, the core finding is that a small regression in “signal-safe logging” reintroduced an async-signal-unsafe execution path in sshd. With enough timing precision and heap shaping, that turns an authentication grace-time timeout into a pathway to remote root on affected Linux distributions.
Cornell Notes
The vulnerability is a regression in OpenSSH’s sshd where the SIGALRM handler (triggered when an unauthenticated client misses LoginGraceTime) calls functions that are not async-signal-safe. An attacker can therefore interrupt malloc()/free() and related heap or logging code mid-operation, leaving the heap in an inconsistent state that is then exploited when the SIGALRM handler itself re-enters that code. Researchers identified the bug as a regression of CVE-2006-5051 and traced it to a 2020 OpenSSH logging change (OpenSSH 8.5p1) that removed the guard keeping the signal handler on a signal-safe logging path. Exploitation required heavy timing work and heap grooming, but experiments achieved remote root shells on affected Linux systems with on the order of 10,000 attempts per won race. Fixes include moving the unsafe work out of the signal handler and, as a fallback, setting LoginGraceTime to 0 to prevent the RCE race (at the cost of potential DoS via connection exhaustion).
Why does an authentication timeout (login grace time) become a remote-code-execution trigger?
What changed in OpenSSH that reintroduced the bug?
How do attackers make the race condition practical rather than purely theoretical?
How did the attack adapt to newer versions with stronger mitigations?
What mitigations reduce risk, and what trade-offs remain?
Review Questions
- Which specific condition triggers the vulnerable SIGALRM handler path in sshd, and what property of the handler makes it exploitable?
- How did the 2020 logging change (OpenSSH 8.5p1) affect the presence or absence of the “log safe” guard in the signal handler?
- Why does expanding a “large race window” into many “small race windows” increase the attacker’s odds of winning the race?
Key Points
1. OpenSSH sshd can become exploitable when SIGALRM runs asynchronously after the login grace time and invokes async-signal-unsafe code, enabling heap inconsistency.
2. The vulnerability is a regression of CVE-2006-5051, reintroduced by a logging-infrastructure change in OpenSSH 8.5p1 that removed a “log safe” guard for the signal handler.
3. In the default configuration, vulnerable versions are 8.5p1 up to (but not including) 9.8p1, plus versions older than 4.4p1 that lack the patches for CVE-2006-5051 and CVE-2008-4109. Versions from 4.4p1 up to (but not including) 8.5p1 are not vulnerable.
4. Exploitation relies on interrupting malloc()/free() operations during public-key parsing, then using the code run by the SIGALRM handler to act on the corrupted heap metadata.
5. Attack reliability improves by turning one interrupt opportunity into many smaller race windows within the same grace-time interval.
6. Mitigations include the June 6, 2024 fix that moves unsafe work out of the signal handler into a synchronous process, and a fallback of setting LoginGraceTime to 0 (which trades RCE risk for potential DoS).