Vulnerabilities > CVE-2021-46939 - Improper Synchronization vulnerability in Linux Kernel
Summary
In the Linux kernel, the following vulnerability has been resolved: tracing: Restructure trace_clock_global() to never block It was reported that a fix to the ring buffer recursion detection would cause a hung machine when performing suspend / resume testing. The following backtrace was extracted from debugging that case: Call Trace: trace_clock_global+0x91/0xa0 __rb_reserve_next+0x237/0x460 ring_buffer_lock_reserve+0x12a/0x3f0 trace_buffer_lock_reserve+0x10/0x50 __trace_graph_return+0x1f/0x80 trace_graph_return+0xb7/0xf0 ? trace_clock_global+0x91/0xa0 ftrace_return_to_handler+0x8b/0xf0 ? pv_hash+0xa0/0xa0 return_to_handler+0x15/0x30 ? ftrace_graph_caller+0xa0/0xa0 ? trace_clock_global+0x91/0xa0 ? __rb_reserve_next+0x237/0x460 ? ring_buffer_lock_reserve+0x12a/0x3f0 ? trace_event_buffer_lock_reserve+0x3c/0x120 ? trace_event_buffer_reserve+0x6b/0xc0 ? trace_event_raw_event_device_pm_callback_start+0x125/0x2d0 ? dpm_run_callback+0x3b/0xc0 ? pm_ops_is_empty+0x50/0x50 ? platform_get_irq_byname_optional+0x90/0x90 ? trace_device_pm_callback_start+0x82/0xd0 ? dpm_run_callback+0x49/0xc0 With the following RIP: RIP: 0010:native_queued_spin_lock_slowpath+0x69/0x200 Since the fix to the recursion detection would allow a single recursion to happen while tracing, this lead to the trace_clock_global() taking a spin lock and then trying to take it again: ring_buffer_lock_reserve() { trace_clock_global() { arch_spin_lock() { queued_spin_lock_slowpath() { /* lock taken */ (something else gets traced by function graph tracer) ring_buffer_lock_reserve() { trace_clock_global() { arch_spin_lock() { queued_spin_lock_slowpath() { /* DEAD LOCK! */ Tracing should *never* block, as it can lead to strange lockups like the above. Restructure the trace_clock_global() code to instead of simply taking a lock to update the recorded "prev_time" simply use it, as two events happening on two different CPUs that calls this at the same time, really doesn't matter which one goes first. Use a trylock to grab the lock for updating the prev_time, and if it fails, simply try again the next time. If it failed to be taken, that means something else is already updating it. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212761
Vulnerable Configurations
Common Weakness Enumeration (CWE)
Common Attack Pattern Enumeration and Classification (CAPEC)
- Forced Deadlock This attack attempts to trigger and exploit a deadlock condition in the target software to cause a denial of service. A deadlock can occur when two or more competing actions are waiting for each other to finish, and thus neither ever does. Deadlock condition are not easy to detect.
- Leveraging Race Conditions This attack targets a race condition occurring when multiple processes access and manipulate the same resource concurrently and the outcome of the execution depends on the particular order in which the access takes place. The attacker can leverage a race condition by "running the race", modifying the resource and modifying the normal execution flow. For instance a race condition can occur while accessing a file, the attacker can trick the system by replacing the original file with his version and cause the system to read the malicious file.
- Leveraging Race Conditions via Symbolic Links This attack leverages the use of symbolic links (Symlinks) in order to write to sensitive files. An attacker can create a Symlink link to a target file not otherwise accessible to her. When the privileged program tries to create a temporary file with the same name as the Symlink link, it will actually write to the target file pointed to by the attackers' Symlink link. If the attacker can insert malicious content in the temporary file she will be writing to the sensitive file by using the Symlink. The race occurs because the system checks if the temporary file exists, then creates the file. The attacker would typically create the Symlink during the interval between the check and the creation of the temporary file.
- Leveraging Time-of-Check and Time-of-Use (TOCTOU) Race Conditions This attack targets a race condition occurring between the time of check (state) for a resource and the time of use of a resource. The typical example is the file access. The attacker can leverage a file access race condition by "running the race", meaning that he would modify the resource between the first time the target program accesses the file and the time the target program uses the file. During that period of time, the attacker could do something such as replace the file and cause an escalation of privilege.
References
- https://git.kernel.org/stable/c/91ca6f6a91f679c8645d7f3307e03ce86ad518c4
- https://git.kernel.org/stable/c/859b47a43f5a0e5b9a92b621dc6ceaad39fb5c8b
- https://git.kernel.org/stable/c/1fca00920327be96f3318224f502e4d5460f9545
- https://git.kernel.org/stable/c/d43d56dbf452ccecc1ec735cd4b6840118005d7c
- https://git.kernel.org/stable/c/c64da3294a7d59a4bf6874c664c13be892f15f44
- https://git.kernel.org/stable/c/a33614d52e97fc8077eb0b292189ca7d964cc534
- https://git.kernel.org/stable/c/6e2418576228eeb12e7ba82edb8f9500623942ff
- https://git.kernel.org/stable/c/2a1bd74b8186d7938bf004f5603f25b84785f63e
- https://git.kernel.org/stable/c/aafe104aa9096827a429bc1358f8260ee565b7cc