| In the Linux kernel, the following vulnerability has been resolved:
net: phy: transfer phy_config_inband() locking responsibility to phylink
Problem description
===================
Lockdep reports a possible circular locking dependency (AB/BA) between
&pl->state_mutex and &phy->lock, as follows.
phylink_resolve() // acquires &pl->state_mutex
-> phylink_major_config()
-> phy_config_inband() // acquires &pl->phydev->lock
whereas all the other call sites that take both &pl->state_mutex and
&pl->phydev->lock use the reverse locking order. Everywhere else,
&pl->phydev->lock is acquired at the top level, and &pl->state_mutex at
the lower level. A clear example is phylink_bringup_phy().
The outlier is the newly introduced phy_config_inband() and the existing
lock order is the correct one. To understand why it cannot be the other
way around, it is sufficient to consider phylink_phy_change(), phylink's
callback from the PHY device's phy->phy_link_change() virtual method,
invoked by the PHY state machine.
phy_link_up() and phy_link_down(), the (indirect) callers of
phylink_phy_change(), are called with &phydev->lock acquired.
Then phylink_phy_change() acquires its own &pl->state_mutex, to
serialize changes made to its pl->phy_state and pl->link_config.
So all other instances of &pl->state_mutex and &phydev->lock must be
consistent with this order.
Problem impact
==============
I think the kernel runs a serious deadlock risk if an existing
phylink_resolve() thread, which results in a phy_config_inband() call,
is concurrent with a phy_link_up() or phy_link_down() call, which will
deadlock on &pl->state_mutex in phylink_phy_change(). Practically
speaking, the impact may be limited by the slow speed of the medium
auto-negotiation protocol, which makes it unlikely for the current state
to still be unresolved when a new one is detected, but I think the
problem is real nonetheless; it was, after all, discovered using lockdep.
Proposed solution
=================
Practically speaking, the phy_config_inband() requirement of having
phydev->lock acquired must transfer to the caller (phylink is the only
caller). There, it must bubble up until immediately before
&pl->state_mutex is acquired, for the cases where that takes place.
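For illustration only, a minimal sketch of the resulting caller-side pattern (a simplification of the idea described above, not the actual phylink patch; the surrounding call-site code is omitted):
	if (pl->phydev)
		mutex_lock(&pl->phydev->lock);	/* outer: PHY device lock */
	mutex_lock(&pl->state_mutex);		/* inner: phylink state */

	/* ... phylink_major_config() -> phy_config_inband() runs here ... */

	mutex_unlock(&pl->state_mutex);
	if (pl->phydev)
		mutex_unlock(&pl->phydev->lock);

	/* ...while phy_config_inband() itself no longer takes the lock and
	 * instead only asserts it: lockdep_assert_held(&phydev->lock); */
This keeps the outer-to-inner order consistent with phylink_bringup_phy() and phylink_phy_change().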
Solution details, considerations, notes
=======================================
This is the phy_config_inband() call graph:
sfp_upstream_ops :: connect_phy()
|
v
phylink_sfp_connect_phy()
|
v
phylink_sfp_config_phy()
|
| sfp_upstream_ops :: module_insert()
| |
| v
| phylink_sfp_module_insert()
| |
| | sfp_upstream_ops :: module_start()
| | |
| | v
| | phylink_sfp_module_start()
| | |
| v v
| phylink_sfp_config_optical()
phylink_start() | |
| phylink_resume() v v
| | phylink_sfp_set_config()
| | |
v v v
phylink_mac_initial_config()
| phylink_resolve()
| | phylink_ethtool_ksettings_set()
v v v
phylink_major_config()
|
v
phy_config_inband()
phylink_major_config() caller #1, phylink_mac_initial_config(), does not
acquire &pl->state_mutex nor do its callers. It must acquire
&pl->phydev->lock prior to calling phylink_major_config().
phylink_major_config() caller #2, phylink_resolve(), acquires
&pl->state_mutex and thus also needs to acquire &pl->phydev->lock.
phylink_major_config() caller #3, phylink_ethtool_ksettings_set(), is
completely uninteresting, because it only call
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
led: qcom-lpg: Fix sleeping in atomic
The lpg_brightness_set() function can sleep, while an LED's brightness_set()
callback must be non-blocking. Change the LPG driver to use
brightness_set_blocking() instead.
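For illustration, a hedged sketch of the switch (the registration code, 'led', 'dev' and 'ret' are assumptions drawn from context, not the actual driver diff); the trace that prompted the change follows:
	/* Before (runs in atomic context via led_set_brightness_nosleep(),
	 * e.g. from the heartbeat trigger's timer, yet takes a mutex):
	 *	led->cdev.brightness_set = lpg_brightness_single_set;
	 *
	 * After: blocking callbacks are invoked by the LED core from process
	 * context, where sleeping on the driver mutex is fine. The callback
	 * prototype changes to return int accordingly. */
	led->cdev.brightness_set_blocking = lpg_brightness_single_set;
	ret = devm_led_classdev_register(dev, &led->cdev);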
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/0
preempt_count: 101, expected: 0
INFO: lockdep is turned off.
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.1.0-rc1-00014-gbe99b089c6fc-dirty #85
Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
Call trace:
dump_backtrace.part.0+0xe4/0xf0
show_stack+0x18/0x40
dump_stack_lvl+0x88/0xb4
dump_stack+0x18/0x34
__might_resched+0x170/0x254
__might_sleep+0x48/0x9c
__mutex_lock+0x4c/0x400
mutex_lock_nested+0x2c/0x40
lpg_brightness_single_set+0x40/0x90
led_set_brightness_nosleep+0x34/0x60
led_heartbeat_function+0x80/0x170
call_timer_fn+0xb8/0x340
__run_timers.part.0+0x20c/0x254
run_timer_softirq+0x3c/0x7c
_stext+0x14c/0x578
____do_softirq+0x10/0x20
call_on_irq_stack+0x2c/0x5c
do_softirq_own_stack+0x1c/0x30
__irq_exit_rcu+0x164/0x170
irq_exit_rcu+0x10/0x40
el1_interrupt+0x38/0x50
el1h_64_irq_handler+0x18/0x2c
el1h_64_irq+0x64/0x68
cpuidle_enter_state+0xc8/0x380
cpuidle_enter+0x38/0x50
do_idle+0x244/0x2d0
cpu_startup_entry+0x24/0x30
rest_init+0x128/0x1a0
arch_post_acpi_subsys_init+0x0/0x18
start_kernel+0x6f4/0x734
__primary_switched+0xbc/0xc4 |
| In the Linux kernel, the following vulnerability has been resolved:
padata: Always leave BHs disabled when running ->parallel()
A deadlock can happen when an overloaded system runs ->parallel() in the
context of the current task:
padata_do_parallel
->parallel()
pcrypt_aead_enc/dec
padata_do_serial
spin_lock(&reorder->lock) // BHs still enabled
<interrupt>
...
__do_softirq
...
padata_do_serial
spin_lock(&reorder->lock)
It's a bug for BHs to be on in padata_do_serial(), as Steffen points out, so
ensure they're off in the "current task" case, like they are in
padata_parallel_worker(), to avoid this situation. |
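A minimal sketch of the idea described above (the 'queued_to_worker' condition is a hypothetical stand-in for the real fallback check in padata_do_parallel(); not the exact upstream diff):
	/* In padata_do_parallel(), when falling back to running the job in
	 * the current task instead of queueing it to a worker: */
	if (!queued_to_worker) {
		/* padata_parallel_worker() runs ->parallel() with BHs off;
		 * mirror that here so padata_do_serial() cannot be interrupted
		 * by a softirq that re-enters the reorder lock. */
		local_bh_disable();
		padata->parallel(padata);
		local_bh_enable();
	}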
| A post-authentication flaw in the network two-phase commit protocol used for cross-shard transactions in MongoDB Server may lead to logical data inconsistencies under specific, unpredictable conditions that exist only for a very short period of time. The error can cause the transaction coordination logic to misinterpret a transaction as committed, resulting in inconsistent state on the affected shards. This may have a low impact on integrity and availability.
This issue impacts MongoDB Server v8.0 versions prior to 8.0.16, MongoDB Server v7.0 versions prior to 7.0.26, and MongoDB Server v8.2 versions prior to 8.2.2. |
| A flaw was found in the X server's request handling. Non-zero 'bytes to ignore' in a client's request can cause the server to skip processing another client's request, potentially leading to a denial of service. |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix deadlock when aborting transaction during relocation with scrub
Before relocating a block group we pause scrub, then do the relocation and
then unpause scrub. The relocation process requires starting and committing
a transaction, and if we have a failure in the critical section of the
transaction commit path (transaction state >= TRANS_STATE_COMMIT_START),
we will deadlock if there is a paused scrub.
That results in stack traces like the following:
[42.479] BTRFS info (device sdc): relocating block group 53876686848 flags metadata|raid6
[42.936] BTRFS warning (device sdc): Skipping commit of aborted transaction.
[42.936] ------------[ cut here ]------------
[42.936] BTRFS: Transaction aborted (error -28)
[42.936] WARNING: CPU: 11 PID: 346822 at fs/btrfs/transaction.c:1977 btrfs_commit_transaction+0xcc8/0xeb0 [btrfs]
[42.936] Modules linked in: dm_flakey dm_mod loop btrfs (...)
[42.936] CPU: 11 PID: 346822 Comm: btrfs Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1
[42.936] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[42.936] RIP: 0010:btrfs_commit_transaction+0xcc8/0xeb0 [btrfs]
[42.936] Code: ff ff 45 8b (...)
[42.936] RSP: 0018:ffffb58649633b48 EFLAGS: 00010282
[42.936] RAX: 0000000000000000 RBX: ffff8be6ef4d5bd8 RCX: 0000000000000000
[42.936] RDX: 0000000000000002 RSI: ffffffffb35e7782 RDI: 00000000ffffffff
[42.936] RBP: ffff8be6ef4d5c98 R08: 0000000000000000 R09: ffffb586496339e8
[42.936] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8be6d38c7c00
[42.936] R13: 00000000ffffffe4 R14: ffff8be6c268c000 R15: ffff8be6ef4d5cf0
[42.936] FS: 00007f381a82b340(0000) GS:ffff8beddfcc0000(0000) knlGS:0000000000000000
[42.936] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42.936] CR2: 00007f1e35fb7638 CR3: 0000000117680006 CR4: 0000000000370ee0
[42.936] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[42.936] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[42.936] Call Trace:
[42.936] <TASK>
[42.936] ? start_transaction+0xcb/0x610 [btrfs]
[42.936] prepare_to_relocate+0x111/0x1a0 [btrfs]
[42.936] relocate_block_group+0x57/0x5d0 [btrfs]
[42.936] ? btrfs_wait_nocow_writers+0x25/0xb0 [btrfs]
[42.936] btrfs_relocate_block_group+0x248/0x3c0 [btrfs]
[42.936] ? __pfx_autoremove_wake_function+0x10/0x10
[42.936] btrfs_relocate_chunk+0x3b/0x150 [btrfs]
[42.936] btrfs_balance+0x8ff/0x11d0 [btrfs]
[42.936] ? __kmem_cache_alloc_node+0x14a/0x410
[42.936] btrfs_ioctl+0x2334/0x32c0 [btrfs]
[42.937] ? mod_objcg_state+0xd2/0x360
[42.937] ? refill_obj_stock+0xb0/0x160
[42.937] ? seq_release+0x25/0x30
[42.937] ? __rseq_handle_notify_resume+0x3b5/0x4b0
[42.937] ? percpu_counter_add_batch+0x2e/0xa0
[42.937] ? __x64_sys_ioctl+0x88/0xc0
[42.937] __x64_sys_ioctl+0x88/0xc0
[42.937] do_syscall_64+0x38/0x90
[42.937] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[42.937] RIP: 0033:0x7f381a6ffe9b
[42.937] Code: 00 48 89 44 24 (...)
[42.937] RSP: 002b:00007ffd45ecf060 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[42.937] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f381a6ffe9b
[42.937] RDX: 00007ffd45ecf150 RSI: 00000000c4009420 RDI: 0000000000000003
[42.937] RBP: 0000000000000003 R08: 0000000000000013 R09: 0000000000000000
[42.937] R10: 00007f381a60c878 R11: 0000000000000246 R12: 00007ffd45ed0423
[42.937] R13: 00007ffd45ecf150 R14: 0000000000000000 R15: 00007ffd45ecf148
[42.937] </TASK>
[42.937] ---[ end trace 0000000000000000 ]---
[42.937] BTRFS: error (device sdc: state A) in cleanup_transaction:1977: errno=-28 No space left
[59.196] INFO: task btrfs:346772 blocked for more than 120 seconds.
[59.196] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1
[59.196] "echo 0 > /proc/sys/kernel/hung_
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
net/smc: fix deadlock triggered by cancel_delayed_work_sync()
The following LOCKDEP was detected:
Workqueue: events smc_lgr_free_work [smc]
WARNING: possible circular locking dependency detected
6.1.0-20221027.rc2.git8.56bc5b569087.300.fc36.s390x+debug #1 Not tainted
------------------------------------------------------
kworker/3:0/176251 is trying to acquire lock:
00000000f1467148 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0},
at: __flush_workqueue+0x7a/0x4f0
but task is already holding lock:
0000037fffe97dc8 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0},
at: process_one_work+0x232/0x730
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0}:
__lock_acquire+0x58e/0xbd8
lock_acquire.part.0+0xe2/0x248
lock_acquire+0xac/0x1c8
__flush_work+0x76/0xf0
__cancel_work_timer+0x170/0x220
__smc_lgr_terminate.part.0+0x34/0x1c0 [smc]
smc_connect_rdma+0x15e/0x418 [smc]
__smc_connect+0x234/0x480 [smc]
smc_connect+0x1d6/0x230 [smc]
__sys_connect+0x90/0xc0
__do_sys_socketcall+0x186/0x370
__do_syscall+0x1da/0x208
system_call+0x82/0xb0
-> #3 (smc_client_lgr_pending){+.+.}-{3:3}:
__lock_acquire+0x58e/0xbd8
lock_acquire.part.0+0xe2/0x248
lock_acquire+0xac/0x1c8
__mutex_lock+0x96/0x8e8
mutex_lock_nested+0x32/0x40
smc_connect_rdma+0xa4/0x418 [smc]
__smc_connect+0x234/0x480 [smc]
smc_connect+0x1d6/0x230 [smc]
__sys_connect+0x90/0xc0
__do_sys_socketcall+0x186/0x370
__do_syscall+0x1da/0x208
system_call+0x82/0xb0
-> #2 (sk_lock-AF_SMC){+.+.}-{0:0}:
__lock_acquire+0x58e/0xbd8
lock_acquire.part.0+0xe2/0x248
lock_acquire+0xac/0x1c8
lock_sock_nested+0x46/0xa8
smc_tx_work+0x34/0x50 [smc]
process_one_work+0x30c/0x730
worker_thread+0x62/0x420
kthread+0x138/0x150
__ret_from_fork+0x3c/0x58
ret_from_fork+0xa/0x40
-> #1 ((work_completion)(&(&smc->conn.tx_work)->work)){+.+.}-{0:0}:
__lock_acquire+0x58e/0xbd8
lock_acquire.part.0+0xe2/0x248
lock_acquire+0xac/0x1c8
process_one_work+0x2bc/0x730
worker_thread+0x62/0x420
kthread+0x138/0x150
__ret_from_fork+0x3c/0x58
ret_from_fork+0xa/0x40
-> #0 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0}:
check_prev_add+0xd8/0xe88
validate_chain+0x70c/0xb20
__lock_acquire+0x58e/0xbd8
lock_acquire.part.0+0xe2/0x248
lock_acquire+0xac/0x1c8
__flush_workqueue+0xaa/0x4f0
drain_workqueue+0xaa/0x158
destroy_workqueue+0x44/0x2d8
smc_lgr_free+0x9e/0xf8 [smc]
process_one_work+0x30c/0x730
worker_thread+0x62/0x420
kthread+0x138/0x150
__ret_from_fork+0x3c/0x58
ret_from_fork+0xa/0x40
other info that might help us debug this:
Chain exists of:
(wq_completion)smc_tx_wq-00000000#2
--> smc_client_lgr_pending
--> (work_completion)(&(&lgr->free_work)->work)
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock((work_completion)(&(&lgr->free_work)->work));
lock(smc_client_lgr_pending);
lock((work_completion)
(&(&lgr->free_work)->work));
lock((wq_completion)smc_tx_wq-00000000#2);
*** DEADLOCK ***
2 locks held by kworker/3:0/176251:
#0: 0000000080183548
((wq_completion)events){+.+.}-{0:0},
at: process_one_work+0x232/0x730
#1: 0000037fffe97dc8
((work_completion)
(&(&lgr->free_work)->work)){+.+.}-{0:0},
at: process_one_work+0x232/0x730
stack backtr
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Fix lockdep assertion on sync reset unload event
Fix lockdep assertion triggered during sync reset unload event. When the
sync reset flow is initiated using the devlink reload fw_activate
option, the PF already holds the devlink lock while handling unload
event. In this case, delegate sync reset unload event handling back to
the devlink callback process to avoid double-locking and resolve the
lockdep warning.
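A hedged sketch of the delegation pattern described above (the work/field names such as reset_unload_work and reload_in_progress, and the exact function signatures, are assumptions, not the actual mlx5 code):
	static void mlx5_sync_reset_unload_event(struct work_struct *work)
	{
		struct mlx5_fw_reset *fw_reset =
			container_of(work, struct mlx5_fw_reset, reset_unload_work);
		struct mlx5_core_dev *dev = fw_reset->dev;

		if (fw_reset->reload_in_progress) {
			/* devlink reload fw_activate path: the devlink callback
			 * already holds the devlink lock and performs the unload
			 * itself; unloading here would trip devl_assert_locked()
			 * or require taking the lock twice. */
			return;
		}

		devl_lock(priv_to_devlink(dev));
		mlx5_unload_one_devl_locked(dev, false);
		devl_unlock(priv_to_devlink(dev));
	}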
Kernel log:
WARNING: CPU: 9 PID: 1578 at devl_assert_locked+0x31/0x40
[...]
Call Trace:
<TASK>
mlx5_unload_one_devl_locked+0x2c/0xc0 [mlx5_core]
mlx5_sync_reset_unload_event+0xaf/0x2f0 [mlx5_core]
process_one_work+0x222/0x640
worker_thread+0x199/0x350
kthread+0x10b/0x230
? __pfx_worker_thread+0x10/0x10
? __pfx_kthread+0x10/0x10
ret_from_fork+0x8e/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK> |
| In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8723bs: Fix locking in _rtw_join_timeout_handler()
Commit 041879b12ddb ("drivers: staging: rtl8192bs: Fix deadlock in
rtw_joinbss_event_prehandle()") besides fixing the deadlock also
modified _rtw_join_timeout_handler() to use spin_[un]lock_irq()
instead of spin_[un]lock_bh().
_rtw_join_timeout_handler() calls rtw_do_join() which takes
pmlmepriv->scanned_queue.lock using spin_[un]lock_bh(). This
spin_unlock_bh() call re-enables softirqs which triggers an oops in
kernel/softirq.c: __local_bh_enable_ip() when it calls
lockdep_assert_irqs_enabled():
[ 244.506087] WARNING: CPU: 2 PID: 0 at kernel/softirq.c:376 __local_bh_enable_ip+0xa6/0x100
...
[ 244.509022] Call Trace:
[ 244.509048] <IRQ>
[ 244.509100] _rtw_join_timeout_handler+0x134/0x170 [r8723bs]
[ 244.509468] ? __pfx__rtw_join_timeout_handler+0x10/0x10 [r8723bs]
[ 244.509772] ? __pfx__rtw_join_timeout_handler+0x10/0x10 [r8723bs]
[ 244.510076] call_timer_fn+0x95/0x2a0
[ 244.510200] __run_timers.part.0+0x1da/0x2d0
This oops is caused by the switch to spin_[un]lock_irq(), which disables
IRQs for the entire duration of _rtw_join_timeout_handler().
Disabling IRQs is not necessary, since all code taking this lock
runs either from user context or from softirqs; switch back to
spin_[un]lock_bh() to fix this. |
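A minimal sketch of the resulting locking in the timer handler (simplified; struct and field names follow the driver loosely and the handler body is omitted):
	static void _rtw_join_timeout_handler(struct timer_list *t)
	{
		struct adapter *adapter = from_timer(adapter, t, mlmepriv.assoc_timer);
		struct mlme_priv *pmlmepriv = &adapter->mlmepriv;

		/* A BH-disabling lock is enough: every user of this lock runs in
		 * user context or softirq, and it nests cleanly with the
		 * spin_lock_bh() taken on scanned_queue.lock in rtw_do_join(). */
		spin_lock_bh(&pmlmepriv->lock);
		/* ... retry or give up on the join attempt ... */
		spin_unlock_bh(&pmlmepriv->lock);
	}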
| In the Linux kernel, the following vulnerability has been resolved:
wifi: cfg80211: Add missing lock in cfg80211_check_and_end_cac()
Callers of wdev_chandef() must hold the wiphy mutex.
But the worker cfg80211_propagate_cac_done_wk() never takes the lock,
which triggers the warning below with the mesh_peer_connected_dfs
test from hostapd and not-yet-released mac80211 code changes:
WARNING: CPU: 0 PID: 495 at net/wireless/chan.c:1552 wdev_chandef+0x60/0x165
Modules linked in:
CPU: 0 UID: 0 PID: 495 Comm: kworker/u4:2 Not tainted 6.14.0-rc5-wt-g03960e6f9d47 #33 13c287eeabfe1efea01c0bcc863723ab082e17cf
Workqueue: cfg80211 cfg80211_propagate_cac_done_wk
Stack:
00000000 00000001 ffffff00 6093267c
00000000 6002ec30 6d577c50 60037608
00000000 67e8d108 6063717b 00000000
Call Trace:
[<6002ec30>] ? _printk+0x0/0x98
[<6003c2b3>] show_stack+0x10e/0x11a
[<6002ec30>] ? _printk+0x0/0x98
[<60037608>] dump_stack_lvl+0x71/0xb8
[<6063717b>] ? wdev_chandef+0x60/0x165
[<6003766d>] dump_stack+0x1e/0x20
[<6005d1b7>] __warn+0x101/0x20f
[<6005d3a8>] warn_slowpath_fmt+0xe3/0x15d
[<600b0c5c>] ? mark_lock.part.0+0x0/0x4ec
[<60751191>] ? __this_cpu_preempt_check+0x0/0x16
[<600b11a2>] ? mark_held_locks+0x5a/0x6e
[<6005d2c5>] ? warn_slowpath_fmt+0x0/0x15d
[<60052e53>] ? unblock_signals+0x3a/0xe7
[<60052f2d>] ? um_set_signals+0x2d/0x43
[<60751191>] ? __this_cpu_preempt_check+0x0/0x16
[<607508b2>] ? lock_is_held_type+0x207/0x21f
[<6063717b>] wdev_chandef+0x60/0x165
[<605f89b4>] regulatory_propagate_dfs_state+0x247/0x43f
[<60052f00>] ? um_set_signals+0x0/0x43
[<605e6bfd>] cfg80211_propagate_cac_done_wk+0x3a/0x4a
[<6007e460>] process_scheduled_works+0x3bc/0x60e
[<6007d0ec>] ? move_linked_works+0x4d/0x81
[<6007d120>] ? assign_work+0x0/0xaa
[<6007f81f>] worker_thread+0x220/0x2dc
[<600786ef>] ? set_pf_worker+0x0/0x57
[<60087c96>] ? to_kthread+0x0/0x43
[<6008ab3c>] kthread+0x2d3/0x2e2
[<6007f5ff>] ? worker_thread+0x0/0x2dc
[<6006c05b>] ? calculate_sigpending+0x0/0x56
[<6003b37d>] new_thread_handler+0x4a/0x64
irq event stamp: 614611
hardirqs last enabled at (614621): [<00000000600bc96b>] __up_console_sem+0x82/0xaf
hardirqs last disabled at (614630): [<00000000600bc92c>] __up_console_sem+0x43/0xaf
softirqs last enabled at (614268): [<00000000606c55c6>] __ieee80211_wake_queue+0x933/0x985
softirqs last disabled at (614266): [<00000000606c52d6>] __ieee80211_wake_queue+0x643/0x985 |
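A minimal sketch of the kind of fix the title above describes, assuming the wiphy mutex is taken with wiphy_lock()/wiphy_unlock() around the CAC check (a simplification, not the actual patch):
	/* In the path reached from cfg80211_propagate_cac_done_wk(), take the
	 * wiphy mutex before walking wdevs and calling wdev_chandef(): */
	wiphy_lock(&rdev->wiphy);
	cfg80211_check_and_end_cac(rdev);
	wiphy_unlock(&rdev->wiphy);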
| In the Linux kernel, the following vulnerability has been resolved:
net: hinic: avoid kernel hung in hinic_get_stats64()
When using a hinic device as a bond slave device and reading the device
stats of the master bond device, the kernel may hang.
The kernel panic call trace is as follows:
Kernel panic - not syncing: softlockup: hung tasks
Call trace:
native_queued_spin_lock_slowpath+0x1ec/0x31c
dev_get_stats+0x60/0xcc
dev_seq_printf_stats+0x40/0x120
dev_seq_show+0x1c/0x40
seq_read_iter+0x3c8/0x4dc
seq_read+0xe0/0x130
proc_reg_read+0xa8/0xe0
vfs_read+0xb0/0x1d4
ksys_read+0x70/0xfc
__arm64_sys_read+0x20/0x30
el0_svc_common+0x88/0x234
do_el0_svc+0x2c/0x90
el0_svc+0x1c/0x30
el0_sync_handler+0xa8/0xb0
el0_sync+0x148/0x180
And the call trace of the task that actually caused the kernel hang is as follows:
__switch_to+124
__schedule+548
schedule+72
schedule_timeout+348
__down_common+188
__down+24
down+104
hinic_get_stats64+44 [hinic]
dev_get_stats+92
bond_get_stats+172 [bonding]
dev_get_stats+92
dev_seq_printf_stats+60
dev_seq_show+24
seq_read_iter+964
seq_read+220
proc_reg_read+164
vfs_read+172
ksys_read+108
__arm64_sys_read+28
el0_svc_common+132
do_el0_svc+40
el0_svc+24
el0_sync_handler+164
el0_sync+324
When getting device stats from the bond, the kernel calls bond_get_stats().
It first takes the spinlock bond->stats_lock and then calls
hinic_get_stats64() to collect the hinic device's stats.
However, hinic_get_stats64() calls `down(&nic_dev->mgmt_lock)` to
protect its critical section, which may schedule the current task out.
And if the system is under high pressure, the task cannot be woken up
immediately, which eventually triggers the softlockup ("hung tasks") panic.
Since a previous patch replaced hinic_dev.tx_stats/rx_stats with local
variables in hinic_get_stats64(), there is nothing left that needs to be
protected by the lock, so just removing the down()/up() calls is enough. |
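A hedged sketch of the resulting non-blocking stats path (simplified; the gather_rx_stats()/gather_tx_stats() helpers and the stats field names are assumptions):
	static void hinic_get_stats64(struct net_device *netdev,
				      struct rtnl_link_stats64 *stats)
	{
		struct hinic_dev *nic_dev = netdev_priv(netdev);
		struct hinic_rxq_stats rx_stats = {};
		struct hinic_txq_stats tx_stats = {};

		/* No down(&nic_dev->mgmt_lock) here anymore: the totals are
		 * accumulated into local variables, so nothing shared needs
		 * protection and the caller's spinlock is never held across
		 * a sleeping semaphore. */
		gather_rx_stats(nic_dev, &rx_stats);	/* assumed helpers */
		gather_tx_stats(nic_dev, &tx_stats);

		stats->rx_bytes   = rx_stats.bytes;
		stats->rx_packets = rx_stats.pkts;
		stats->tx_bytes   = tx_stats.bytes;
		stats->tx_packets = tx_stats.pkts;
	}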
| In the Linux kernel, the following vulnerability has been resolved:
drm/msm/mdp5: Fix global state lock backoff
We need to grab the lock after the early return for the !hwpipe case.
Otherwise, we could have hit contention yet still returned 0.
Fixes an issue that the new CONFIG_DRM_DEBUG_MODESET_LOCK stuff flagged
in CI:
WARNING: CPU: 0 PID: 282 at drivers/gpu/drm/drm_modeset_lock.c:296 drm_modeset_lock+0xf8/0x154
Modules linked in:
CPU: 0 PID: 282 Comm: kms_cursor_lega Tainted: G W 5.19.0-rc2-15930-g875cc8bc536a #1
Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : drm_modeset_lock+0xf8/0x154
lr : drm_atomic_get_private_obj_state+0x84/0x170
sp : ffff80000cfab6a0
x29: ffff80000cfab6a0 x28: 0000000000000000 x27: ffff000083bc4d00
x26: 0000000000000038 x25: 0000000000000000 x24: ffff80000957ca58
x23: 0000000000000000 x22: ffff000081ace080 x21: 0000000000000001
x20: ffff000081acec18 x19: ffff80000cfabb80 x18: 0000000000000038
x17: 0000000000000000 x16: 0000000000000000 x15: fffffffffffea0d0
x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 5f534b434f4c5f47
x11: ffff80000a386aa8 x10: 0000000000000029 x9 : ffff80000cfab610
x8 : 0000000000000029 x7 : 0000000000000014 x6 : 0000000000000000
x5 : 0000000000000001 x4 : ffff8000081ad904 x3 : 0000000000000029
x2 : ffff0000801db4c0 x1 : ffff80000cfabb80 x0 : ffff000081aceb58
Call trace:
drm_modeset_lock+0xf8/0x154
drm_atomic_get_private_obj_state+0x84/0x170
mdp5_get_global_state+0x54/0x6c
mdp5_pipe_release+0x2c/0xd4
mdp5_plane_atomic_check+0x2ec/0x414
drm_atomic_helper_check_planes+0xd8/0x210
drm_atomic_helper_check+0x54/0xb0
...
---[ end trace 0000000000000000 ]---
drm_modeset_lock attempting to lock a contended lock without backoff:
drm_modeset_lock+0x148/0x154
mdp5_get_global_state+0x30/0x6c
mdp5_pipe_release+0x2c/0xd4
mdp5_plane_atomic_check+0x290/0x414
drm_atomic_helper_check_planes+0xd8/0x210
drm_atomic_helper_check+0x54/0xb0
drm_atomic_check_only+0x4b0/0x8f4
drm_atomic_commit+0x68/0xe0
Patchwork: https://patchwork.freedesktop.org/patch/492701/ |
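A minimal sketch of the reordering described above (simplified; only the placement of the early return relative to the global-state lookup matters here):
	static int mdp5_pipe_release(struct drm_atomic_state *s,
				     struct mdp5_hw_pipe *hwpipe)
	{
		struct mdp5_global_state *state;

		/* Bail out before touching the global state object: grabbing it
		 * goes through drm_modeset_lock(), and returning 0 after having
		 * hit contention there is what triggered the backoff warning. */
		if (!hwpipe)
			return 0;

		state = mdp5_get_global_state(s);
		if (IS_ERR(state))
			return PTR_ERR(state);	/* propagate -EDEADLK for backoff */

		/* ... release the hw pipe ... */
		return 0;
	}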
| In the Linux kernel, the following vulnerability has been resolved:
net: hibmcge: fix rtnl deadlock issue
Currently, the hibmcge netdev acquires the rtnl_lock in
pci_error_handlers.reset_prepare() and releases it in
pci_error_handlers.reset_done().
However, in the PCI framework:
pci_reset_bus - __pci_reset_slot - pci_slot_save_and_disable_locked -
pci_dev_save_and_disable - err_handler->reset_prepare(dev);
In pci_slot_save_and_disable_locked():
list_for_each_entry(dev, &slot->bus->devices, bus_list) {
if (!dev->slot || dev->slot != slot)
continue;
pci_dev_save_and_disable(dev);
if (dev->subordinate)
pci_bus_save_and_disable_locked(dev->subordinate);
}
This will iterate through all devices under the current bus and execute
err_handler->reset_prepare(), causing two devices of the hibmcge driver
to sequentially request the rtnl_lock, leading to a deadlock.
Since the driver now executes netif_device_detach()
before the reset process, it will not run concurrently with
other netdev APIs, so there is no need to hold the rtnl_lock anymore.
Therefore, this patch removes the rtnl_lock during the reset process and
adjusts the position of HBG_NIC_STATE_RESETTING to ensure
that multiple resets are not executed concurrently. |
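A hedged sketch of the resulting reset hooks (the function names, hbg_priv and the priv->state bitmap are assumptions following the description above, not the actual driver code):
	static void hbg_pci_err_reset_prepare(struct pci_dev *pdev)
	{
		struct net_device *netdev = pci_get_drvdata(pdev);
		struct hbg_priv *priv = netdev_priv(netdev);

		/* No rtnl_lock() here: netif_device_detach() already fences off
		 * concurrent netdev ops, and serializing multiple resets is done
		 * with a driver-private state bit instead. */
		if (test_and_set_bit(HBG_NIC_STATE_RESETTING, &priv->state))
			return;

		netif_device_detach(netdev);
		/* ... save state, quiesce the hardware ... */
	}

	static void hbg_pci_err_reset_done(struct pci_dev *pdev)
	{
		struct net_device *netdev = pci_get_drvdata(pdev);
		struct hbg_priv *priv = netdev_priv(netdev);

		/* ... restore the hardware ... */
		netif_device_attach(netdev);
		clear_bit(HBG_NIC_STATE_RESETTING, &priv->state);
	}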
| In the Linux kernel, the following vulnerability has been resolved:
media: mt9m114: Fix deadlock in get_frame_interval/set_frame_interval
Getting / Setting the frame interval using the V4L2 subdev pad ops
get_frame_interval/set_frame_interval causes a deadlock, as the
subdev state is locked in [1] but also in the driver itself.
[2] describes that the caller is responsible for acquiring and
releasing the lock in this case. Therefore, acquiring the lock in the
driver is wrong.
Remove the lock acquisitions/releases from mt9m114_ifp_get_frame_interval()
and mt9m114_ifp_set_frame_interval().
[1] drivers/media/v4l2-core/v4l2-subdev.c - line 1129
[2] Documentation/driver-api/media/v4l2-subdev.rst |
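A minimal sketch of the resulting pad op, assuming the state passed in by the core is already locked as [1] and [2] describe (the ifp_to_mt9m114() helper and the frame_interval field are assumptions):
	static int mt9m114_ifp_get_frame_interval(struct v4l2_subdev *sd,
						  struct v4l2_subdev_state *state,
						  struct v4l2_subdev_frame_interval *fi)
	{
		struct mt9m114 *sensor = ifp_to_mt9m114(sd);	/* assumed helper */

		/* No lock/unlock of the subdev state here: the V4L2 core has
		 * already locked 'state' before invoking the pad op, so taking
		 * the lock again in the driver deadlocks. */
		fi->interval = sensor->ifp.frame_interval;	/* assumed field */

		return 0;
	}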
| In the Linux kernel, the following vulnerability has been resolved:
LoongArch: Optimize module load time by optimizing PLT/GOT counting
When enabling CONFIG_KASAN, CONFIG_PREEMPT_VOLUNTARY_BUILD and
CONFIG_PREEMPT_VOLUNTARY at the same time, there will be a soft deadlock;
the relevant logs are as follows:
rcu: INFO: rcu_sched self-detected stall on CPU
...
Call Trace:
[<900000000024f9e4>] show_stack+0x5c/0x180
[<90000000002482f4>] dump_stack_lvl+0x94/0xbc
[<9000000000224544>] rcu_dump_cpu_stacks+0x1fc/0x280
[<900000000037ac80>] rcu_sched_clock_irq+0x720/0xf88
[<9000000000396c34>] update_process_times+0xb4/0x150
[<90000000003b2474>] tick_nohz_handler+0xf4/0x250
[<9000000000397e28>] __hrtimer_run_queues+0x1d0/0x428
[<9000000000399b2c>] hrtimer_interrupt+0x214/0x538
[<9000000000253634>] constant_timer_interrupt+0x64/0x80
[<9000000000349938>] __handle_irq_event_percpu+0x78/0x1a0
[<9000000000349a78>] handle_irq_event_percpu+0x18/0x88
[<9000000000354c00>] handle_percpu_irq+0x90/0xf0
[<9000000000348c74>] handle_irq_desc+0x94/0xb8
[<9000000001012b28>] handle_cpu_irq+0x68/0xa0
[<9000000001def8c0>] handle_loongarch_irq+0x30/0x48
[<9000000001def958>] do_vint+0x80/0xd0
[<9000000000268a0c>] kasan_mem_to_shadow.part.0+0x2c/0x2a0
[<90000000006344f4>] __asan_load8+0x4c/0x120
[<900000000025c0d0>] module_frob_arch_sections+0x5c8/0x6b8
[<90000000003895f0>] load_module+0x9e0/0x2958
[<900000000038b770>] __do_sys_init_module+0x208/0x2d0
[<9000000001df0c34>] do_syscall+0x94/0x190
[<900000000024d6fc>] handle_syscall+0xbc/0x158
After analysis, this is because loading the amdgpu module is slow, which
occupies the CPU for a long time and then leads to the soft deadlock.
When loading a module, module_frob_arch_sections() tries to figure out
the number of PLTs/GOTs that will be needed to handle all the RELAs. It
calls count_max_entries() to find duplicates in unsorted data, a
counting algorithm with O(n^2) complexity.
To make it faster, we sort the relocation list by info and addend. That
way, to check for a duplicate relocation, it just needs to compare with
the previous entry. This reduces the complexity of the algorithm to O(n
log n), as done in commit d4e0340919fb ("arm64/module: Optimize module
load time by optimizing PLT counting"). This gives a significant reduction
in module load time for modules with a large number of relocations.
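A hedged sketch of the sort-then-scan counting (using the kernel's sort(); the real count_max_entries() also filters by relocation type, which this simplification omits):
	static int compare_rela(const void *x, const void *y)
	{
		const Elf_Rela *a = x, *b = y;

		if (a->r_info != b->r_info)
			return a->r_info < b->r_info ? -1 : 1;
		if (a->r_addend != b->r_addend)
			return a->r_addend < b->r_addend ? -1 : 1;
		return 0;
	}

	static unsigned int count_max_entries(Elf_Rela *relas, unsigned int num)
	{
		unsigned int i, count = 0;

		/* O(n log n) sort plus one linear pass instead of comparing every
		 * pair: an entry is only needed for the first occurrence of each
		 * (r_info, r_addend) combination. */
		sort(relas, num, sizeof(Elf_Rela), compare_rela, NULL);
		for (i = 0; i < num; i++)
			if (i == 0 || compare_rela(&relas[i - 1], &relas[i]) != 0)
				count++;

		return count;
	}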
After applying this patch, the soft deadlock problem has been solved,
and the kernel starts normally without "Call Trace".
Using the default configuration to test some modules, the results are as
follows:
Module Size
ip_tables 36K
fat 143K
radeon 2.5MB
amdgpu 16MB
Without this patch:
Module Module load time (ms) Count(PLTs/GOTs)
ip_tables 18 59/6
fat 0 162/14
radeon 54 1221/84
amdgpu 1411 4525/1098
With this patch:
Module Module load time (ms) Count(PLTs/GOTs)
ip_tables 18 59/6
fat 0 162/14
radeon 22 1221/84
amdgpu 45 4525/1098 |
| In the Linux kernel, the following vulnerability has been resolved:
bnxt_en: Fix lockdep warning during rmmod
The commit under the Fixes tag added a netdev_assert_locked() in
bnxt_free_ntp_fltrs(). The lock should be held during normal run-time
but the assert will be triggered (see below) during bnxt_remove_one()
which should not need the lock. The netdev is already unregistered by
then. Fix it by calling netdev_assert_locked_or_invisible() which will
not assert if the netdev is unregistered.
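For illustration, a sketch of the one-line change implied by the description (arguments and body simplified):
	static void bnxt_free_ntp_fltrs(struct bnxt *bp)
	{
		/* Holds at run-time; skipped when the netdev is already
		 * unregistered, as in the bnxt_remove_one() path below. */
		netdev_assert_locked_or_invisible(bp->dev);
		/* ... free the filter structures ... */
	}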
WARNING: CPU: 5 PID: 2241 at ./include/net/netdev_lock.h:17 bnxt_free_ntp_fltrs+0xf8/0x100 [bnxt_en]
Modules linked in: rpcrdma rdma_cm iw_cm ib_cm configfs ib_core bnxt_en(-) bridge stp llc x86_pkg_temp_thermal xfs tg3 [last unloaded: bnxt_re]
CPU: 5 UID: 0 PID: 2241 Comm: rmmod Tainted: G S W 6.16.0 #2 PREEMPT(voluntary)
Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
RIP: 0010:bnxt_free_ntp_fltrs+0xf8/0x100 [bnxt_en]
Code: 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 8b 47 60 be ff ff ff ff 48 8d b8 28 0c 00 00 e8 d0 cf 41 c3 85 c0 0f 85 2e ff ff ff <0f> 0b e9 27 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 0018:ffffa92082387da0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff9e5b593d8000 RCX: 0000000000000001
RDX: 0000000000000001 RSI: ffffffff83dc9a70 RDI: ffffffff83e1a1cf
RBP: ffff9e5b593d8c80 R08: 0000000000000000 R09: ffffffff8373a2b3
R10: 000000008100009f R11: 0000000000000001 R12: 0000000000000001
R13: ffffffffc01c4478 R14: dead000000000122 R15: dead000000000100
FS: 00007f3a8a52c740(0000) GS:ffff9e631ad1c000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055bb289419c8 CR3: 000000011274e001 CR4: 00000000003706f0
Call Trace:
<TASK>
bnxt_remove_one+0x57/0x180 [bnxt_en]
pci_device_remove+0x39/0xc0
device_release_driver_internal+0xa5/0x130
driver_detach+0x42/0x90
bus_remove_driver+0x61/0xc0
pci_unregister_driver+0x38/0x90
bnxt_exit+0xc/0x7d0 [bnxt_en] |
| In the Linux kernel, the following vulnerability has been resolved:
dm: dm-crypt: Do not partially accept write BIOs with zoned targets
Read and write operations issued to a dm-crypt target may be split
according to the dm-crypt internal limits defined by the max_read_size
and max_write_size module parameters (default is 128 KB). The intent is
to improve processing time of large BIOs by splitting them into smaller
operations that can be parallelized on different CPUs.
For zoned dm-crypt targets, this BIO splitting is still done but without
the parallel execution to ensure that the issuing order of write
operations to the underlying devices remains sequential. However, the
splitting itself causes other problems:
1) Since dm-crypt relies on the block layer zone write plugging to
handle zone append emulation using regular write operations, the
remainder of a split write BIO will always be plugged into the target
zone write plug. Once the on-going write BIO finishes, this
remainder BIO is unplugged and issued from the zone write plug work.
If this remainder BIO itself needs to be split, the remainder will be
re-issued and plugged again, but that causes a call to
blk_queue_enter(), which may block if a queue freeze operation was
initiated. This results in a deadlock as DM submission still holds
BIOs that the queue freeze side is waiting for.
2) dm-crypt relies on the emulation done by the block layer using
regular write operations for processing zone append operations. This
still requires to properly return the written sector as the BIO
sector of the original BIO. However, this can be done correctly if and
only if there is a single clone BIO used for processing the
original zone append operation issued by the user. If the size of a
zone append operation is larger than dm-crypt max_write_size, then
the original BIO will be split and processed as a chain of regular
write operations. Such chaining results in an incorrect written sector
being returned to the zone append issuer using the original BIO
sector. This in turn results in file system data corruption with
xfs or btrfs.
Fix this by modifying get_max_request_size() to always return the size
of the BIO to avoid it being split with dm_accept_partial_bio() in
crypt_map(). get_max_request_size() is renamed to
get_max_request_sectors() to clarify the unit of the value returned
and its interface is changed to take a struct dm_target pointer and a
pointer to the struct bio being processed. In addition to this change,
to ensure that crypt_alloc_buffer() works correctly, set the dm-crypt
device max_hw_sectors limit to be at most
BIO_MAX_VECS << PAGE_SECTORS_SHIFT (1 MB with a 4KB page architecture).
This forces DM core to split write BIOs before passing them to
crypt_map(), and thus guaranteeing that dm-crypt can always accept an
entire write BIO without needing to split it.
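A hedged sketch of the two pieces described above (crypt_target_is_zoned() and crypt_max_io_sectors() are hypothetical helpers standing in for the real checks and the max_read_size/max_write_size module-parameter plumbing):
	static unsigned int get_max_request_sectors(struct dm_target *ti,
						    struct bio *bio)
	{
		/* Zoned target: never split writes, so zone append emulation and
		 * zone write plugging see exactly one clone per original BIO and
		 * the written sector reported back to the issuer stays correct. */
		if (bio_data_dir(bio) == WRITE && crypt_target_is_zoned(ti))
			return bio_sectors(bio);

		return crypt_max_io_sectors(ti, bio);
	}

	/* And in the io_hints path, cap the limit so DM core itself splits
	 * oversized writes before crypt_map() ever sees them: */
	limits->max_hw_sectors = min_not_zero(limits->max_hw_sectors,
					      BIO_MAX_VECS << PAGE_SECTORS_SHIFT);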
This change does not have any effect on the read path of dm-crypt. Read
operations can still be split and the BIO fragments processed in
parallel. There is also no impact on the performance of the write path
given that all zone write BIOs were already processed inline instead of
in parallel.
This change also does not affect in any way regular dm-crypt block
devices. |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: avoid deadlock in fs reclaim with page writeback
Ext4 has a filesystem wide lock protecting ext4_writepages() calls to
avoid races with switching of journalled data flag or inode format. This
lock can however cause a deadlock like:
CPU0 CPU1
ext4_writepages()
percpu_down_read(sbi->s_writepages_rwsem);
ext4_change_inode_journal_flag()
percpu_down_write(sbi->s_writepages_rwsem);
- blocks, all readers block from now on
ext4_do_writepages()
ext4_init_io_end()
kmem_cache_zalloc(io_end_cachep, GFP_KERNEL)
fs_reclaim frees dentry...
dentry_unlink_inode()
iput() - last ref =>
iput_final() - inode dirty =>
write_inode_now()...
ext4_writepages() tries to acquire sbi->s_writepages_rwsem
and blocks forever
Make sure we cannot recurse into filesystem reclaim from writeback code
to avoid the deadlock. |
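One way to express the fix described above, assuming the memory-allocation-scope API (memalloc_nofs_save()/memalloc_nofs_restore()) is what forbids recursion into fs reclaim (a fragment-level sketch, not necessarily the exact patch):
	percpu_down_read(&sbi->s_writepages_rwsem);
	/* No fs reclaim while the rwsem is held: reclaim can reach
	 * write_inode_now() -> ext4_writepages(), which would block on
	 * the same rwsem once a writer is queued. */
	flags = memalloc_nofs_save();
	ret = ext4_do_writepages(&mpd);
	memalloc_nofs_restore(flags);
	percpu_up_read(&sbi->s_writepages_rwsem);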
| In the Linux kernel, the following vulnerability has been resolved:
e1000: Move cancel_work_sync to avoid deadlock
Previously, e1000_down called cancel_work_sync for the e1000 reset task
(via e1000_down_and_stop), which takes RTNL.
As reported by users and syzbot, a deadlock is possible in the following
scenario:
CPU 0:
- RTNL is held
- e1000_close
- e1000_down
- cancel_work_sync (cancel / wait for e1000_reset_task())
CPU 1:
- process_one_work
- e1000_reset_task
- take RTNL
To remedy this, avoid calling cancel_work_sync from e1000_down
(e1000_reset_task does nothing if the device is down anyway). Instead,
call cancel_work_sync for e1000_reset_task when the device is being
removed. |
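A minimal sketch of the move (simplified; only the placement of cancel_work_sync() matters):
	static void e1000_down_and_stop(struct e1000_adapter *adapter)
	{
		set_bit(__E1000_DOWN, &adapter->flags);

		/* No cancel_work_sync(&adapter->reset_task) here anymore:
		 * this path runs under RTNL and e1000_reset_task() also
		 * takes RTNL, so waiting for it from here can deadlock. */

		/* ... cancel the remaining (non-RTNL) delayed work ... */
	}

	static void e1000_remove(struct pci_dev *pdev)
	{
		struct net_device *netdev = pci_get_drvdata(pdev);
		struct e1000_adapter *adapter = netdev_priv(netdev);

		/* Safe here: the device is going away, and a reset task that
		 * still fires sees the adapter as down and returns early. */
		cancel_work_sync(&adapter->reset_task);

		/* ... unregister_netdev(), free resources ... */
	}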
| In the Linux kernel, the following vulnerability has been resolved:
smb: client: fix potential deadlock when reconnecting channels
Fix cifs_signal_cifsd_for_reconnect() to take the correct lock order
and prevent the following deadlock from happening
======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc3-build2+ #1301 Tainted: G S W
------------------------------------------------------
cifsd/6055 is trying to acquire lock:
ffff88810ad56038 (&tcp_ses->srv_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x134/0x200
but task is already holding lock:
ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&ret_buf->chan_lock){+.+.}-{3:3}:
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_setup_session+0x81/0x4b0
cifs_get_smb_ses+0x771/0x900
cifs_mount_get_session+0x7e/0x170
cifs_mount+0x92/0x2d0
cifs_smb3_do_mount+0x161/0x460
smb3_get_tree+0x55/0x90
vfs_get_tree+0x46/0x180
do_new_mount+0x1b0/0x2e0
path_mount+0x6ee/0x740
do_mount+0x98/0xe0
__do_sys_mount+0x148/0x180
do_syscall_64+0xa4/0x260
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #1 (&ret_buf->ses_lock){+.+.}-{3:3}:
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_match_super+0x101/0x320
sget+0xab/0x270
cifs_smb3_do_mount+0x1e0/0x460
smb3_get_tree+0x55/0x90
vfs_get_tree+0x46/0x180
do_new_mount+0x1b0/0x2e0
path_mount+0x6ee/0x740
do_mount+0x98/0xe0
__do_sys_mount+0x148/0x180
do_syscall_64+0xa4/0x260
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #0 (&tcp_ses->srv_lock){+.+.}-{3:3}:
check_noncircular+0x95/0xc0
check_prev_add+0x115/0x2f0
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_signal_cifsd_for_reconnect+0x134/0x200
__cifs_reconnect+0x8f/0x500
cifs_handle_standard+0x112/0x280
cifs_demultiplex_thread+0x64d/0xbc0
kthread+0x2f7/0x310
ret_from_fork+0x2a/0x230
ret_from_fork_asm+0x1a/0x30
other info that might help us debug this:
Chain exists of:
&tcp_ses->srv_lock --> &ret_buf->ses_lock --> &ret_buf->chan_lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&ret_buf->chan_lock);
lock(&ret_buf->ses_lock);
lock(&ret_buf->chan_lock);
lock(&tcp_ses->srv_lock);
*** DEADLOCK ***
3 locks held by cifsd/6055:
#0: ffffffff857de398 (&cifs_tcp_ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x7b/0x200
#1: ffff888119c64060 (&ret_buf->ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x9c/0x200
#2: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200 |
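A hedged sketch of the nesting order the fix establishes, as implied by the dependency chain above (srv_lock outermost of the three, then ses_lock, then chan_lock; the loop structure is a simplification, not the actual cifs_signal_cifsd_for_reconnect() body):
	spin_lock(&cifs_tcp_ses_lock);
	list_for_each_entry(server, &cifs_tcp_ses_list, tcp_ses_list) {
		spin_lock(&server->srv_lock);		/* take before ses/chan locks */
		list_for_each_entry(ses, &server->smb_ses_list, smb_ses_list) {
			spin_lock(&ses->ses_lock);
			spin_lock(&ses->chan_lock);	/* innermost */
			/* ... mark each channel's server for reconnect ... */
			spin_unlock(&ses->chan_lock);
			spin_unlock(&ses->ses_lock);
		}
		spin_unlock(&server->srv_lock);
	}
	spin_unlock(&cifs_tcp_ses_lock);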