| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: fix race in mptcp_pm_nl_flush_addrs_doit()
syzbot and Eulgyu Kim reported crashes in mptcp_pm_nl_get_local_id()
and/or mptcp_pm_nl_is_backup()
Root cause is list_splice_init() in mptcp_pm_nl_flush_addrs_doit()
which is not RCU ready.
list_splice_init_rcu() can not be called here while holding pernet->lock
spinlock.
Many thanks to Eulgyu Kim for providing a repro and testing our patches. |
| In the Linux kernel, the following vulnerability has been resolved:
mm/hugetlb: fix hugetlb_pmd_shared()
Patch series "mm/hugetlb: fixes for PMD table sharing (incl. using
mmu_gather)", v3.
One functional fix, one performance regression fix, and two related
comment fixes.
I cleaned up my prototype I recently shared [1] for the performance fix,
deferring most of the cleanups I had in the prototype to a later point.
While doing that I identified the other things.
The goal of this patch set is to be backported to stable trees "fairly"
easily. At least patch #1 and #4.
Patch #1 fixes hugetlb_pmd_shared() not detecting any sharing
Patch #2 + #3 are simple comment fixes that patch #4 interacts with.
Patch #4 is a fix for the reported performance regression due to excessive
IPI broadcasts during fork()+exit().
The last patch is all about TLB flushes, IPIs and mmu_gather.
Read: complicated
There are plenty of cleanups in the future to be had + one reasonable
optimization on x86. But that's all out of scope for this series.
Runtime tested, with a focus on fixing the performance regression using
the original reproducer [2] on x86.
This patch (of 4):
We switched from (wrongly) using the page count to an independent shared
count. Now, shared page tables have a refcount of 1 (excluding
speculative references) and instead use ptdesc->pt_share_count to identify
sharing.
We didn't convert hugetlb_pmd_shared(), so right now, we would never
detect a shared PMD table as such, because sharing/unsharing no longer
touches the refcount of a PMD table.
Page migration, like mbind() or migrate_pages() would allow for migrating
folios mapped into such shared PMD tables, even though the folios are not
exclusive. In smaps we would account them as "private" although they are
"shared", and we would be wrongly setting the PM_MMAP_EXCLUSIVE in the
pagemap interface.
Fix it by properly using ptdesc_pmd_is_shared() in hugetlb_pmd_shared(). |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: ensure context reset on disconnect()
After the blamed commit below, if the MPC subflow is already in TCP_CLOSE
status or has fallback to TCP at mptcp_disconnect() time,
mptcp_do_fastclose() skips setting the `send_fastclose flag` and the later
__mptcp_close_ssk() does not reset anymore the related subflow context.
Any later connection will be created with both the `request_mptcp` flag
and the msk-level fallback status off (it is unconditionally cleared at
MPTCP disconnect time), leading to a warning in subflow_data_ready():
WARNING: CPU: 26 PID: 8996 at net/mptcp/subflow.c:1519 subflow_data_ready (net/mptcp/subflow.c:1519 (discriminator 13))
Modules linked in:
CPU: 26 UID: 0 PID: 8996 Comm: syz.22.39 Not tainted 6.18.0-rc7-05427-g11fc074f6c36 #1 PREEMPT(voluntary)
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
RIP: 0010:subflow_data_ready (net/mptcp/subflow.c:1519 (discriminator 13))
Code: 90 0f 0b 90 90 e9 04 fe ff ff e8 b7 1e f5 fe 89 ee bf 07 00 00 00 e8 db 19 f5 fe 83 fd 07 0f 84 35 ff ff ff e8 9d 1e f5 fe 90 <0f> 0b 90 e9 27 ff ff ff e8 8f 1e f5 fe 4c 89 e7 48 89 de e8 14 09
RSP: 0018:ffffc9002646fb30 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88813b218000 RCX: ffffffff825c8435
RDX: ffff8881300b3580 RSI: ffffffff825c8443 RDI: 0000000000000005
RBP: 000000000000000b R08: ffffffff825c8435 R09: 000000000000000b
R10: 0000000000000005 R11: 0000000000000007 R12: ffff888131ac0000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f88330af6c0(0000) GS:ffff888a93dd2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f88330aefe8 CR3: 000000010ff59000 CR4: 0000000000350ef0
Call Trace:
<TASK>
tcp_data_ready (net/ipv4/tcp_input.c:5356)
tcp_data_queue (net/ipv4/tcp_input.c:5445)
tcp_rcv_state_process (net/ipv4/tcp_input.c:7165)
tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1955)
__release_sock (include/net/sock.h:1158 (discriminator 6) net/core/sock.c:3180 (discriminator 6))
release_sock (net/core/sock.c:3737)
mptcp_sendmsg (net/mptcp/protocol.c:1763 net/mptcp/protocol.c:1857)
inet_sendmsg (net/ipv4/af_inet.c:853 (discriminator 7))
__sys_sendto (net/socket.c:727 (discriminator 15) net/socket.c:742 (discriminator 15) net/socket.c:2244 (discriminator 15))
__x64_sys_sendto (net/socket.c:2247)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7f883326702d
Address the issue setting an explicit `fastclosing` flag at fastclose
time, and checking such flag after mptcp_do_fastclose(). |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix racy bitfield write in btrfs_clear_space_info_full()
From the memory-barriers.txt document regarding memory barrier ordering
guarantees:
(*) These guarantees do not apply to bitfields, because compilers often
generate code to modify these using non-atomic read-modify-write
sequences. Do not attempt to use bitfields to synchronize parallel
algorithms.
(*) Even in cases where bitfields are protected by locks, all fields
in a given bitfield must be protected by one lock. If two fields
in a given bitfield are protected by different locks, the compiler's
non-atomic read-modify-write sequences can cause an update to one
field to corrupt the value of an adjacent field.
btrfs_space_info has a bitfield sharing an underlying word consisting of
the fields full, chunk_alloc, and flush:
struct btrfs_space_info {
struct btrfs_fs_info * fs_info; /* 0 8 */
struct btrfs_space_info * parent; /* 8 8 */
...
int clamp; /* 172 4 */
unsigned int full:1; /* 176: 0 4 */
unsigned int chunk_alloc:1; /* 176: 1 4 */
unsigned int flush:1; /* 176: 2 4 */
...
Therefore, to be safe from parallel read-modify-writes losing a write to
one of the bitfield members protected by a lock, all writes to all the
bitfields must use the lock. They almost universally do, except for
btrfs_clear_space_info_full() which iterates over the space_infos and
writes out found->full = 0 without a lock.
Imagine that we have one thread completing a transaction in which we
finished deleting a block_group and are thus calling
btrfs_clear_space_info_full() while simultaneously the data reclaim
ticket infrastructure is running do_async_reclaim_data_space():
T1 T2
btrfs_commit_transaction
btrfs_clear_space_info_full
data_sinfo->full = 0
READ: full:0, chunk_alloc:0, flush:1
do_async_reclaim_data_space(data_sinfo)
spin_lock(&space_info->lock);
if(list_empty(tickets))
space_info->flush = 0;
READ: full: 0, chunk_alloc:0, flush:1
MOD/WRITE: full: 0, chunk_alloc:0, flush:0
spin_unlock(&space_info->lock);
return;
MOD/WRITE: full:0, chunk_alloc:0, flush:1
and now data_sinfo->flush is 1 but the reclaim worker has exited. This
breaks the invariant that flush is 0 iff there is no work queued or
running. Once this invariant is violated, future allocations that go
into __reserve_bytes() will add tickets to space_info->tickets but will
see space_info->flush is set to 1 and not queue the work. After this,
they will block forever on the resulting ticket, as it is now impossible
to kick the worker again.
I also confirmed by looking at the assembly of the affected kernel that
it is doing RMW operations. For example, to set the flush (3rd) bit to 0,
the assembly is:
andb $0xfb,0x60(%rbx)
and similarly for setting the full (1st) bit to 0:
andb $0xfe,-0x20(%rax)
So I think this is really a bug on practical systems. I have observed
a number of systems in this exact state, but am currently unable to
reproduce it.
Rather than leaving this footgun lying around for the future, take
advantage of the fact that there is room in the struct anyway, and that
it is already quite large and simply change the three bitfield members to
bools. This avoids writes to space_info->full having any effect on
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
ACPI: APEI: send SIGBUS to current task if synchronous memory error not recovered
If a synchronous error is detected as a result of user-space process
triggering a 2-bit uncorrected error, the CPU will take a synchronous
error exception such as Synchronous External Abort (SEA) on Arm64. The
kernel will queue a memory_failure() work which poisons the related
page, unmaps the page, and then sends a SIGBUS to the process, so that
a system wide panic can be avoided.
However, no memory_failure() work will be queued when abnormal
synchronous errors occur. These errors can include situations like
invalid PA, unexpected severity, no memory failure config support,
invalid GUID section, etc. In such a case, the user-space process will
trigger SEA again. This loop can potentially exceed the platform
firmware threshold or even trigger a kernel hard lockup, leading to a
system reboot.
Fix it by performing a force kill if no memory_failure() work is queued
for synchronous errors.
[ rjw: Changelog edits ] |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: cfg80211: Add missing lock in cfg80211_check_and_end_cac()
Callers of wdev_chandef() must hold the wiphy mutex.
But the worker cfg80211_propagate_cac_done_wk() never takes the lock.
Which triggers the warning below with the mesh_peer_connected_dfs
test from hostapd and not (yet) released mac80211 code changes:
WARNING: CPU: 0 PID: 495 at net/wireless/chan.c:1552 wdev_chandef+0x60/0x165
Modules linked in:
CPU: 0 UID: 0 PID: 495 Comm: kworker/u4:2 Not tainted 6.14.0-rc5-wt-g03960e6f9d47 #33 13c287eeabfe1efea01c0bcc863723ab082e17cf
Workqueue: cfg80211 cfg80211_propagate_cac_done_wk
Stack:
00000000 00000001 ffffff00 6093267c
00000000 6002ec30 6d577c50 60037608
00000000 67e8d108 6063717b 00000000
Call Trace:
[<6002ec30>] ? _printk+0x0/0x98
[<6003c2b3>] show_stack+0x10e/0x11a
[<6002ec30>] ? _printk+0x0/0x98
[<60037608>] dump_stack_lvl+0x71/0xb8
[<6063717b>] ? wdev_chandef+0x60/0x165
[<6003766d>] dump_stack+0x1e/0x20
[<6005d1b7>] __warn+0x101/0x20f
[<6005d3a8>] warn_slowpath_fmt+0xe3/0x15d
[<600b0c5c>] ? mark_lock.part.0+0x0/0x4ec
[<60751191>] ? __this_cpu_preempt_check+0x0/0x16
[<600b11a2>] ? mark_held_locks+0x5a/0x6e
[<6005d2c5>] ? warn_slowpath_fmt+0x0/0x15d
[<60052e53>] ? unblock_signals+0x3a/0xe7
[<60052f2d>] ? um_set_signals+0x2d/0x43
[<60751191>] ? __this_cpu_preempt_check+0x0/0x16
[<607508b2>] ? lock_is_held_type+0x207/0x21f
[<6063717b>] wdev_chandef+0x60/0x165
[<605f89b4>] regulatory_propagate_dfs_state+0x247/0x43f
[<60052f00>] ? um_set_signals+0x0/0x43
[<605e6bfd>] cfg80211_propagate_cac_done_wk+0x3a/0x4a
[<6007e460>] process_scheduled_works+0x3bc/0x60e
[<6007d0ec>] ? move_linked_works+0x4d/0x81
[<6007d120>] ? assign_work+0x0/0xaa
[<6007f81f>] worker_thread+0x220/0x2dc
[<600786ef>] ? set_pf_worker+0x0/0x57
[<60087c96>] ? to_kthread+0x0/0x43
[<6008ab3c>] kthread+0x2d3/0x2e2
[<6007f5ff>] ? worker_thread+0x0/0x2dc
[<6006c05b>] ? calculate_sigpending+0x0/0x56
[<6003b37d>] new_thread_handler+0x4a/0x64
irq event stamp: 614611
hardirqs last enabled at (614621): [<00000000600bc96b>] __up_console_sem+0x82/0xaf
hardirqs last disabled at (614630): [<00000000600bc92c>] __up_console_sem+0x43/0xaf
softirqs last enabled at (614268): [<00000000606c55c6>] __ieee80211_wake_queue+0x933/0x985
softirqs last disabled at (614266): [<00000000606c52d6>] __ieee80211_wake_queue+0x643/0x985 |
| In the Linux kernel, the following vulnerability has been resolved:
xsk: Fix race condition in AF_XDP generic RX path
Move rx_lock from xsk_socket to xsk_buff_pool.
Fix synchronization for shared umem mode in
generic RX path where multiple sockets share
single xsk_buff_pool.
RX queue is exclusive to xsk_socket, while FILL
queue can be shared between multiple sockets.
This could result in race condition where two
CPU cores access RX path of two different sockets
sharing the same umem.
Protect both queues by acquiring spinlock in shared
xsk_buff_pool.
Lock contention may be minimized in the future by some
per-thread FQ buffering.
It's safe and necessary to move spin_lock_bh(rx_lock)
after xsk_rcv_check():
* xs->pool and spinlock_init is synchronized by
xsk_bind() -> xsk_is_bound() memory barriers.
* xsk_rcv_check() may return true at the moment
of xsk_release() or xsk_unbind_dev(),
however this will not cause any data races or
race conditions. xsk_unbind_dev() removes xdp
socket from all maps and waits for completion
of all outstanding rx operations. Packets in
RX path will either complete safely or drop. |
| In the Linux kernel, the following vulnerability has been resolved:
net: dsa: free routing table on probe failure
If complete = true in dsa_tree_setup(), it means that we are the last
switch of the tree which is successfully probing, and we should be
setting up all switches from our probe path.
After "complete" becomes true, dsa_tree_setup_cpu_ports() or any
subsequent function may fail. If that happens, the entire tree setup is
in limbo: the first N-1 switches have successfully finished probing
(doing nothing but having allocated persistent memory in the tree's
dst->ports, and maybe dst->rtable), and switch N failed to probe, ending
the tree setup process before anything is tangible from the user's PoV.
If switch N fails to probe, its memory (ports) will be freed and removed
from dst->ports. However, the dst->rtable elements pointing to its ports,
as created by dsa_link_touch(), will remain there, and will lead to
use-after-free if dereferenced.
If dsa_tree_setup_switches() returns -EPROBE_DEFER, which is entirely
possible because that is where ds->ops->setup() is, we get a kasan
report like this:
==================================================================
BUG: KASAN: slab-use-after-free in mv88e6xxx_setup_upstream_port+0x240/0x568
Read of size 8 at addr ffff000004f56020 by task kworker/u8:3/42
Call trace:
__asan_report_load8_noabort+0x20/0x30
mv88e6xxx_setup_upstream_port+0x240/0x568
mv88e6xxx_setup+0xebc/0x1eb0
dsa_register_switch+0x1af4/0x2ae0
mv88e6xxx_register_switch+0x1b8/0x2a8
mv88e6xxx_probe+0xc4c/0xf60
mdio_probe+0x78/0xb8
really_probe+0x2b8/0x5a8
__driver_probe_device+0x164/0x298
driver_probe_device+0x78/0x258
__device_attach_driver+0x274/0x350
Allocated by task 42:
__kasan_kmalloc+0x84/0xa0
__kmalloc_cache_noprof+0x298/0x490
dsa_switch_touch_ports+0x174/0x3d8
dsa_register_switch+0x800/0x2ae0
mv88e6xxx_register_switch+0x1b8/0x2a8
mv88e6xxx_probe+0xc4c/0xf60
mdio_probe+0x78/0xb8
really_probe+0x2b8/0x5a8
__driver_probe_device+0x164/0x298
driver_probe_device+0x78/0x258
__device_attach_driver+0x274/0x350
Freed by task 42:
__kasan_slab_free+0x48/0x68
kfree+0x138/0x418
dsa_register_switch+0x2694/0x2ae0
mv88e6xxx_register_switch+0x1b8/0x2a8
mv88e6xxx_probe+0xc4c/0xf60
mdio_probe+0x78/0xb8
really_probe+0x2b8/0x5a8
__driver_probe_device+0x164/0x298
driver_probe_device+0x78/0x258
__device_attach_driver+0x274/0x350
The simplest way to fix the bug is to delete the routing table in its
entirety. dsa_tree_setup_routing_table() has no problem in regenerating
it even if we deleted links between ports other than those of switch N,
because dsa_link_touch() first checks whether the port pair already
exists in dst->rtable, allocating if not.
The deletion of the routing table in its entirety already exists in
dsa_tree_teardown(), so refactor that into a function that can also be
called from the tree setup error path.
In my analysis of the commit to blame, it is the one which added
dsa_link elements to dst->rtable. Prior to that, each switch had its own
ds->rtable which is freed when the switch fails to probe. But the tree
is potentially persistent memory. |
| In the Linux kernel, the following vulnerability has been resolved:
net: stmmac: Fix accessing freed irq affinity_hint
In stmmac_request_irq_multi_msi(), a pointer to the stack variable
cpu_mask is passed to irq_set_affinity_hint(). This value is stored in
irq_desc->affinity_hint, but once stmmac_request_irq_multi_msi()
returns, the pointer becomes dangling.
The affinity_hint is exposed via procfs with S_IRUGO permissions,
allowing any unprivileged process to read it. Accessing this stale
pointer can lead to:
- a kernel oops or panic if the referenced memory has been released and
unmapped, or
- leakage of kernel data into userspace if the memory is re-used for
other purposes.
All platforms that use stmmac with PCI MSI (Intel, Loongson, etc) are
affected. |
| In the Linux kernel, the following vulnerability has been resolved:
nfsd: don't ignore the return code of svc_proc_register()
Currently, nfsd_proc_stat_init() ignores the return value of
svc_proc_register(). If the procfile creation fails, then the kernel
will WARN when it tries to remove the entry later.
Fix nfsd_proc_stat_init() to return the same type of pointer as
svc_proc_register(), and fix up nfsd_net_init() to check that and fail
the nfsd_net construction if it occurs.
svc_proc_register() can fail if the dentry can't be allocated, or if an
identical dentry already exists. The second case is pretty unlikely in
the nfsd_net construction codepath, so if this happens, return -ENOMEM. |
| In the Linux kernel, the following vulnerability has been resolved:
ksmbd: set ATTR_CTIME flags when setting mtime
David reported that the new warning from setattr_copy_mgtime is coming
like the following.
[ 113.215316] ------------[ cut here ]------------
[ 113.215974] WARNING: CPU: 1 PID: 31 at fs/attr.c:300 setattr_copy+0x1ee/0x200
[ 113.219192] CPU: 1 UID: 0 PID: 31 Comm: kworker/1:1 Not tainted 6.13.0-rc1+ #234
[ 113.220127] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
[ 113.221530] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd]
[ 113.222220] RIP: 0010:setattr_copy+0x1ee/0x200
[ 113.222833] Code: 24 28 49 8b 44 24 30 48 89 53 58 89 43 6c 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 48 89 df e8 77 d6 ff ff e9 cd fe ff ff <0f> 0b e9 be fe ff ff 66 0
[ 113.225110] RSP: 0018:ffffaf218010fb68 EFLAGS: 00010202
[ 113.225765] RAX: 0000000000000120 RBX: ffffa446815f8568 RCX: 0000000000000003
[ 113.226667] RDX: ffffaf218010fd38 RSI: ffffa446815f8568 RDI: ffffffff94eb03a0
[ 113.227531] RBP: ffffaf218010fb90 R08: 0000001a251e217d R09: 00000000675259fa
[ 113.228426] R10: 0000000002ba8a6d R11: ffffa4468196c7a8 R12: ffffaf218010fd38
[ 113.229304] R13: 0000000000000120 R14: ffffffff94eb03a0 R15: 0000000000000000
[ 113.230210] FS: 0000000000000000(0000) GS:ffffa44739d00000(0000) knlGS:0000000000000000
[ 113.231215] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 113.232055] CR2: 00007efe0053d27e CR3: 000000000331a000 CR4: 00000000000006b0
[ 113.232926] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 113.233812] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 113.234797] Call Trace:
[ 113.235116] <TASK>
[ 113.235393] ? __warn+0x73/0xd0
[ 113.235802] ? setattr_copy+0x1ee/0x200
[ 113.236299] ? report_bug+0xf3/0x1e0
[ 113.236757] ? handle_bug+0x4d/0x90
[ 113.237202] ? exc_invalid_op+0x13/0x60
[ 113.237689] ? asm_exc_invalid_op+0x16/0x20
[ 113.238185] ? setattr_copy+0x1ee/0x200
[ 113.238692] btrfs_setattr+0x80/0x820 [btrfs]
[ 113.239285] ? get_stack_info_noinstr+0x12/0xf0
[ 113.239857] ? __module_address+0x22/0xa0
[ 113.240368] ? handle_ksmbd_work+0x6e/0x460 [ksmbd]
[ 113.240993] ? __module_text_address+0x9/0x50
[ 113.241545] ? __module_address+0x22/0xa0
[ 113.242033] ? unwind_next_frame+0x10e/0x920
[ 113.242600] ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 113.243268] notify_change+0x2c2/0x4e0
[ 113.243746] ? stack_depot_save_flags+0x27/0x730
[ 113.244339] ? set_file_basic_info+0x130/0x2b0 [ksmbd]
[ 113.244993] set_file_basic_info+0x130/0x2b0 [ksmbd]
[ 113.245613] ? process_scheduled_works+0xbe/0x310
[ 113.246181] ? worker_thread+0x100/0x240
[ 113.246696] ? kthread+0xc8/0x100
[ 113.247126] ? ret_from_fork+0x2b/0x40
[ 113.247606] ? ret_from_fork_asm+0x1a/0x30
[ 113.248132] smb2_set_info+0x63f/0xa70 [ksmbd]
ksmbd is trying to set the atime and mtime via notify_change without also
setting the ctime. so This patch add ATTR_CTIME flags when setting mtime
to avoid a warning. |
| In the Linux kernel, the following vulnerability has been resolved:
smb: client: set correct id, uid and cruid for multiuser automounts
When uid, gid and cruid are not specified, we need to dynamically
set them into the filesystem context used for automounting otherwise
they'll end up reusing the values from the parent mount. |
| In the Linux kernel, the following vulnerability has been resolved:
clk: mediatek: fix of_iomap memory leak
Smatch reports:
drivers/clk/mediatek/clk-mtk.c:583 mtk_clk_simple_probe() warn:
'base' from of_iomap() not released on lines: 496.
This problem was also found in linux-next. In mtk_clk_simple_probe(),
base is not released when handling errors
if clk_data is not existed, which may cause a leak.
So free_base should be added here to release base. |
| In the Linux kernel, the following vulnerability has been resolved:
fs: dlm: fix invalid derefence of sb_lvbptr
I experience issues when putting a lkbsb on the stack and have sb_lvbptr
field to a dangled pointer while not using DLM_LKF_VALBLK. It will crash
with the following kernel message, the dangled pointer is here
0xdeadbeef as example:
[ 102.749317] BUG: unable to handle page fault for address: 00000000deadbeef
[ 102.749320] #PF: supervisor read access in kernel mode
[ 102.749323] #PF: error_code(0x0000) - not-present page
[ 102.749325] PGD 0 P4D 0
[ 102.749332] Oops: 0000 [#1] PREEMPT SMP PTI
[ 102.749336] CPU: 0 PID: 1567 Comm: lock_torture_wr Tainted: G W 5.19.0-rc3+ #1565
[ 102.749343] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-2.module+el8.7.0+15506+033991b0 04/01/2014
[ 102.749344] RIP: 0010:memcpy_erms+0x6/0x10
[ 102.749353] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
[ 102.749355] RSP: 0018:ffff97a58145fd08 EFLAGS: 00010202
[ 102.749358] RAX: ffff901778b77070 RBX: 0000000000000000 RCX: 0000000000000040
[ 102.749360] RDX: 0000000000000040 RSI: 00000000deadbeef RDI: ffff901778b77070
[ 102.749362] RBP: ffff97a58145fd10 R08: ffff901760b67a70 R09: 0000000000000001
[ 102.749364] R10: ffff9017008e2cb8 R11: 0000000000000001 R12: ffff901760b67a70
[ 102.749366] R13: ffff901760b78f00 R14: 0000000000000003 R15: 0000000000000001
[ 102.749368] FS: 0000000000000000(0000) GS:ffff901876e00000(0000) knlGS:0000000000000000
[ 102.749372] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 102.749374] CR2: 00000000deadbeef CR3: 000000017c49a004 CR4: 0000000000770ef0
[ 102.749376] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 102.749378] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 102.749379] PKRU: 55555554
[ 102.749381] Call Trace:
[ 102.749382] <TASK>
[ 102.749383] ? send_args+0xb2/0xd0
[ 102.749389] send_common+0xb7/0xd0
[ 102.749395] _unlock_lock+0x2c/0x90
[ 102.749400] unlock_lock.isra.56+0x62/0xa0
[ 102.749405] dlm_unlock+0x21e/0x330
[ 102.749411] ? lock_torture_stats+0x80/0x80 [dlm_locktorture]
[ 102.749416] torture_unlock+0x5a/0x90 [dlm_locktorture]
[ 102.749419] ? preempt_count_sub+0xba/0x100
[ 102.749427] lock_torture_writer+0xbd/0x150 [dlm_locktorture]
[ 102.786186] kthread+0x10a/0x130
[ 102.786581] ? kthread_complete_and_exit+0x20/0x20
[ 102.787156] ret_from_fork+0x22/0x30
[ 102.787588] </TASK>
[ 102.787855] Modules linked in: dlm_locktorture torture rpcsec_gss_krb5 intel_rapl_msr intel_rapl_common kvm_intel iTCO_wdt iTCO_vendor_support kvm vmw_vsock_virtio_transport qxl irqbypass vmw_vsock_virtio_transport_common drm_ttm_helper crc32_pclmul joydev crc32c_intel ttm vsock virtio_scsi virtio_balloon snd_pcm drm_kms_helper virtio_console snd_timer snd drm soundcore syscopyarea i2c_i801 sysfillrect sysimgblt i2c_smbus pcspkr fb_sys_fops lpc_ich serio_raw
[ 102.792536] CR2: 00000000deadbeef
[ 102.792930] ---[ end trace 0000000000000000 ]---
This patch fixes the issue by checking also on DLM_LKF_VALBLK on exflags
is set when copying the lvbptr array instead of if it's just null which
fixes for me the issue.
I think this patch can fix other dlm users as well, depending how they
handle the init, freeing memory handling of sb_lvbptr and don't set
DLM_LKF_VALBLK for some dlm_lock() calls. It might a there could be a
hidden issue all the time. However with checking on DLM_LKF_VALBLK the
user always need to provide a sb_lvbptr non-null value. There might be
more intelligent handling between per ls lvblen, DLM_LKF_VALBLK and
non-null to report the user the way how DLM API is used is wrong but can
be added for later, this will only fix the current behaviour. |
| In the Linux kernel, the following vulnerability has been resolved:
mm, swap: restore swap_space attr aviod kernel panic
commit 8b47299a411a ("mm, swap: mark swap address space ro and add context
debug check") made the swap address space read-only. It may lead to
kernel panic if arch_prepare_to_swap returns a failure under heavy memory
pressure as follows,
el1_abort+0x40/0x64
el1h_64_sync_handler+0x48/0xcc
el1h_64_sync+0x84/0x88
errseq_set+0x4c/0xb8 (P)
__filemap_set_wb_err+0x20/0xd0
shrink_folio_list+0xc20/0x11cc
evict_folios+0x1520/0x1be4
try_to_shrink_lruvec+0x27c/0x3dc
shrink_one+0x9c/0x228
shrink_node+0xb3c/0xeac
do_try_to_free_pages+0x170/0x4f0
try_to_free_pages+0x334/0x534
__alloc_pages_direct_reclaim+0x90/0x158
__alloc_pages_slowpath+0x334/0x588
__alloc_frozen_pages_noprof+0x224/0x2fc
__folio_alloc_noprof+0x14/0x64
vma_alloc_zeroed_movable_folio+0x34/0x44
do_pte_missing+0xad4/0x1040
handle_mm_fault+0x4a4/0x790
do_page_fault+0x288/0x5f8
do_translation_fault+0x38/0x54
do_mem_abort+0x54/0xa8
Restore swap address space as not ro to avoid the panic. |
| In the Linux kernel, the following vulnerability has been resolved:
bonding: annotate data-races around slave->last_rx
slave->last_rx and slave->target_last_arp_rx[...] can be read and written
locklessly. Add READ_ONCE() and WRITE_ONCE() annotations.
syzbot reported:
BUG: KCSAN: data-race in bond_rcv_validate / bond_rcv_validate
write to 0xffff888149f0d428 of 8 bytes by interrupt on cpu 1:
bond_rcv_validate+0x202/0x7a0 drivers/net/bonding/bond_main.c:3335
bond_handle_frame+0xde/0x5e0 drivers/net/bonding/bond_main.c:1533
__netif_receive_skb_core+0x5b1/0x1950 net/core/dev.c:6039
__netif_receive_skb_one_core net/core/dev.c:6150 [inline]
__netif_receive_skb+0x59/0x270 net/core/dev.c:6265
netif_receive_skb_internal net/core/dev.c:6351 [inline]
netif_receive_skb+0x4b/0x2d0 net/core/dev.c:6410
...
write to 0xffff888149f0d428 of 8 bytes by interrupt on cpu 0:
bond_rcv_validate+0x202/0x7a0 drivers/net/bonding/bond_main.c:3335
bond_handle_frame+0xde/0x5e0 drivers/net/bonding/bond_main.c:1533
__netif_receive_skb_core+0x5b1/0x1950 net/core/dev.c:6039
__netif_receive_skb_one_core net/core/dev.c:6150 [inline]
__netif_receive_skb+0x59/0x270 net/core/dev.c:6265
netif_receive_skb_internal net/core/dev.c:6351 [inline]
netif_receive_skb+0x4b/0x2d0 net/core/dev.c:6410
br_netif_receive_skb net/bridge/br_input.c:30 [inline]
NF_HOOK include/linux/netfilter.h:318 [inline]
...
value changed: 0x0000000100005365 -> 0x0000000100005366 |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: iwlwifi: Implement settime64 as stub for MVM/MLD PTP
Since commit dfb073d32cac ("ptp: Return -EINVAL on ptp_clock_register if
required ops are NULL"), PTP clock registered through ptp_clock_register
is required to have ptp_clock_info.settime64 set, however, neither MVM
nor MLD's PTP clock implementation sets it, resulting in warnings when
the interface starts up, like
WARNING: drivers/ptp/ptp_clock.c:325 at ptp_clock_register+0x2c8/0x6b8, CPU#1: wpa_supplicant/469
CPU: 1 UID: 0 PID: 469 Comm: wpa_supplicant Not tainted 6.18.0+ #101 PREEMPT(full)
ra: ffff800002732cd4 iwl_mvm_ptp_init+0x114/0x188 [iwlmvm]
ERA: 9000000002fdc468 ptp_clock_register+0x2c8/0x6b8
iwlwifi 0000:01:00.0: Failed to register PHC clock (-22)
I don't find an appropriate firmware interface to implement settime64()
for iwlwifi MLD/MVM, thus instead create a stub that returns
-EOPTNOTSUPP only, suppressing the warning and allowing the PTP clock to
be registered. |
| In the Linux kernel, the following vulnerability has been resolved:
LoongArch: Set correct protection_map[] for VM_NONE/VM_SHARED
For 32BIT platform _PAGE_PROTNONE is 0, so set a VMA to be VM_NONE or
VM_SHARED will make pages non-present, then cause Oops with kernel page
fault.
Fix it by set correct protection_map[] for VM_NONE/VM_SHARED, replacing
_PAGE_PROTNONE with _PAGE_PRESENT. |
| In the Linux kernel, the following vulnerability has been resolved:
gpio: loongson-64bit: Fix incorrect NULL check after devm_kcalloc()
Fix incorrect NULL check in loongson_gpio_init_irqchip().
The function checks chip->parent instead of chip->irq.parents. |
| In the Linux kernel, the following vulnerability has been resolved:
mm/slab: Add alloc_tagging_slab_free_hook for memcg_alloc_abort_single
When CONFIG_MEM_ALLOC_PROFILING_DEBUG is enabled, the following warning
may be noticed:
[ 3959.023862] ------------[ cut here ]------------
[ 3959.023891] alloc_tag was not cleared (got tag for lib/xarray.c:378)
[ 3959.023947] WARNING: ./include/linux/alloc_tag.h:155 at alloc_tag_add+0x128/0x178, CPU#6: mkfs.ntfs/113998
[ 3959.023978] Modules linked in: dns_resolver tun brd overlay exfat btrfs blake2b libblake2b xor xor_neon raid6_pq loop sctp ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 rfkill sunrpc vfat fat sg fuse nfnetlink sr_mod virtio_gpu cdrom drm_client_lib virtio_dma_buf drm_shmem_helper drm_kms_helper ghash_ce drm sm4 backlight virtio_net net_failover virtio_scsi failover virtio_console virtio_blk virtio_mmio dm_mirror dm_region_hash dm_log dm_multipath dm_mod i2c_dev aes_neon_bs aes_ce_blk [last unloaded: hwpoison_inject]
[ 3959.024170] CPU: 6 UID: 0 PID: 113998 Comm: mkfs.ntfs Kdump: loaded Tainted: G W 6.19.0-rc7+ #7 PREEMPT(voluntary)
[ 3959.024182] Tainted: [W]=WARN
[ 3959.024186] Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
[ 3959.024192] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3959.024199] pc : alloc_tag_add+0x128/0x178
[ 3959.024207] lr : alloc_tag_add+0x128/0x178
[ 3959.024214] sp : ffff80008b696d60
[ 3959.024219] x29: ffff80008b696d60 x28: 0000000000000000 x27: 0000000000000240
[ 3959.024232] x26: 0000000000000000 x25: 0000000000000240 x24: ffff800085d17860
[ 3959.024245] x23: 0000000000402800 x22: ffff0000c0012dc0 x21: 00000000000002d0
[ 3959.024257] x20: ffff0000e6ef3318 x19: ffff800085ae0410 x18: 0000000000000000
[ 3959.024269] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 3959.024281] x14: 0000000000000000 x13: 0000000000000001 x12: ffff600064101293
[ 3959.024292] x11: 1fffe00064101292 x10: ffff600064101292 x9 : dfff800000000000
[ 3959.024305] x8 : 00009fff9befed6e x7 : ffff000320809493 x6 : 0000000000000001
[ 3959.024316] x5 : ffff000320809490 x4 : ffff600064101293 x3 : ffff800080691838
[ 3959.024328] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000d5bcd640
[ 3959.024340] Call trace:
[ 3959.024346] alloc_tag_add+0x128/0x178 (P)
[ 3959.024355] __alloc_tagging_slab_alloc_hook+0x11c/0x1a8
[ 3959.024362] kmem_cache_alloc_lru_noprof+0x1b8/0x5e8
[ 3959.024369] xas_alloc+0x304/0x4f0
[ 3959.024381] xas_create+0x1e0/0x4a0
[ 3959.024388] xas_store+0x68/0xda8
[ 3959.024395] __filemap_add_folio+0x5b0/0xbd8
[ 3959.024409] filemap_add_folio+0x16c/0x7e0
[ 3959.024416] __filemap_get_folio_mpol+0x2dc/0x9e8
[ 3959.024424] iomap_get_folio+0xfc/0x180
[ 3959.024435] __iomap_get_folio+0x2f8/0x4b8
[ 3959.024441] iomap_write_begin+0x198/0xc18
[ 3959.024448] iomap_write_iter+0x2ec/0x8f8
[ 3959.024454] iomap_file_buffered_write+0x19c/0x290
[ 3959.024461] blkdev_write_iter+0x38c/0x978
[ 3959.024470] vfs_write+0x4d4/0x928
[ 3959.024482] ksys_write+0xfc/0x1f8
[ 3959.024489] __arm64_sys_write+0x74/0xb0
[ 3959.024496] invoke_syscall+0xd4/0x258
[ 3959.024507] el0_svc_common.constprop.0+0xb4/0x240
[ 3959.024514] do_el0_svc+0x48/0x68
[ 3959.024520] el0_svc+0x40/0xf8
[ 3959.024526] el0t_64_sync_handler+0xa0/0xe8
[ 3959.024533] el0t_64_sync+0x1ac/0x1b0
[ 3959.024540] ---[ end trace 0000000000000000 ]---
When __memcg_slab_post_alloc_hook() fails, there are two different
free paths depending on whether size == 1 or size != 1. In the
kmem_cache_free_bulk() path, we do call alloc_tagging_slab_free_hook().
However, in memcg_alloc_abort_single() we don't, the above warning will be
triggered on the next allocation.
Therefore, add alloc_tagging_slab_free_hook() to the
memcg_alloc_abort_single() path. |