| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: don't use btrfs_set_item_key_safe on RAID stripe-extents
Don't use btrfs_set_item_key_safe() to modify the keys in the RAID
stripe-tree, as this can lead to corruption of the tree, which is caught
by the checks in btrfs_set_item_key_safe():
BTRFS info (device nvme1n1): leaf 49168384 gen 15 total ptrs 194 free space 8329 owner 12
BTRFS info (device nvme1n1): refs 2 lock_owner 1030 current 1030
[ snip ]
item 105 key (354549760 230 20480) itemoff 14587 itemsize 16
stride 0 devid 5 physical 67502080
item 106 key (354631680 230 4096) itemoff 14571 itemsize 16
stride 0 devid 1 physical 88559616
item 107 key (354631680 230 32768) itemoff 14555 itemsize 16
stride 0 devid 1 physical 88555520
item 108 key (354717696 230 28672) itemoff 14539 itemsize 16
stride 0 devid 2 physical 67604480
[ snip ]
BTRFS critical (device nvme1n1): slot 106 key (354631680 230 32768) new key (354635776 230 4096)
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.c:2602!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 1 UID: 0 PID: 1055 Comm: fsstress Not tainted 6.13.0-rc1+ #1464
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
RIP: 0010:btrfs_set_item_key_safe+0xf7/0x270
Code: <snip>
RSP: 0018:ffffc90001337ab0 EFLAGS: 00010287
RAX: 0000000000000000 RBX: ffff8881115fd000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000001 RDI: 00000000ffffffff
RBP: ffff888110ed6f50 R08: 00000000ffffefff R09: ffffffff8244c500
R10: 00000000ffffefff R11: 00000000ffffffff R12: ffff888100586000
R13: 00000000000000c9 R14: ffffc90001337b1f R15: ffff888110f23b58
FS: 00007f7d75c72740(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa811652c60 CR3: 0000000111398001 CR4: 0000000000370eb0
Call Trace:
<TASK>
? __die_body.cold+0x14/0x1a
? die+0x2e/0x50
? do_trap+0xca/0x110
? do_error_trap+0x65/0x80
? btrfs_set_item_key_safe+0xf7/0x270
? exc_invalid_op+0x50/0x70
? btrfs_set_item_key_safe+0xf7/0x270
? asm_exc_invalid_op+0x1a/0x20
? btrfs_set_item_key_safe+0xf7/0x270
btrfs_partially_delete_raid_extent+0xc4/0xe0
btrfs_delete_raid_extent+0x227/0x240
__btrfs_free_extent.isra.0+0x57f/0x9c0
? exc_coproc_segment_overrun+0x40/0x40
__btrfs_run_delayed_refs+0x2fa/0xe80
btrfs_run_delayed_refs+0x81/0xe0
btrfs_commit_transaction+0x2dd/0xbe0
? preempt_count_add+0x52/0xb0
btrfs_sync_file+0x375/0x4c0
do_fsync+0x39/0x70
__x64_sys_fsync+0x13/0x20
do_syscall_64+0x54/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f7d7550ef90
Code: <snip>
RSP: 002b:00007ffd70237248 EFLAGS: 00000202 ORIG_RAX: 000000000000004a
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f7d7550ef90
RDX: 000000000000013a RSI: 000000000040eb28 RDI: 0000000000000004
RBP: 000000000000001b R08: 0000000000000078 R09: 00007ffd7023725c
R10: 00007f7d75400390 R11: 0000000000000202 R12: 028f5c28f5c28f5c
R13: 8f5c28f5c28f5c29 R14: 000000000040b520 R15: 00007f7d75c726c8
</TASK>
While the root cause of the tree order corruption isn't clear, using
btrfs_duplicate_item() to copy the item and then adjusting both the key
and the per-device physical addresses is a safe way to counter this
problem. |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix assertion failure when splitting ordered extent after transaction abort
If while we are doing a direct IO write a transaction abort happens, we
mark all existing ordered extents with the BTRFS_ORDERED_IOERR flag (done
at btrfs_destroy_ordered_extents()), and then after that if we enter
btrfs_split_ordered_extent() and the ordered extent has bytes left
(meaning we have a bio that doesn't cover the whole ordered extent, see
details at btrfs_extract_ordered_extent()), we will fail on the following
assertion at btrfs_split_ordered_extent():
ASSERT(!(flags & ~BTRFS_ORDERED_TYPE_FLAGS));
because the BTRFS_ORDERED_IOERR flag is set and the definition of
BTRFS_ORDERED_TYPE_FLAGS is just the union of all flags that identify the
type of write (regular, nocow, prealloc, compressed, direct IO, encoded).
Fix this by returning an error from btrfs_extract_ordered_extent() if we
find the BTRFS_ORDERED_IOERR flag in the ordered extent. The error will
be the error that resulted in the transaction abort or -EIO if no
transaction abort happened.
This was recently reported by syzbot with the following trace:
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
CPU: 0 UID: 0 PID: 5321 Comm: syz.0.0 Not tainted 6.13.0-rc5-syzkaller #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
fail_dump lib/fault-inject.c:53 [inline]
should_fail_ex+0x3b0/0x4e0 lib/fault-inject.c:154
should_failslab+0xac/0x100 mm/failslab.c:46
slab_pre_alloc_hook mm/slub.c:4072 [inline]
slab_alloc_node mm/slub.c:4148 [inline]
__do_kmalloc_node mm/slub.c:4297 [inline]
__kmalloc_noprof+0xdd/0x4c0 mm/slub.c:4310
kmalloc_noprof include/linux/slab.h:905 [inline]
kzalloc_noprof include/linux/slab.h:1037 [inline]
btrfs_chunk_alloc_add_chunk_item+0x244/0x1100 fs/btrfs/volumes.c:5742
reserve_chunk_space+0x1ca/0x2c0 fs/btrfs/block-group.c:4292
check_system_chunk fs/btrfs/block-group.c:4319 [inline]
do_chunk_alloc fs/btrfs/block-group.c:3891 [inline]
btrfs_chunk_alloc+0x77b/0xf80 fs/btrfs/block-group.c:4187
find_free_extent_update_loop fs/btrfs/extent-tree.c:4166 [inline]
find_free_extent+0x42d1/0x5810 fs/btrfs/extent-tree.c:4579
btrfs_reserve_extent+0x422/0x810 fs/btrfs/extent-tree.c:4672
btrfs_new_extent_direct fs/btrfs/direct-io.c:186 [inline]
btrfs_get_blocks_direct_write+0x706/0xfa0 fs/btrfs/direct-io.c:321
btrfs_dio_iomap_begin+0xbb7/0x1180 fs/btrfs/direct-io.c:525
iomap_iter+0x697/0xf60 fs/iomap/iter.c:90
__iomap_dio_rw+0xeb9/0x25b0 fs/iomap/direct-io.c:702
btrfs_dio_write fs/btrfs/direct-io.c:775 [inline]
btrfs_direct_write+0x610/0xa30 fs/btrfs/direct-io.c:880
btrfs_do_write_iter+0x2a0/0x760 fs/btrfs/file.c:1397
do_iter_readv_writev+0x600/0x880
vfs_writev+0x376/0xba0 fs/read_write.c:1050
do_pwritev fs/read_write.c:1146 [inline]
__do_sys_pwritev2 fs/read_write.c:1204 [inline]
__se_sys_pwritev2+0x196/0x2b0 fs/read_write.c:1195
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f1281f85d29
RSP: 002b:00007f12819fe038 EFLAGS: 00000246 ORIG_RAX: 0000000000000148
RAX: ffffffffffffffda RBX: 00007f1282176080 RCX: 00007f1281f85d29
RDX: 0000000000000001 RSI: 0000000020000240 RDI: 0000000000000005
RBP: 00007f12819fe090 R08: 0000000000000000 R09: 0000000000000003
R10: 0000000000007000 R11: 0000000000000246 R12: 0000000000000002
R13: 0000000000000000 R14: 00007f1282176080 R15: 00007ffcb9e23328
</TASK>
BTRFS error (device loop0 state A): Transaction aborted (error -12)
BTRFS: error (device loop0 state A
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
tracing/osnoise: Fix resetting of tracepoints
If a timerlat tracer is started with the osnoise option OSNOISE_WORKLOAD
disabled, but then that option is enabled and timerlat is removed, the
tracepoints that were enabled on timerlat registration do not get
disabled. If the option is disabled again and timelat is started, then it
triggers a warning in the tracepoint code due to registering the
tracepoint again without ever disabling it.
Do not use the same user space defined options to know to disable the
tracepoints when timerlat is removed. Instead, set a global flag when it
is enabled and use that flag to know to disable the events.
~# echo NO_OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options
~# echo timerlat > /sys/kernel/tracing/current_tracer
~# echo OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options
~# echo nop > /sys/kernel/tracing/current_tracer
~# echo NO_OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options
~# echo timerlat > /sys/kernel/tracing/current_tracer
Triggers:
------------[ cut here ]------------
WARNING: CPU: 6 PID: 1337 at kernel/tracepoint.c:294 tracepoint_add_func+0x3b6/0x3f0
Modules linked in:
CPU: 6 UID: 0 PID: 1337 Comm: rtla Not tainted 6.13.0-rc4-test-00018-ga867c441128e-dirty #73
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:tracepoint_add_func+0x3b6/0x3f0
Code: 48 8b 53 28 48 8b 73 20 4c 89 04 24 e8 23 59 11 00 4c 8b 04 24 e9 36 fe ff ff 0f 0b b8 ea ff ff ff 45 84 e4 0f 84 68 fe ff ff <0f> 0b e9 61 fe ff ff 48 8b 7b 18 48 85 ff 0f 84 4f ff ff ff 49 8b
RSP: 0018:ffffb9b003a87ca0 EFLAGS: 00010202
RAX: 00000000ffffffef RBX: ffffffff92f30860 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff9bf59e91ccd0 RDI: ffffffff913b6410
RBP: 000000000000000a R08: 00000000000005c7 R09: 0000000000000002
R10: ffffb9b003a87ce0 R11: 0000000000000002 R12: 0000000000000001
R13: ffffb9b003a87ce0 R14: ffffffffffffffef R15: 0000000000000008
FS: 00007fce81209240(0000) GS:ffff9bf6fdd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055e99b728000 CR3: 00000001277c0002 CR4: 0000000000172ef0
Call Trace:
<TASK>
? __warn.cold+0xb7/0x14d
? tracepoint_add_func+0x3b6/0x3f0
? report_bug+0xea/0x170
? handle_bug+0x58/0x90
? exc_invalid_op+0x17/0x70
? asm_exc_invalid_op+0x1a/0x20
? __pfx_trace_sched_migrate_callback+0x10/0x10
? tracepoint_add_func+0x3b6/0x3f0
? __pfx_trace_sched_migrate_callback+0x10/0x10
? __pfx_trace_sched_migrate_callback+0x10/0x10
tracepoint_probe_register+0x78/0xb0
? __pfx_trace_sched_migrate_callback+0x10/0x10
osnoise_workload_start+0x2b5/0x370
timerlat_tracer_init+0x76/0x1b0
tracing_set_tracer+0x244/0x400
tracing_set_trace_write+0xa0/0xe0
vfs_write+0xfc/0x570
? do_sys_openat2+0x9c/0xe0
ksys_write+0x72/0xf0
do_syscall_64+0x79/0x1c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e |
| In the Linux kernel, the following vulnerability has been resolved:
RDMA/mlx5: Fix a race for an ODP MR which leads to CQE with error
This patch addresses a race condition for an ODP MR that can result in a
CQE with an error on the UMR QP.
During the __mlx5_ib_dereg_mr() flow, the following sequence of calls
occurs:
mlx5_revoke_mr()
mlx5r_umr_revoke_mr()
mlx5r_umr_post_send_wait()
At this point, the lkey is freed from the hardware's perspective.
However, concurrently, mlx5_ib_invalidate_range() might be triggered by
another task attempting to invalidate a range for the same freed lkey.
This task will:
- Acquire the umem_odp->umem_mutex lock.
- Call mlx5r_umr_update_xlt() on the UMR QP.
- Since the lkey has already been freed, this can lead to a CQE error,
causing the UMR QP to enter an error state [1].
To resolve this race condition, the umem_odp->umem_mutex lock is now also
acquired as part of the mlx5_revoke_mr() scope. Upon successful revoke,
we set umem_odp->private which points to that MR to NULL, preventing any
further invalidation attempts on its lkey.
[1] From dmesg:
infiniband rocep8s0f0: dump_cqe:277:(pid 0): WC error: 6, Message: memory bind operation error
cqe_dump: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
cqe_dump: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
cqe_dump: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
cqe_dump: 00000030: 00 00 00 00 08 00 78 06 25 00 11 b9 00 0e dd d2
WARNING: CPU: 15 PID: 1506 at drivers/infiniband/hw/mlx5/umr.c:394 mlx5r_umr_post_send_wait+0x15a/0x2b0 [mlx5_ib]
Modules linked in: ip6table_mangle ip6table_natip6table_filter ip6_tables iptable_mangle xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm mlx5_ib ib_uverbs ib_core fuse mlx5_core
CPU: 15 UID: 0 PID: 1506 Comm: ibv_rc_pingpong Not tainted 6.12.0-rc7+ #1626
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:mlx5r_umr_post_send_wait+0x15a/0x2b0 [mlx5_ib]
[..]
Call Trace:
<TASK>
mlx5r_umr_update_xlt+0x23c/0x3e0 [mlx5_ib]
mlx5_ib_invalidate_range+0x2e1/0x330 [mlx5_ib]
__mmu_notifier_invalidate_range_start+0x1e1/0x240
zap_page_range_single+0xf1/0x1a0
madvise_vma_behavior+0x677/0x6e0
do_madvise+0x1a2/0x4b0
__x64_sys_madvise+0x25/0x30
do_syscall_64+0x6b/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e |
| In the Linux kernel, the following vulnerability has been resolved:
nvkm/gsp: correctly advance the read pointer of GSP message queue
A GSP event message consists three parts: message header, RPC header,
message body. GSP calculates the number of pages to write from the
total size of a GSP message. This behavior can be observed from the
movement of the write pointer.
However, nvkm takes only the size of RPC header and message body as
the message size when advancing the read pointer. When handling a
two-page GSP message in the non rollback case, It wrongly takes the
message body of the previous message as the message header of the next
message. As the "message length" tends to be zero, in the calculation of
size needs to be copied (0 - size of (message header)), the size needs to
be copied will be "0xffffffxx". It also triggers a kernel panic due to a
NULL pointer error.
[ 547.614102] msg: 00000f90: ff ff ff ff ff ff ff ff 40 d7 18 fb 8b 00 00 00 ........@.......
[ 547.622533] msg: 00000fa0: 00 00 00 00 ff ff ff ff ff ff ff ff 00 00 00 00 ................
[ 547.630965] msg: 00000fb0: ff ff ff ff ff ff ff ff 00 00 00 00 ff ff ff ff ................
[ 547.639397] msg: 00000fc0: ff ff ff ff 00 00 00 00 ff ff ff ff ff ff ff ff ................
[ 547.647832] nvkm 0000:c1:00.0: gsp: peek msg rpc fn:0 len:0x0/0xffffffffffffffe0
[ 547.655225] nvkm 0000:c1:00.0: gsp: get msg rpc fn:0 len:0x0/0xffffffffffffffe0
[ 547.662532] BUG: kernel NULL pointer dereference, address: 0000000000000020
[ 547.669485] #PF: supervisor read access in kernel mode
[ 547.674624] #PF: error_code(0x0000) - not-present page
[ 547.679755] PGD 0 P4D 0
[ 547.682294] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 547.686643] CPU: 22 PID: 322 Comm: kworker/22:1 Tainted: G E 6.9.0-rc6+ #1
[ 547.694893] Hardware name: ASRockRack 1U1G-MILAN/N/ROMED8-NL, BIOS L3.12E 09/06/2022
[ 547.702626] Workqueue: events r535_gsp_msgq_work [nvkm]
[ 547.707921] RIP: 0010:r535_gsp_msg_recv+0x87/0x230 [nvkm]
[ 547.713375] Code: 00 8b 70 08 48 89 e1 31 d2 4c 89 f7 e8 12 f5 ff ff 48 89 c5 48 85 c0 0f 84 cf 00 00 00 48 81 fd 00 f0 ff ff 0f 87 c4 00 00 00 <8b> 55 10 41 8b 46 30 85 d2 0f 85 f6 00 00 00 83 f8 04 76 10 ba 05
[ 547.732119] RSP: 0018:ffffabe440f87e10 EFLAGS: 00010203
[ 547.737335] RAX: 0000000000000010 RBX: 0000000000000008 RCX: 000000000000003f
[ 547.744461] RDX: 0000000000000000 RSI: ffffabe4480a8030 RDI: 0000000000000010
[ 547.751585] RBP: 0000000000000010 R08: 0000000000000000 R09: ffffabe440f87bb0
[ 547.758707] R10: ffffabe440f87dc8 R11: 0000000000000010 R12: 0000000000000000
[ 547.765834] R13: 0000000000000000 R14: ffff9351df1e5000 R15: 0000000000000000
[ 547.772958] FS: 0000000000000000(0000) GS:ffff93708eb00000(0000) knlGS:0000000000000000
[ 547.781035] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 547.786771] CR2: 0000000000000020 CR3: 00000003cc220002 CR4: 0000000000770ef0
[ 547.793896] PKRU: 55555554
[ 547.796600] Call Trace:
[ 547.799046] <TASK>
[ 547.801152] ? __die+0x20/0x70
[ 547.804211] ? page_fault_oops+0x75/0x170
[ 547.808221] ? print_hex_dump+0x100/0x160
[ 547.812226] ? exc_page_fault+0x64/0x150
[ 547.816152] ? asm_exc_page_fault+0x22/0x30
[ 547.820341] ? r535_gsp_msg_recv+0x87/0x230 [nvkm]
[ 547.825184] r535_gsp_msgq_work+0x42/0x50 [nvkm]
[ 547.829845] process_one_work+0x196/0x3d0
[ 547.833861] worker_thread+0x2fc/0x410
[ 547.837613] ? __pfx_worker_thread+0x10/0x10
[ 547.841885] kthread+0xdf/0x110
[ 547.845031] ? __pfx_kthread+0x10/0x10
[ 547.848775] ret_from_fork+0x30/0x50
[ 547.852354] ? __pfx_kthread+0x10/0x10
[ 547.856097] ret_from_fork_asm+0x1a/0x30
[ 547.860019] </TASK>
[ 547.862208] Modules linked in: nvkm(E) gsp_log(E) snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) snd_timer(E) snd_seq_device(E) snd(E) soundcore(E) rfkill(E) qrtr(E) vfat(E) fat(E) ipmi_ssif(E) amd_atl(E) intel_rapl_msr(E) intel_rapl_common(E) amd64_edac(E) mlx5_ib(E) edac_mce_amd(E) kvm_amd
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: bail out when failed to load fw in psp_init_cap_microcode()
In function psp_init_cap_microcode(), it should bail out when failed to
load firmware, otherwise it may cause invalid memory access. |
| In the Linux kernel, the following vulnerability has been resolved:
tracing: Do not allow mmap() of persistent ring buffer
When trying to mmap a trace instance buffer that is attached to
reserve_mem, it would crash:
BUG: unable to handle page fault for address: ffffe97bd00025c8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 2862f3067 P4D 2862f3067 PUD 0
Oops: Oops: 0000 [#1] PREEMPT_RT SMP PTI
CPU: 4 UID: 0 PID: 981 Comm: mmap-rb Not tainted 6.14.0-rc2-test-00003-g7f1a5e3fbf9e-dirty #233
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:validate_page_before_insert+0x5/0xb0
Code: e2 01 89 d0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <48> 8b 46 08 a8 01 75 67 66 90 48 89 f0 8b 50 34 85 d2 74 76 48 89
RSP: 0018:ffffb148c2f3f968 EFLAGS: 00010246
RAX: ffff9fa5d3322000 RBX: ffff9fa5ccff9c08 RCX: 00000000b879ed29
RDX: ffffe97bd00025c0 RSI: ffffe97bd00025c0 RDI: ffff9fa5ccff9c08
RBP: ffffb148c2f3f9f0 R08: 0000000000000004 R09: 0000000000000004
R10: 0000000000000000 R11: 0000000000000200 R12: 0000000000000000
R13: 00007f16a18d5000 R14: ffff9fa5c48db6a8 R15: 0000000000000000
FS: 00007f16a1b54740(0000) GS:ffff9fa73df00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffe97bd00025c8 CR3: 00000001048c6006 CR4: 0000000000172ef0
Call Trace:
<TASK>
? __die_body.cold+0x19/0x1f
? __die+0x2e/0x40
? page_fault_oops+0x157/0x2b0
? search_module_extables+0x53/0x80
? validate_page_before_insert+0x5/0xb0
? kernelmode_fixup_or_oops.isra.0+0x5f/0x70
? __bad_area_nosemaphore+0x16e/0x1b0
? bad_area_nosemaphore+0x16/0x20
? do_kern_addr_fault+0x77/0x90
? exc_page_fault+0x22b/0x230
? asm_exc_page_fault+0x2b/0x30
? validate_page_before_insert+0x5/0xb0
? vm_insert_pages+0x151/0x400
__rb_map_vma+0x21f/0x3f0
ring_buffer_map+0x21b/0x2f0
tracing_buffers_mmap+0x70/0xd0
__mmap_region+0x6f0/0xbd0
mmap_region+0x7f/0x130
do_mmap+0x475/0x610
vm_mmap_pgoff+0xf2/0x1d0
ksys_mmap_pgoff+0x166/0x200
__x64_sys_mmap+0x37/0x50
x64_sys_call+0x1670/0x1d70
do_syscall_64+0xbb/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The reason was that the code that maps the ring buffer pages to user space
has:
page = virt_to_page((void *)cpu_buffer->subbuf_ids[s]);
And uses that in:
vm_insert_pages(vma, vma->vm_start, pages, &nr_pages);
But virt_to_page() does not work with vmap()'d memory which is what the
persistent ring buffer has. It is rather trivial to allow this, but for
now just disable mmap() of instances that have their ring buffer from the
reserve_mem option.
If an mmap() is performed on a persistent buffer it will return -ENODEV
just like it would if the .mmap field wasn't defined in the
file_operations structure. |
| In the Linux kernel, the following vulnerability has been resolved:
ring-buffer: Validate the persistent meta data subbuf array
The meta data for a mapped ring buffer contains an array of indexes of all
the subbuffers. The first entry is the reader page, and the rest of the
entries lay out the order of the subbuffers in how the ring buffer link
list is to be created.
The validator currently makes sure that all the entries are within the
range of 0 and nr_subbufs. But it does not check if there are any
duplicates.
While working on the ring buffer, I corrupted this array, where I added
duplicates. The validator did not catch it and created the ring buffer
link list on top of it. Luckily, the corruption was only that the reader
page was also in the writer path and only presented corrupted data but did
not crash the kernel. But if there were duplicates in the writer side,
then it could corrupt the ring buffer link list and cause a crash.
Create a bitmask array with the size of the number of subbuffers. Then
clear it. When walking through the subbuf array checking to see if the
entries are within the range, test if its bit is already set in the
subbuf_mask. If it is, then there is duplicates and fail the validation.
If not, set the corresponding bit and continue. |
| In the Linux kernel, the following vulnerability has been resolved:
sched_ext: Fix incorrect autogroup migration detection
scx_move_task() is called from sched_move_task() and tells the BPF scheduler
that cgroup migration is being committed. sched_move_task() is used by both
cgroup and autogroup migrations and scx_move_task() tried to filter out
autogroup migrations by testing the destination cgroup and PF_EXITING but
this is not enough. In fact, without explicitly tagging the thread which is
doing the cgroup migration, there is no good way to tell apart
scx_move_task() invocations for racing migration to the root cgroup and an
autogroup migration.
This led to scx_move_task() incorrectly ignoring a migration from non-root
cgroup to an autogroup of the root cgroup triggering the following warning:
WARNING: CPU: 7 PID: 1 at kernel/sched/ext.c:3725 scx_cgroup_can_attach+0x196/0x340
...
Call Trace:
<TASK>
cgroup_migrate_execute+0x5b1/0x700
cgroup_attach_task+0x296/0x400
__cgroup_procs_write+0x128/0x140
cgroup_procs_write+0x17/0x30
kernfs_fop_write_iter+0x141/0x1f0
vfs_write+0x31d/0x4a0
__x64_sys_write+0x72/0xf0
do_syscall_64+0x82/0x160
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fix it by adding an argument to sched_move_task() that indicates whether the
moving is for a cgroup or autogroup migration. After the change,
scx_move_task() is called only for cgroup migrations and renamed to
scx_cgroup_move_task(). |
| In the Linux kernel, the following vulnerability has been resolved:
net: ipv6: fix dst ref loops in rpl, seg6 and ioam6 lwtunnels
Some lwtunnels have a dst cache for post-transformation dst.
If the packet destination did not change we may end up recording
a reference to the lwtunnel in its own cache, and the lwtunnel
state will never be freed.
Discovered by the ioam6.sh test, kmemleak was recently fixed
to catch per-cpu memory leaks. I'm not sure if rpl and seg6
can actually hit this, but in principle I don't see why not. |
| In the Linux kernel, the following vulnerability has been resolved:
bcachefs: bch2_ioctl_subvolume_destroy() fixes
bch2_evict_subvolume_inodes() was getting stuck - due to incorrectly
pruning the dcache.
Also, fix missing permissions checks. |
| In the Linux kernel, the following vulnerability has been resolved:
nfsd: fix legacy client tracking initialization
Get rid of the nfsd4_legacy_tracking_ops->init() call in
check_for_legacy_methods(). That will be handled in the caller
(nfsd4_client_tracking_init()). Otherwise, we'll wind up calling
nfsd4_legacy_tracking_ops->init() twice, and the second time we'll
trigger the BUG_ON() in nfsd4_init_recdir(). |
| In the Linux kernel, the following vulnerability has been resolved:
mm: zswap: fix crypto_free_acomp() deadlock in zswap_cpu_comp_dead()
Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
(through crypto_exit_scomp_ops_async()).
On the other hand, crypto_alloc_acomp_node() holds the scomp_lock (through
crypto_scomp_init_tfm()), and then allocates memory. If the allocation
results in reclaim, we may attempt to hold the per-CPU acomp_ctx mutex.
The above dependencies can cause an ABBA deadlock. For example in the
following scenario:
(1) Task A running on CPU #1:
crypto_alloc_acomp_node()
Holds scomp_lock
Enters reclaim
Reads per_cpu_ptr(pool->acomp_ctx, 1)
(2) Task A is descheduled
(3) CPU #1 goes offline
zswap_cpu_comp_dead(CPU #1)
Holds per_cpu_ptr(pool->acomp_ctx, 1))
Calls crypto_free_acomp()
Waits for scomp_lock
(4) Task A running on CPU #2:
Waits for per_cpu_ptr(pool->acomp_ctx, 1) // Read on CPU #1
DEADLOCK
Since there is no requirement to call crypto_free_acomp() with the per-CPU
acomp_ctx mutex held in zswap_cpu_comp_dead(), move it after the mutex is
unlocked. Also move the acomp_request_free() and kfree() calls for
consistency and to avoid any potential sublte locking dependencies in the
future.
With this, only setting acomp_ctx fields to NULL occurs with the mutex
held. This is similar to how zswap_cpu_comp_prepare() only initializes
acomp_ctx fields with the mutex held, after performing all allocations
before holding the mutex.
Opportunistically, move the NULL check on acomp_ctx so that it takes place
before the mutex dereference. |
| In the Linux kernel, the following vulnerability has been resolved:
media: vimc: skip .s_stream() for stopped entities
Syzbot reported [1] a warning prompted by a check in call_s_stream()
that checks whether .s_stream() operation is warranted for unstarted
or stopped subdevs.
Add a simple fix in vimc_streamer_pipeline_terminate() ensuring that
entities skip a call to .s_stream() unless they have been previously
properly started.
[1] Syzbot report:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5933 at drivers/media/v4l2-core/v4l2-subdev.c:460 call_s_stream+0x2df/0x350 drivers/media/v4l2-core/v4l2-subdev.c:460
Modules linked in:
CPU: 0 UID: 0 PID: 5933 Comm: syz-executor330 Not tainted 6.13.0-rc2-syzkaller-00362-g2d8308bf5b67 #0
...
Call Trace:
<TASK>
vimc_streamer_pipeline_terminate+0x218/0x320 drivers/media/test-drivers/vimc/vimc-streamer.c:62
vimc_streamer_pipeline_init drivers/media/test-drivers/vimc/vimc-streamer.c:101 [inline]
vimc_streamer_s_stream+0x650/0x9a0 drivers/media/test-drivers/vimc/vimc-streamer.c:203
vimc_capture_start_streaming+0xa1/0x130 drivers/media/test-drivers/vimc/vimc-capture.c:256
vb2_start_streaming+0x15f/0x5a0 drivers/media/common/videobuf2/videobuf2-core.c:1789
vb2_core_streamon+0x2a7/0x450 drivers/media/common/videobuf2/videobuf2-core.c:2348
vb2_streamon drivers/media/common/videobuf2/videobuf2-v4l2.c:875 [inline]
vb2_ioctl_streamon+0xf4/0x170 drivers/media/common/videobuf2/videobuf2-v4l2.c:1118
__video_do_ioctl+0xaf0/0xf00 drivers/media/v4l2-core/v4l2-ioctl.c:3122
video_usercopy+0x4d2/0x1620 drivers/media/v4l2-core/v4l2-ioctl.c:3463
v4l2_ioctl+0x1ba/0x250 drivers/media/v4l2-core/v4l2-dev.c:366
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:906 [inline]
__se_sys_ioctl fs/ioctl.c:892 [inline]
__x64_sys_ioctl+0x190/0x200 fs/ioctl.c:892
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f2b85c01b19
... |
| In the Linux kernel, the following vulnerability has been resolved:
jfs: add check read-only before txBeginAnon() call
Added a read-only check before calling `txBeginAnon` in `extAlloc`
and `extRecord`. This prevents modification attempts on a read-only
mounted filesystem, avoiding potential errors or crashes.
Call trace:
txBeginAnon+0xac/0x154
extAlloc+0xe8/0xdec fs/jfs/jfs_extent.c:78
jfs_get_block+0x340/0xb98 fs/jfs/inode.c:248
__block_write_begin_int+0x580/0x166c fs/buffer.c:2128
__block_write_begin fs/buffer.c:2177 [inline]
block_write_begin+0x98/0x11c fs/buffer.c:2236
jfs_write_begin+0x44/0x88 fs/jfs/inode.c:299 |
| In the Linux kernel, the following vulnerability has been resolved:
jfs: add check read-only before truncation in jfs_truncate_nolock()
Added a check for "read-only" mode in the `jfs_truncate_nolock`
function to avoid errors related to writing to a read-only
filesystem.
Call stack:
block_write_begin() {
jfs_write_failed() {
jfs_truncate() {
jfs_truncate_nolock() {
txEnd() {
...
log = JFS_SBI(tblk->sb)->log;
// (log == NULL)
If the `isReadOnly(ip)` condition is triggered in
`jfs_truncate_nolock`, the function execution will stop, and no
further data modification will occur. Instead, the `xtTruncate`
function will be called with the "COMMIT_WMAP" flag, preventing
modifications in "read-only" mode. |
| In the Linux kernel, the following vulnerability has been resolved:
PCI/ASPM: Fix link state exit during switch upstream function removal
Before 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to
avoid use-after-free"), we would free the ASPM link only after the last
function on the bus pertaining to the given link was removed.
That was too late. If function 0 is removed before sibling function,
link->downstream would point to free'd memory after.
After above change, we freed the ASPM parent link state upon any function
removal on the bus pertaining to a given link.
That is too early. If the link is to a PCIe switch with MFD on the upstream
port, then removing functions other than 0 first would free a link which
still remains parent_link to the remaining downstream ports.
The resulting GPFs are especially frequent during hot-unplug, because
pciehp removes devices on the link bus in reverse order.
On that switch, function 0 is the virtual P2P bridge to the internal bus.
Free exactly when function 0 is removed -- before the parent link is
obsolete, but after all subordinate links are gone.
[kwilczynski: commit log] |
| In the Linux kernel, the following vulnerability has been resolved:
usb: xhci: Don't skip on Stopped - Length Invalid
Up until commit d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are
returned when isoc ring is stopped") in v6.11, the driver didn't skip
missed isochronous TDs when handling Stoppend and Stopped - Length
Invalid events. Instead, it erroneously cleared the skip flag, which
would cause the ring to get stuck, as future events won't match the
missed TD which is never removed from the queue until it's cancelled.
This buggy logic seems to have been in place substantially unchanged
since the 3.x series over 10 years ago, which probably speaks first
and foremost about relative rarity of this case in normal usage, but
by the spec I see no reason why it shouldn't be possible.
After d56b0b2ab142, TDs are immediately skipped when handling those
Stopped events. This poses a potential problem in case of Stopped -
Length Invalid, which occurs either on completed TDs (likely already
given back) or Link and No-Op TRBs. Such event won't be recognized
as matching any TD (unless it's the rare Link TRB inside a TD) and
will result in skipping all pending TDs, giving them back possibly
before they are done, risking isoc data loss and maybe UAF by HW.
As a compromise, don't skip and don't clear the skip flag on this
kind of event. Then the next event will skip missed TDs. A downside
of not handling Stopped - Length Invalid on a Link inside a TD is
that if the TD is cancelled, its actual length will not be updated
to account for TRBs (silently) completed before the TD was stopped.
I had no luck producing this sequence of completion events so there
is no compelling demonstration of any resulting disaster. It may be
a very rare, obscure condition. The sole motivation for this patch
is that if such unlikely event does occur, I'd rather risk reporting
a cancelled partially done isoc frame as empty than gamble with UAF.
This will be fixed more properly by looking at Stopped event's TRB
pointer when making skipping decisions, but such rework is unlikely
to be backported to v6.12, which will stay around for a few years. |
| In the Linux kernel, the following vulnerability has been resolved:
nvmem: core: fix cleanup after dev_set_name()
If dev_set_name() fails, we leak nvmem->wp_gpio as the cleanup does not
put this. While a minimal fix for this would be to add the gpiod_put()
call, we can do better if we split device_register(), and use the
tested nvmem_release() cleanup code by initialising the device early,
and putting the device.
This results in a slightly larger fix, but results in clear code.
Note: this patch depends on "nvmem: core: initialise nvmem->id early"
and "nvmem: core: remove nvmem_config wp_gpio".
[Srini: Fixed subject line and error code handing with wp_gpio while applying.] |
| In the Linux kernel, the following vulnerability has been resolved:
Squashfs: fix handling and sanity checking of xattr_ids count
A Sysbot [1] corrupted filesystem exposes two flaws in the handling and
sanity checking of the xattr_ids count in the filesystem. Both of these
flaws cause computation overflow due to incorrect typing.
In the corrupted filesystem the xattr_ids value is 4294967071, which
stored in a signed variable becomes the negative number -225.
Flaw 1 (64-bit systems only):
The signed integer xattr_ids variable causes sign extension.
This causes variable overflow in the SQUASHFS_XATTR_*(A) macros. The
variable is first multiplied by sizeof(struct squashfs_xattr_id) where the
type of the sizeof operator is "unsigned long".
On a 64-bit system this is 64-bits in size, and causes the negative number
to be sign extended and widened to 64-bits and then become unsigned. This
produces the very large number 18446744073709548016 or 2^64 - 3600. This
number when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and
divided by SQUASHFS_METADATA_SIZE overflows and produces a length of 0
(stored in len).
Flaw 2 (32-bit systems only):
On a 32-bit system the integer variable is not widened by the unsigned
long type of the sizeof operator (32-bits), and the signedness of the
variable has no effect due it always being treated as unsigned.
The above corrupted xattr_ids value of 4294967071, when multiplied
overflows and produces the number 4294963696 or 2^32 - 3400. This number
when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and divided by
SQUASHFS_METADATA_SIZE overflows again and produces a length of 0.
The effect of the 0 length computation:
In conjunction with the corrupted xattr_ids field, the filesystem also has
a corrupted xattr_table_start value, where it matches the end of
filesystem value of 850.
This causes the following sanity check code to fail because the
incorrectly computed len of 0 matches the incorrect size of the table
reported by the superblock (0 bytes).
len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids);
/*
* The computed size of the index table (len bytes) should exactly
* match the table start and end points
*/
start = table_start + sizeof(*id_table);
end = msblk->bytes_used;
if (len != (end - start))
return ERR_PTR(-EINVAL);
Changing the xattr_ids variable to be "usigned int" fixes the flaw on a
64-bit system. This relies on the fact the computation is widened by the
unsigned long type of the sizeof operator.
Casting the variable to u64 in the above macro fixes this flaw on a 32-bit
system.
It also means 64-bit systems do not implicitly rely on the type of the
sizeof operator to widen the computation.
[1] https://lore.kernel.org/lkml/000000000000cd44f005f1a0f17f@google.com/ |