| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
nfp: flower: handle acti_netdevs allocation failure
The kmalloc_array() in nfp_fl_lag_do_work() will return null, if
the physical memory has run out. As a result, if we dereference
the acti_netdevs, the null pointer dereference bugs will happen.
This patch adds a check to judge whether allocation failure occurs.
If it happens, the delayed work will be rescheduled and try again. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: Fix potential out-of-bounds access in 'amdgpu_discovery_reg_base_init()'
The issue arises when the array 'adev->vcn.vcn_config' is accessed
before checking if the index 'adev->vcn.num_vcn_inst' is within the
bounds of the array.
The fix involves moving the bounds check before the array access. This
ensures that 'adev->vcn.num_vcn_inst' is within the bounds of the array
before it is used as an index.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1289 amdgpu_discovery_reg_base_init() error: testing array offset 'adev->vcn.num_vcn_inst' after use. |
| In the Linux kernel, the following vulnerability has been resolved:
octeontx2-af: Use separate handlers for interrupts
For PF to AF interrupt vector and VF to AF vector same
interrupt handler is registered which is causing race condition.
When two interrupts are raised to two CPUs at same time
then two cores serve same event corrupting the data. |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: Always flush async #PF workqueue when vCPU is being destroyed
Always flush the per-vCPU async #PF workqueue when a vCPU is clearing its
completion queue, e.g. when a VM and all its vCPUs is being destroyed.
KVM must ensure that none of its workqueue callbacks is running when the
last reference to the KVM _module_ is put. Gifting a reference to the
associated VM prevents the workqueue callback from dereferencing freed
vCPU/VM memory, but does not prevent the KVM module from being unloaded
before the callback completes.
Drop the misguided VM refcount gifting, as calling kvm_put_kvm() from
async_pf_execute() if kvm_put_kvm() flushes the async #PF workqueue will
result in deadlock. async_pf_execute() can't return until kvm_put_kvm()
finishes, and kvm_put_kvm() can't return until async_pf_execute() finishes:
WARNING: CPU: 8 PID: 251 at virt/kvm/kvm_main.c:1435 kvm_put_kvm+0x2d/0x320 [kvm]
Modules linked in: vhost_net vhost vhost_iotlb tap kvm_intel kvm irqbypass
CPU: 8 PID: 251 Comm: kworker/8:1 Tainted: G W 6.6.0-rc1-e7af8d17224a-x86/gmem-vm #119
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Workqueue: events async_pf_execute [kvm]
RIP: 0010:kvm_put_kvm+0x2d/0x320 [kvm]
Call Trace:
<TASK>
async_pf_execute+0x198/0x260 [kvm]
process_one_work+0x145/0x2d0
worker_thread+0x27e/0x3a0
kthread+0xba/0xe0
ret_from_fork+0x2d/0x50
ret_from_fork_asm+0x11/0x20
</TASK>
---[ end trace 0000000000000000 ]---
INFO: task kworker/8:1:251 blocked for more than 120 seconds.
Tainted: G W 6.6.0-rc1-e7af8d17224a-x86/gmem-vm #119
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/8:1 state:D stack:0 pid:251 ppid:2 flags:0x00004000
Workqueue: events async_pf_execute [kvm]
Call Trace:
<TASK>
__schedule+0x33f/0xa40
schedule+0x53/0xc0
schedule_timeout+0x12a/0x140
__wait_for_common+0x8d/0x1d0
__flush_work.isra.0+0x19f/0x2c0
kvm_clear_async_pf_completion_queue+0x129/0x190 [kvm]
kvm_arch_destroy_vm+0x78/0x1b0 [kvm]
kvm_put_kvm+0x1c1/0x320 [kvm]
async_pf_execute+0x198/0x260 [kvm]
process_one_work+0x145/0x2d0
worker_thread+0x27e/0x3a0
kthread+0xba/0xe0
ret_from_fork+0x2d/0x50
ret_from_fork_asm+0x11/0x20
</TASK>
If kvm_clear_async_pf_completion_queue() actually flushes the workqueue,
then there's no need to gift async_pf_execute() a reference because all
invocations of async_pf_execute() will be forced to complete before the
vCPU and its VM are destroyed/freed. And that in turn fixes the module
unloading bug as __fput() won't do module_put() on the last vCPU reference
until the vCPU has been freed, e.g. if closing the vCPU file also puts the
last reference to the KVM module.
Note that kvm_check_async_pf_completion() may also take the work item off
the completion queue and so also needs to flush the work queue, as the
work will not be seen by kvm_clear_async_pf_completion_queue(). Waiting
on the workqueue could theoretically delay a vCPU due to waiting for the
work to complete, but that's a very, very small chance, and likely a very
small delay. kvm_arch_async_page_present_queued() unconditionally makes a
new request, i.e. will effectively delay entering the guest, so the
remaining work is really just:
trace_kvm_async_pf_completed(addr, cr2_or_gpa);
__kvm_vcpu_wake_up(vcpu);
mmput(mm);
and mmput() can't drop the last reference to the page tables if the vCPU is
still alive, i.e. the vCPU won't get stuck tearing down page tables.
Add a helper to do the flushing, specifically to deal with "wakeup all"
work items, as they aren't actually work items, i.e. are never placed in a
workqueue. Trying to flush a bogus workqueue entry rightly makes
__flush_work() complain (kudos to whoever added that sanity check).
Note, commit 5f6de5cbebee ("KVM: Prevent module exit until al
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
crypto: qat - resolve race condition during AER recovery
During the PCI AER system's error recovery process, the kernel driver
may encounter a race condition with freeing the reset_data structure's
memory. If the device restart will take more than 10 seconds the function
scheduling that restart will exit due to a timeout, and the reset_data
structure will be freed. However, this data structure is used for
completion notification after the restart is completed, which leads
to a UAF bug.
This results in a KFENCE bug notice.
BUG: KFENCE: use-after-free read in adf_device_reset_worker+0x38/0xa0 [intel_qat]
Use-after-free read at 0x00000000bc56fddf (in kfence-#142):
adf_device_reset_worker+0x38/0xa0 [intel_qat]
process_one_work+0x173/0x340
To resolve this race condition, the memory associated to the container
of the work_struct is freed on the worker if the timeout expired,
otherwise on the function that schedules the worker.
The timeout detection can be done by checking if the caller is
still waiting for completion or not by using completion_done() function. |
| In the Linux kernel, the following vulnerability has been resolved:
fat: fix uninitialized field in nostale filehandles
When fat_encode_fh_nostale() encodes file handle without a parent it
stores only first 10 bytes of the file handle. However the length of the
file handle must be a multiple of 4 so the file handle is actually 12
bytes long and the last two bytes remain uninitialized. This is not
great at we potentially leak uninitialized information with the handle
to userspace. Properly initialize the full handle length. |
| In the Linux kernel, the following vulnerability has been resolved:
mac802154: fix llsec key resources release in mac802154_llsec_key_del
mac802154_llsec_key_del() can free resources of a key directly without
following the RCU rules for waiting before the end of a grace period. This
may lead to use-after-free in case llsec_lookup_key() is traversing the
list of keys in parallel with a key deletion:
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 4 PID: 16000 at lib/refcount.c:25 refcount_warn_saturate+0x162/0x2a0
Modules linked in:
CPU: 4 PID: 16000 Comm: wpan-ping Not tainted 6.7.0 #19
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:refcount_warn_saturate+0x162/0x2a0
Call Trace:
<TASK>
llsec_lookup_key.isra.0+0x890/0x9e0
mac802154_llsec_encrypt+0x30c/0x9c0
ieee802154_subif_start_xmit+0x24/0x1e0
dev_hard_start_xmit+0x13e/0x690
sch_direct_xmit+0x2ae/0xbc0
__dev_queue_xmit+0x11dd/0x3c20
dgram_sendmsg+0x90b/0xd60
__sys_sendto+0x466/0x4c0
__x64_sys_sendto+0xe0/0x1c0
do_syscall_64+0x45/0xf0
entry_SYSCALL_64_after_hwframe+0x6e/0x76
Also, ieee802154_llsec_key_entry structures are not freed by
mac802154_llsec_key_del():
unreferenced object 0xffff8880613b6980 (size 64):
comm "iwpan", pid 2176, jiffies 4294761134 (age 60.475s)
hex dump (first 32 bytes):
78 0d 8f 18 80 88 ff ff 22 01 00 00 00 00 ad de x.......".......
00 00 00 00 00 00 00 00 03 00 cd ab 00 00 00 00 ................
backtrace:
[<ffffffff81dcfa62>] __kmem_cache_alloc_node+0x1e2/0x2d0
[<ffffffff81c43865>] kmalloc_trace+0x25/0xc0
[<ffffffff88968b09>] mac802154_llsec_key_add+0xac9/0xcf0
[<ffffffff8896e41a>] ieee802154_add_llsec_key+0x5a/0x80
[<ffffffff8892adc6>] nl802154_add_llsec_key+0x426/0x5b0
[<ffffffff86ff293e>] genl_family_rcv_msg_doit+0x1fe/0x2f0
[<ffffffff86ff46d1>] genl_rcv_msg+0x531/0x7d0
[<ffffffff86fee7a9>] netlink_rcv_skb+0x169/0x440
[<ffffffff86ff1d88>] genl_rcv+0x28/0x40
[<ffffffff86fec15c>] netlink_unicast+0x53c/0x820
[<ffffffff86fecd8b>] netlink_sendmsg+0x93b/0xe60
[<ffffffff86b91b35>] ____sys_sendmsg+0xac5/0xca0
[<ffffffff86b9c3dd>] ___sys_sendmsg+0x11d/0x1c0
[<ffffffff86b9c65a>] __sys_sendmsg+0xfa/0x1d0
[<ffffffff88eadbf5>] do_syscall_64+0x45/0xf0
[<ffffffff890000ea>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
Handle the proper resource release in the RCU callback function
mac802154_llsec_key_del_rcu().
Note that if llsec_lookup_key() finds a key, it gets a refcount via
llsec_key_get() and locally copies key id from key_entry (which is a
list element). So it's safe to call llsec_key_put() and free the list
entry after the RCU grace period elapses.
Found by Linux Verification Center (linuxtesting.org). |
| In the Linux kernel, the following vulnerability has been resolved:
mm: swap: fix race between free_swap_and_cache() and swapoff()
There was previously a theoretical window where swapoff() could run and
teardown a swap_info_struct while a call to free_swap_and_cache() was
running in another thread. This could cause, amongst other bad
possibilities, swap_page_trans_huge_swapped() (called by
free_swap_and_cache()) to access the freed memory for swap_map.
This is a theoretical problem and I haven't been able to provoke it from a
test case. But there has been agreement based on code review that this is
possible (see link below).
Fix it by using get_swap_device()/put_swap_device(), which will stall
swapoff(). There was an extra check in _swap_info_get() to confirm that
the swap entry was not free. This isn't present in get_swap_device()
because it doesn't make sense in general due to the race between getting
the reference and swapoff. So I've added an equivalent check directly in
free_swap_and_cache().
Details of how to provoke one possible issue (thanks to David Hildenbrand
for deriving this):
--8<-----
__swap_entry_free() might be the last user and result in
"count == SWAP_HAS_CACHE".
swapoff->try_to_unuse() will stop as soon as soon as si->inuse_pages==0.
So the question is: could someone reclaim the folio and turn
si->inuse_pages==0, before we completed swap_page_trans_huge_swapped().
Imagine the following: 2 MiB folio in the swapcache. Only 2 subpages are
still references by swap entries.
Process 1 still references subpage 0 via swap entry.
Process 2 still references subpage 1 via swap entry.
Process 1 quits. Calls free_swap_and_cache().
-> count == SWAP_HAS_CACHE
[then, preempted in the hypervisor etc.]
Process 2 quits. Calls free_swap_and_cache().
-> count == SWAP_HAS_CACHE
Process 2 goes ahead, passes swap_page_trans_huge_swapped(), and calls
__try_to_reclaim_swap().
__try_to_reclaim_swap()->folio_free_swap()->delete_from_swap_cache()->
put_swap_folio()->free_swap_slot()->swapcache_free_entries()->
swap_entry_free()->swap_range_free()->
...
WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
What stops swapoff to succeed after process 2 reclaimed the swap cache
but before process1 finished its call to swap_page_trans_huge_swapped()?
--8<----- |
| In the Linux kernel, the following vulnerability has been resolved:
drm/vmwgfx: Create debugfs ttm_resource_manager entry only if needed
The driver creates /sys/kernel/debug/dri/0/mob_ttm even when the
corresponding ttm_resource_manager is not allocated.
This leads to a crash when trying to read from this file.
Add a check to create mob_ttm, system_mob_ttm, and gmr_ttm debug file
only when the corresponding ttm_resource_manager is allocated.
crash> bt
PID: 3133409 TASK: ffff8fe4834a5000 CPU: 3 COMMAND: "grep"
#0 [ffffb954506b3b20] machine_kexec at ffffffffb2a6bec3
#1 [ffffb954506b3b78] __crash_kexec at ffffffffb2bb598a
#2 [ffffb954506b3c38] crash_kexec at ffffffffb2bb68c1
#3 [ffffb954506b3c50] oops_end at ffffffffb2a2a9b1
#4 [ffffb954506b3c70] no_context at ffffffffb2a7e913
#5 [ffffb954506b3cc8] __bad_area_nosemaphore at ffffffffb2a7ec8c
#6 [ffffb954506b3d10] do_page_fault at ffffffffb2a7f887
#7 [ffffb954506b3d40] page_fault at ffffffffb360116e
[exception RIP: ttm_resource_manager_debug+0x11]
RIP: ffffffffc04afd11 RSP: ffffb954506b3df0 RFLAGS: 00010246
RAX: ffff8fe41a6d1200 RBX: 0000000000000000 RCX: 0000000000000940
RDX: 0000000000000000 RSI: ffffffffc04b4338 RDI: 0000000000000000
RBP: ffffb954506b3e08 R8: ffff8fee3ffad000 R9: 0000000000000000
R10: ffff8fe41a76a000 R11: 0000000000000001 R12: 00000000ffffffff
R13: 0000000000000001 R14: ffff8fe5bb6f3900 R15: ffff8fe41a6d1200
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ffffb954506b3e00] ttm_resource_manager_show at ffffffffc04afde7 [ttm]
#9 [ffffb954506b3e30] seq_read at ffffffffb2d8f9f3
RIP: 00007f4c4eda8985 RSP: 00007ffdbba9e9f8 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 000000000037e000 RCX: 00007f4c4eda8985
RDX: 000000000037e000 RSI: 00007f4c41573000 RDI: 0000000000000003
RBP: 000000000037e000 R8: 0000000000000000 R9: 000000000037fe30
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4c41573000
R13: 0000000000000003 R14: 00007f4c41572010 R15: 0000000000000003
ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b |
| In the Linux kernel, the following vulnerability has been resolved:
drm/i915/vma: Fix UAF on destroy against retire race
Object debugging tools were sporadically reporting illegal attempts to
free a still active i915 VMA object when parking a GT believed to be idle.
[161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915]
[161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0
...
[161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1
[161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
[161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915]
[161.360592] RIP: 0010:debug_print_object+0x80/0xb0
...
[161.361347] debug_object_free+0xeb/0x110
[161.361362] i915_active_fini+0x14/0x130 [i915]
[161.361866] release_references+0xfe/0x1f0 [i915]
[161.362543] i915_vma_parked+0x1db/0x380 [i915]
[161.363129] __gt_park+0x121/0x230 [i915]
[161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915]
That has been tracked down to be happening when another thread is
deactivating the VMA inside __active_retire() helper, after the VMA's
active counter has been already decremented to 0, but before deactivation
of the VMA's object is reported to the object debugging tool.
We could prevent from that race by serializing i915_active_fini() with
__active_retire() via ref->tree_lock, but that wouldn't stop the VMA from
being used, e.g. from __i915_vma_retire() called at the end of
__active_retire(), after that VMA has been already freed by a concurrent
i915_vma_destroy() on return from the i915_active_fini(). Then, we should
rather fix the issue at the VMA level, not in i915_active.
Since __i915_vma_parked() is called from __gt_park() on last put of the
GT's wakeref, the issue could be addressed by holding the GT wakeref long
enough for __active_retire() to complete before that wakeref is released
and the GT parked.
I believe the issue was introduced by commit d93939730347 ("drm/i915:
Remove the vma refcount") which moved a call to i915_active_fini() from
a dropped i915_vma_release(), called on last put of the removed VMA kref,
to i915_vma_parked() processing path called on last put of a GT wakeref.
However, its visibility to the object debugging tool was suppressed by a
bug in i915_active that was fixed two weeks later with commit e92eb246feb9
("drm/i915/active: Fix missing debug object activation").
A VMA associated with a request doesn't acquire a GT wakeref by itself.
Instead, it depends on a wakeref held directly by the request's active
intel_context for a GT associated with its VM, and indirectly on that
intel_context's engine wakeref if the engine belongs to the same GT as the
VMA's VM. Those wakerefs are released asynchronously to VMA deactivation.
Fix the issue by getting a wakeref for the VMA's GT when activating it,
and putting that wakeref only after the VMA is deactivated. However,
exclude global GTT from that processing path, otherwise the GPU never goes
idle. Since __i915_vma_retire() may be called from atomic contexts, use
async variant of wakeref put. Also, to avoid circular locking dependency,
take care of acquiring the wakeref before VM mutex when both are needed.
v7: Add inline comments with justifications for:
- using untracked variants of intel_gt_pm_get/put() (Nirmoy),
- using async variant of _put(),
- not getting the wakeref in case of a global GTT,
- always getting the first wakeref outside vm->mutex.
v6: Since __i915_vma_active/retire() callbacks are not serialized, storing
a wakeref tracking handle inside struct i915_vma is not safe, and
there is no other good place for that. Use untracked variants of
intel_gt_pm_get/put_async().
v5: Replace "tile" with "GT" across commit description (Rodrigo),
-
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
do_sys_name_to_handle(): use kzalloc() to fix kernel-infoleak
syzbot identified a kernel information leak vulnerability in
do_sys_name_to_handle() and issued the following report [1].
[1]
"BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
BUG: KMSAN: kernel-infoleak in _copy_to_user+0xbc/0x100 lib/usercopy.c:40
instrument_copy_to_user include/linux/instrumented.h:114 [inline]
_copy_to_user+0xbc/0x100 lib/usercopy.c:40
copy_to_user include/linux/uaccess.h:191 [inline]
do_sys_name_to_handle fs/fhandle.c:73 [inline]
__do_sys_name_to_handle_at fs/fhandle.c:112 [inline]
__se_sys_name_to_handle_at+0x949/0xb10 fs/fhandle.c:94
__x64_sys_name_to_handle_at+0xe4/0x140 fs/fhandle.c:94
...
Uninit was created at:
slab_post_alloc_hook+0x129/0xa70 mm/slab.h:768
slab_alloc_node mm/slub.c:3478 [inline]
__kmem_cache_alloc_node+0x5c9/0x970 mm/slub.c:3517
__do_kmalloc_node mm/slab_common.c:1006 [inline]
__kmalloc+0x121/0x3c0 mm/slab_common.c:1020
kmalloc include/linux/slab.h:604 [inline]
do_sys_name_to_handle fs/fhandle.c:39 [inline]
__do_sys_name_to_handle_at fs/fhandle.c:112 [inline]
__se_sys_name_to_handle_at+0x441/0xb10 fs/fhandle.c:94
__x64_sys_name_to_handle_at+0xe4/0x140 fs/fhandle.c:94
...
Bytes 18-19 of 20 are uninitialized
Memory access of size 20 starts at ffff888128a46380
Data copied to user address 0000000020000240"
Per Chuck Lever's suggestion, use kzalloc() instead of kmalloc() to
solve the problem. |
| In the Linux kernel, the following vulnerability has been resolved:
ACPI: processor_idle: Fix memory leak in acpi_processor_power_exit()
After unregistering the CPU idle device, the memory associated with
it is not freed, leading to a memory leak:
unreferenced object 0xffff896282f6c000 (size 1024):
comm "swapper/0", pid 1, jiffies 4294893170
hex dump (first 32 bytes):
00 00 00 00 0b 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 8836a742):
[<ffffffff993495ed>] kmalloc_trace+0x29d/0x340
[<ffffffff9972f3b3>] acpi_processor_power_init+0xf3/0x1c0
[<ffffffff9972d263>] __acpi_processor_start+0xd3/0xf0
[<ffffffff9972d2bc>] acpi_processor_start+0x2c/0x50
[<ffffffff99805872>] really_probe+0xe2/0x480
[<ffffffff99805c98>] __driver_probe_device+0x78/0x160
[<ffffffff99805daf>] driver_probe_device+0x1f/0x90
[<ffffffff9980601e>] __driver_attach+0xce/0x1c0
[<ffffffff99803170>] bus_for_each_dev+0x70/0xc0
[<ffffffff99804822>] bus_add_driver+0x112/0x210
[<ffffffff99807245>] driver_register+0x55/0x100
[<ffffffff9aee4acb>] acpi_processor_driver_init+0x3b/0xc0
[<ffffffff990012d1>] do_one_initcall+0x41/0x300
[<ffffffff9ae7c4b0>] kernel_init_freeable+0x320/0x470
[<ffffffff99b231f6>] kernel_init+0x16/0x1b0
[<ffffffff99042e6d>] ret_from_fork+0x2d/0x50
Fix this by freeing the CPU idle device after unregistering it. |
| In the Linux kernel, the following vulnerability has been resolved:
RDMA/srpt: Do not register event handler until srpt device is fully setup
Upon rare occasions, KASAN reports a use-after-free Write
in srpt_refresh_port().
This seems to be because an event handler is registered before the
srpt device is fully setup and a race condition upon error may leave a
partially setup event handler in place.
Instead, only register the event handler after srpt device initialization
is complete. |
| In the Linux kernel, the following vulnerability has been resolved:
NFSv4.2: fix nfs4_listxattr kernel BUG at mm/usercopy.c:102
A call to listxattr() with a buffer size = 0 returns the actual
size of the buffer needed for a subsequent call. When size > 0,
nfs4_listxattr() does not return an error because either
generic_listxattr() or nfs4_listxattr_nfs4_label() consumes
exactly all the bytes then size is 0 when calling
nfs4_listxattr_nfs4_user() which then triggers the following
kernel BUG:
[ 99.403778] kernel BUG at mm/usercopy.c:102!
[ 99.404063] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[ 99.408463] CPU: 0 PID: 3310 Comm: python3 Not tainted 6.6.0-61.fc40.aarch64 #1
[ 99.415827] Call trace:
[ 99.415985] usercopy_abort+0x70/0xa0
[ 99.416227] __check_heap_object+0x134/0x158
[ 99.416505] check_heap_object+0x150/0x188
[ 99.416696] __check_object_size.part.0+0x78/0x168
[ 99.416886] __check_object_size+0x28/0x40
[ 99.417078] listxattr+0x8c/0x120
[ 99.417252] path_listxattr+0x78/0xe0
[ 99.417476] __arm64_sys_listxattr+0x28/0x40
[ 99.417723] invoke_syscall+0x78/0x100
[ 99.417929] el0_svc_common.constprop.0+0x48/0xf0
[ 99.418186] do_el0_svc+0x24/0x38
[ 99.418376] el0_svc+0x3c/0x110
[ 99.418554] el0t_64_sync_handler+0x120/0x130
[ 99.418788] el0t_64_sync+0x194/0x198
[ 99.418994] Code: aa0003e3 d000a3e0 91310000 97f49bdb (d4210000)
Issue is reproduced when generic_listxattr() returns 'system.nfs4_acl',
thus calling lisxattr() with size = 16 will trigger the bug.
Add check on nfs4_listxattr() to return ERANGE error when it is
called with size > 0 and the return value is greater than size. |
| In the Linux kernel, the following vulnerability has been resolved:
net: ice: Fix potential NULL pointer dereference in ice_bridge_setlink()
The function ice_bridge_setlink() may encounter a NULL pointer dereference
if nlmsg_find_attr() returns NULL and br_spec is dereferenced subsequently
in nla_for_each_nested(). To address this issue, add a check to ensure that
br_spec is not NULL before proceeding with the nested attribute iteration. |
| In the Linux kernel, the following vulnerability has been resolved:
igc: avoid returning frame twice in XDP_REDIRECT
When a frame can not be transmitted in XDP_REDIRECT
(e.g. due to a full queue), it is necessary to free
it by calling xdp_return_frame_rx_napi.
However, this is the responsibility of the caller of
the ndo_xdp_xmit (see for example bq_xmit_all in
kernel/bpf/devmap.c) and thus calling it inside
igc_xdp_xmit (which is the ndo_xdp_xmit of the igc
driver) as well will lead to memory corruption.
In fact, bq_xmit_all expects that it can return all
frames after the last successfully transmitted one.
Therefore, break for the first not transmitted frame,
but do not call xdp_return_frame_rx_napi in igc_xdp_xmit.
This is equally implemented in other Intel drivers
such as the igb.
There are two alternatives to this that were rejected:
1. Return num_frames as all the frames would have been
transmitted and release them inside igc_xdp_xmit.
While it might work technically, it is not what
the return value is meant to represent (i.e. the
number of SUCCESSFULLY transmitted packets).
2. Rework kernel/bpf/devmap.c and all drivers to
support non-consecutively dropped packets.
Besides being complex, it likely has a negative
performance impact without a significant gain
since it is anyway unlikely that the next frame
can be transmitted if the previous one was dropped.
The memory corruption can be reproduced with
the following script which leads to a kernel panic
after a few seconds. It basically generates more
traffic than a i225 NIC can transmit and pushes it
via XDP_REDIRECT from a virtual interface to the
physical interface where frames get dropped.
#!/bin/bash
INTERFACE=enp4s0
INTERFACE_IDX=`cat /sys/class/net/$INTERFACE/ifindex`
sudo ip link add dev veth1 type veth peer name veth2
sudo ip link set up $INTERFACE
sudo ip link set up veth1
sudo ip link set up veth2
cat << EOF > redirect.bpf.c
SEC("prog")
int redirect(struct xdp_md *ctx)
{
return bpf_redirect($INTERFACE_IDX, 0);
}
char _license[] SEC("license") = "GPL";
EOF
clang -O2 -g -Wall -target bpf -c redirect.bpf.c -o redirect.bpf.o
sudo ip link set veth2 xdp obj redirect.bpf.o
cat << EOF > pass.bpf.c
SEC("prog")
int pass(struct xdp_md *ctx)
{
return XDP_PASS;
}
char _license[] SEC("license") = "GPL";
EOF
clang -O2 -g -Wall -target bpf -c pass.bpf.c -o pass.bpf.o
sudo ip link set $INTERFACE xdp obj pass.bpf.o
cat << EOF > trafgen.cfg
{
/* Ethernet Header */
0xe8, 0x6a, 0x64, 0x41, 0xbf, 0x46,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
const16(ETH_P_IP),
/* IPv4 Header */
0b01000101, 0, # IPv4 version, IHL, TOS
const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header))
const16(2), # IPv4 ident
0b01000000, 0, # IPv4 flags, fragmentation off
64, # IPv4 TTL
17, # Protocol UDP
csumip(14, 33), # IPv4 checksum
/* UDP Header */
10, 0, 1, 1, # IP Src - adapt as needed
10, 0, 1, 2, # IP Dest - adapt as needed
const16(6666), # UDP Src Port
const16(6666), # UDP Dest Port
const16(1008), # UDP length (UDP header 8 bytes + payload length)
csumudp(14, 34), # UDP checksum
/* Payload */
fill('W', 1000),
}
EOF
sudo trafgen -i trafgen.cfg -b3000MB -o veth1 --cpp |
| In the Linux kernel, the following vulnerability has been resolved:
net/ipv6: avoid possible UAF in ip6_route_mpath_notify()
syzbot found another use-after-free in ip6_route_mpath_notify() [1]
Commit f7225172f25a ("net/ipv6: prevent use after free in
ip6_route_mpath_notify") was not able to fix the root cause.
We need to defer the fib6_info_release() calls after
ip6_route_mpath_notify(), in the cleanup phase.
[1]
BUG: KASAN: slab-use-after-free in rt6_fill_node+0x1460/0x1ac0
Read of size 4 at addr ffff88809a07fc64 by task syz-executor.2/23037
CPU: 0 PID: 23037 Comm: syz-executor.2 Not tainted 6.8.0-rc4-syzkaller-01035-gea7f3cfaa588 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x167/0x540 mm/kasan/report.c:488
kasan_report+0x142/0x180 mm/kasan/report.c:601
rt6_fill_node+0x1460/0x1ac0
inet6_rt_notify+0x13b/0x290 net/ipv6/route.c:6184
ip6_route_mpath_notify net/ipv6/route.c:5198 [inline]
ip6_route_multipath_add net/ipv6/route.c:5404 [inline]
inet6_rtm_newroute+0x1d0f/0x2300 net/ipv6/route.c:5517
rtnetlink_rcv_msg+0x885/0x1040 net/core/rtnetlink.c:6597
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543
netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367
netlink_sendmsg+0xa3b/0xd70 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
RIP: 0033:0x7f73dd87dda9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f73de6550c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f73dd9ac050 RCX: 00007f73dd87dda9
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000005
RBP: 00007f73dd8ca47a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007f73dd9ac050 R15: 00007ffdbdeb7858
</TASK>
Allocated by task 23037:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:372 [inline]
__kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:389
kasan_kmalloc include/linux/kasan.h:211 [inline]
__do_kmalloc_node mm/slub.c:3981 [inline]
__kmalloc+0x22e/0x490 mm/slub.c:3994
kmalloc include/linux/slab.h:594 [inline]
kzalloc include/linux/slab.h:711 [inline]
fib6_info_alloc+0x2e/0xf0 net/ipv6/ip6_fib.c:155
ip6_route_info_create+0x445/0x12b0 net/ipv6/route.c:3758
ip6_route_multipath_add net/ipv6/route.c:5298 [inline]
inet6_rtm_newroute+0x744/0x2300 net/ipv6/route.c:5517
rtnetlink_rcv_msg+0x885/0x1040 net/core/rtnetlink.c:6597
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543
netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367
netlink_sendmsg+0xa3b/0xd70 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Freed by task 16:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x4e/0x60 mm/kasan/generic.c:640
poison_slab_object+0xa6/0xe0 m
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_conntrack_h323: Add protection for bmp length out of range
UBSAN load reports an exception of BRK#5515 SHIFT_ISSUE:Bitwise shifts
that are out of bounds for their data type.
vmlinux get_bitmap(b=75) + 712
<net/netfilter/nf_conntrack_h323_asn1.c:0>
vmlinux decode_seq(bs=0xFFFFFFD008037000, f=0xFFFFFFD008037018, level=134443100) + 1956
<net/netfilter/nf_conntrack_h323_asn1.c:592>
vmlinux decode_choice(base=0xFFFFFFD0080370F0, level=23843636) + 1216
<net/netfilter/nf_conntrack_h323_asn1.c:814>
vmlinux decode_seq(f=0xFFFFFFD0080371A8, level=134443500) + 812
<net/netfilter/nf_conntrack_h323_asn1.c:576>
vmlinux decode_choice(base=0xFFFFFFD008037280, level=0) + 1216
<net/netfilter/nf_conntrack_h323_asn1.c:814>
vmlinux DecodeRasMessage() + 304
<net/netfilter/nf_conntrack_h323_asn1.c:833>
vmlinux ras_help() + 684
<net/netfilter/nf_conntrack_h323_main.c:1728>
vmlinux nf_confirm() + 188
<net/netfilter/nf_conntrack_proto.c:137>
Due to abnormal data in skb->data, the extension bitmap length
exceeds 32 when decoding ras message then uses the length to make
a shift operation. It will change into negative after several loop.
UBSAN load could detect a negative shift as an undefined behaviour
and reports exception.
So we add the protection to avoid the length exceeding 32. Or else
it will return out of range error and stop decoding. |
| In the Linux kernel, the following vulnerability has been resolved:
cachefiles: fix memory leak in cachefiles_add_cache()
The following memory leak was reported after unbinding /dev/cachefiles:
==================================================================
unreferenced object 0xffff9b674176e3c0 (size 192):
comm "cachefilesd2", pid 680, jiffies 4294881224
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc ea38a44b):
[<ffffffff8eb8a1a5>] kmem_cache_alloc+0x2d5/0x370
[<ffffffff8e917f86>] prepare_creds+0x26/0x2e0
[<ffffffffc002eeef>] cachefiles_determine_cache_security+0x1f/0x120
[<ffffffffc00243ec>] cachefiles_add_cache+0x13c/0x3a0
[<ffffffffc0025216>] cachefiles_daemon_write+0x146/0x1c0
[<ffffffff8ebc4a3b>] vfs_write+0xcb/0x520
[<ffffffff8ebc5069>] ksys_write+0x69/0xf0
[<ffffffff8f6d4662>] do_syscall_64+0x72/0x140
[<ffffffff8f8000aa>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
==================================================================
Put the reference count of cache_cred in cachefiles_daemon_unbind() to
fix the problem. And also put cache_cred in cachefiles_add_cache() error
branch to avoid memory leaks. |
| In the Linux kernel, the following vulnerability has been resolved:
net: bridge: switchdev: Skip MDB replays of deferred events on offload
Before this change, generation of the list of MDB events to replay
would race against the creation of new group memberships, either from
the IGMP/MLD snooping logic or from user configuration.
While new memberships are immediately visible to walkers of
br->mdb_list, the notification of their existence to switchdev event
subscribers is deferred until a later point in time. So if a replay
list was generated during a time that overlapped with such a window,
it would also contain a replay of the not-yet-delivered event.
The driver would thus receive two copies of what the bridge internally
considered to be one single event. On destruction of the bridge, only
a single membership deletion event was therefore sent. As a
consequence of this, drivers which reference count memberships (at
least DSA), would be left with orphan groups in their hardware
database when the bridge was destroyed.
This is only an issue when replaying additions. While deletion events
may still be pending on the deferred queue, they will already have
been removed from br->mdb_list, so no duplicates can be generated in
that scenario.
To a user this meant that old group memberships, from a bridge in
which a port was previously attached, could be reanimated (in
hardware) when the port joined a new bridge, without the new bridge's
knowledge.
For example, on an mv88e6xxx system, create a snooping bridge and
immediately add a port to it:
root@infix-06-0b-00:~$ ip link add dev br0 up type bridge mcast_snooping 1 && \
> ip link set dev x3 up master br0
And then destroy the bridge:
root@infix-06-0b-00:~$ ip link del dev br0
root@infix-06-0b-00:~$ mvls atu
ADDRESS FID STATE Q F 0 1 2 3 4 5 6 7 8 9 a
DEV:0 Marvell 88E6393X
33:33:00:00:00:6a 1 static - - 0 . . . . . . . . . .
33:33:ff:87:e4:3f 1 static - - 0 . . . . . . . . . .
ff:ff:ff:ff:ff:ff 1 static - - 0 1 2 3 4 5 6 7 8 9 a
root@infix-06-0b-00:~$
The two IPv6 groups remain in the hardware database because the
port (x3) is notified of the host's membership twice: once via the
original event and once via a replay. Since only a single delete
notification is sent, the count remains at 1 when the bridge is
destroyed.
Then add the same port (or another port belonging to the same hardware
domain) to a new bridge, this time with snooping disabled:
root@infix-06-0b-00:~$ ip link add dev br1 up type bridge mcast_snooping 0 && \
> ip link set dev x3 up master br1
All multicast, including the two IPv6 groups from br0, should now be
flooded, according to the policy of br1. But instead the old
memberships are still active in the hardware database, causing the
switch to only forward traffic to those groups towards the CPU (port
0).
Eliminate the race in two steps:
1. Grab the write-side lock of the MDB while generating the replay
list.
This prevents new memberships from showing up while we are generating
the replay list. But it leaves the scenario in which a deferred event
was already generated, but not delivered, before we grabbed the
lock. Therefore:
2. Make sure that no deferred version of a replay event is already
enqueued to the switchdev deferred queue, before adding it to the
replay list, when replaying additions. |