| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
mm/page_table_check: fix crash on ZONE_DEVICE
Not all pages may apply to pgtable check. One example is ZONE_DEVICE
pages: they map PFNs directly, and they don't allocate page_ext at all
even if there's struct page around. One may reference
devm_memremap_pages().
When both ZONE_DEVICE and page-table-check enabled, then try to map some
dax memories, one can trigger kernel bug constantly now when the kernel
was trying to inject some pfn maps on the dax device:
kernel BUG at mm/page_table_check.c:55!
While it's pretty legal to use set_pxx_at() for ZONE_DEVICE pages for page
fault resolutions, skip all the checks if page_ext doesn't even exist in
pgtable checker, which applies to ZONE_DEVICE but maybe more. |
| In the Linux kernel, the following vulnerability has been resolved:
ima: Avoid blocking in RCU read-side critical section
A panic happens in ima_match_policy:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
PGD 42f873067 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 5 PID: 1286325 Comm: kubeletmonit.sh
Kdump: loaded Tainted: P
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 0.0.0 02/06/2015
RIP: 0010:ima_match_policy+0x84/0x450
Code: 49 89 fc 41 89 cf 31 ed 89 44 24 14 eb 1c 44 39
7b 18 74 26 41 83 ff 05 74 20 48 8b 1b 48 3b 1d
f2 b9 f4 00 0f 84 9c 01 00 00 <44> 85 73 10 74 ea
44 8b 6b 14 41 f6 c5 01 75 d4 41 f6 c5 02 74 0f
RSP: 0018:ff71570009e07a80 EFLAGS: 00010207
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000200
RDX: ffffffffad8dc7c0 RSI: 0000000024924925 RDI: ff3e27850dea2000
RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffffabfce739
R10: ff3e27810cc42400 R11: 0000000000000000 R12: ff3e2781825ef970
R13: 00000000ff3e2785 R14: 000000000000000c R15: 0000000000000001
FS: 00007f5195b51740(0000)
GS:ff3e278b12d40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000000626d24002 CR4: 0000000000361ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
ima_get_action+0x22/0x30
process_measurement+0xb0/0x830
? page_add_file_rmap+0x15/0x170
? alloc_set_pte+0x269/0x4c0
? prep_new_page+0x81/0x140
? simple_xattr_get+0x75/0xa0
? selinux_file_open+0x9d/0xf0
ima_file_check+0x64/0x90
path_openat+0x571/0x1720
do_filp_open+0x9b/0x110
? page_counter_try_charge+0x57/0xc0
? files_cgroup_alloc_fd+0x38/0x60
? __alloc_fd+0xd4/0x250
? do_sys_open+0x1bd/0x250
do_sys_open+0x1bd/0x250
do_syscall_64+0x5d/0x1d0
entry_SYSCALL_64_after_hwframe+0x65/0xca
Commit c7423dbdbc9e ("ima: Handle -ESTALE returned by
ima_filter_rule_match()") introduced call to ima_lsm_copy_rule within a
RCU read-side critical section which contains kmalloc with GFP_KERNEL.
This implies a possible sleep and violates limitations of RCU read-side
critical sections on non-PREEMPT systems.
Sleeping within RCU read-side critical section might cause
synchronize_rcu() returning early and break RCU protection, allowing a
UAF to happen.
The root cause of this issue could be described as follows:
| Thread A | Thread B |
| |ima_match_policy |
| | rcu_read_lock |
|ima_lsm_update_rule | |
| synchronize_rcu | |
| | kmalloc(GFP_KERNEL)|
| | sleep |
==> synchronize_rcu returns early
| kfree(entry) | |
| | entry = entry->next|
==> UAF happens and entry now becomes NULL (or could be anything).
| | entry->action |
==> Accessing entry might cause panic.
To fix this issue, we are converting all kmalloc that is called within
RCU read-side critical section to use GFP_ATOMIC.
[PM: fixed missing comment, long lines, !CONFIG_IMA_LSM_RULES case] |
| In the Linux kernel, the following vulnerability has been resolved:
ocfs2: fix races between hole punching and AIO+DIO
After commit "ocfs2: return real error code in ocfs2_dio_wr_get_block",
fstests/generic/300 become from always failed to sometimes failed:
========================================================================
[ 473.293420 ] run fstests generic/300
[ 475.296983 ] JBD2: Ignoring recovery information on journal
[ 475.302473 ] ocfs2: Mounting device (253,1) on (node local, slot 0) with ordered data mode.
[ 494.290998 ] OCFS2: ERROR (device dm-1): ocfs2_change_extent_flag: Owner 5668 has an extent at cpos 78723 which can no longer be found
[ 494.291609 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted.
[ 494.292018 ] OCFS2: File system is now read-only.
[ 494.292224 ] (kworker/19:11,2628,19):ocfs2_mark_extent_written:5272 ERROR: status = -30
[ 494.292602 ] (kworker/19:11,2628,19):ocfs2_dio_end_io_write:2374 ERROR: status = -3
fio: io_u error on file /mnt/scratch/racer: Read-only file system: write offset=460849152, buflen=131072
=========================================================================
In __blockdev_direct_IO, ocfs2_dio_wr_get_block is called to add unwritten
extents to a list. extents are also inserted into extent tree in
ocfs2_write_begin_nolock. Then another thread call fallocate to puch a
hole at one of the unwritten extent. The extent at cpos was removed by
ocfs2_remove_extent(). At end io worker thread, ocfs2_search_extent_list
found there is no such extent at the cpos.
T1 T2 T3
inode lock
...
insert extents
...
inode unlock
ocfs2_fallocate
__ocfs2_change_file_space
inode lock
lock ip_alloc_sem
ocfs2_remove_inode_range inode
ocfs2_remove_btree_range
ocfs2_remove_extent
^---remove the extent at cpos 78723
...
unlock ip_alloc_sem
inode unlock
ocfs2_dio_end_io
ocfs2_dio_end_io_write
lock ip_alloc_sem
ocfs2_mark_extent_written
ocfs2_change_extent_flag
ocfs2_search_extent_list
^---failed to find extent
...
unlock ip_alloc_sem
In most filesystems, fallocate is not compatible with racing with AIO+DIO,
so fix it by adding to wait for all dio before fallocate/punch_hole like
ext4. |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects
The hwmp code use objects of type mesh_preq_queue, added to a list in
ieee80211_if_mesh, to keep track of mpath we need to resolve. If the mpath
gets deleted, ex mesh interface is removed, the entries in that list will
never get cleaned. Fix this by flushing all corresponding items of the
preq_queue in mesh_path_flush_pending().
This should take care of KASAN reports like this:
unreferenced object 0xffff00000668d800 (size 128):
comm "kworker/u8:4", pid 67, jiffies 4295419552 (age 1836.444s)
hex dump (first 32 bytes):
00 1f 05 09 00 00 ff ff 00 d5 68 06 00 00 ff ff ..........h.....
8e 97 ea eb 3e b8 01 00 00 00 00 00 00 00 00 00 ....>...........
backtrace:
[<000000007302a0b6>] __kmem_cache_alloc_node+0x1e0/0x35c
[<00000000049bd418>] kmalloc_trace+0x34/0x80
[<0000000000d792bb>] mesh_queue_preq+0x44/0x2a8
[<00000000c99c3696>] mesh_nexthop_resolve+0x198/0x19c
[<00000000926bf598>] ieee80211_xmit+0x1d0/0x1f4
[<00000000fc8c2284>] __ieee80211_subif_start_xmit+0x30c/0x764
[<000000005926ee38>] ieee80211_subif_start_xmit+0x9c/0x7a4
[<000000004c86e916>] dev_hard_start_xmit+0x174/0x440
[<0000000023495647>] __dev_queue_xmit+0xe24/0x111c
[<00000000cfe9ca78>] batadv_send_skb_packet+0x180/0x1e4
[<000000007bacc5d5>] batadv_v_elp_periodic_work+0x2f4/0x508
[<00000000adc3cd94>] process_one_work+0x4b8/0xa1c
[<00000000b36425d1>] worker_thread+0x9c/0x634
[<0000000005852dd5>] kthread+0x1bc/0x1c4
[<000000005fccd770>] ret_from_fork+0x10/0x20
unreferenced object 0xffff000009051f00 (size 128):
comm "kworker/u8:4", pid 67, jiffies 4295419553 (age 1836.440s)
hex dump (first 32 bytes):
90 d6 92 0d 00 00 ff ff 00 d8 68 06 00 00 ff ff ..........h.....
36 27 92 e4 02 e0 01 00 00 58 79 06 00 00 ff ff 6'.......Xy.....
backtrace:
[<000000007302a0b6>] __kmem_cache_alloc_node+0x1e0/0x35c
[<00000000049bd418>] kmalloc_trace+0x34/0x80
[<0000000000d792bb>] mesh_queue_preq+0x44/0x2a8
[<00000000c99c3696>] mesh_nexthop_resolve+0x198/0x19c
[<00000000926bf598>] ieee80211_xmit+0x1d0/0x1f4
[<00000000fc8c2284>] __ieee80211_subif_start_xmit+0x30c/0x764
[<000000005926ee38>] ieee80211_subif_start_xmit+0x9c/0x7a4
[<000000004c86e916>] dev_hard_start_xmit+0x174/0x440
[<0000000023495647>] __dev_queue_xmit+0xe24/0x111c
[<00000000cfe9ca78>] batadv_send_skb_packet+0x180/0x1e4
[<000000007bacc5d5>] batadv_v_elp_periodic_work+0x2f4/0x508
[<00000000adc3cd94>] process_one_work+0x4b8/0xa1c
[<00000000b36425d1>] worker_thread+0x9c/0x634
[<0000000005852dd5>] kthread+0x1bc/0x1c4
[<000000005fccd770>] ret_from_fork+0x10/0x20 |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: iwlwifi: mvm: don't read past the mfuart notifcation
In case the firmware sends a notification that claims it has more data
than it has, we will read past that was allocated for the notification.
Remove the print of the buffer, we won't see it by default. If needed,
we can see the content with tracing.
This was reported by KFENCE. |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Fix tainted pointer delete is case of flow rules creation fail
In case of flow rule creation fail in mlx5_lag_create_port_sel_table(),
instead of previously created rules, the tainted pointer is deleted
deveral times.
Fix this bug by using correct flow rules pointers.
Found by Linux Verification Center (linuxtesting.org) with SVACE. |
| In the Linux kernel, the following vulnerability has been resolved:
net: wwan: iosm: Fix tainted pointer delete is case of region creation fail
In case of region creation fail in ipc_devlink_create_region(), previously
created regions delete process starts from tainted pointer which actually
holds error code value.
Fix this bug by decreasing region index before delete.
Found by Linux Verification Center (linuxtesting.org) with SVACE. |
| In the Linux kernel, the following vulnerability has been resolved:
landlock: Fix d_parent walk
The WARN_ON_ONCE() in collect_domain_accesses() can be triggered when
trying to link a root mount point. This cannot work in practice because
this directory is mounted, but the VFS check is done after the call to
security_path_link().
Do not use source directory's d_parent when the source directory is the
mount point.
[mic: Fix commit message] |
| In the Linux kernel, the following vulnerability has been resolved:
gve: Clear napi->skb before dev_kfree_skb_any()
gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it
is freed with dev_kfree_skb_any(). This can result in a subsequent call
to napi_get_frags returning a dangling pointer.
Fix this by clearing napi->skb before the skb is freed. |
| In the Linux kernel, the following vulnerability has been resolved:
cachefiles: flush all requests after setting CACHEFILES_DEAD
In ondemand mode, when the daemon is processing an open request, if the
kernel flags the cache as CACHEFILES_DEAD, the cachefiles_daemon_write()
will always return -EIO, so the daemon can't pass the copen to the kernel.
Then the kernel process that is waiting for the copen triggers a hung_task.
Since the DEAD state is irreversible, it can only be exited by closing
/dev/cachefiles. Therefore, after calling cachefiles_io_error() to mark
the cache as CACHEFILES_DEAD, if in ondemand mode, flush all requests to
avoid the above hungtask. We may still be able to read some of the cached
data before closing the fd of /dev/cachefiles.
Note that this relies on the patch that adds reference counting to the req,
otherwise it may UAF. |
| In the Linux kernel, the following vulnerability has been resolved:
HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode()
Fix a memory leak on logi_dj_recv_send_report() error path. |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: ensure snd_una is properly initialized on connect
This is strictly related to commit fb7a0d334894 ("mptcp: ensure snd_nxt
is properly initialized on connect"). It turns out that syzkaller can
trigger the retransmit after fallback and before processing any other
incoming packet - so that snd_una is still left uninitialized.
Address the issue explicitly initializing snd_una together with snd_nxt
and write_seq. |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: iwlwifi: mvm: check n_ssids before accessing the ssids
In some versions of cfg80211, the ssids poinet might be a valid one even
though n_ssids is 0. Accessing the pointer in this case will cuase an
out-of-bound access. Fix this by checking n_ssids first. |
| In the Linux kernel, the following vulnerability has been resolved:
xhci: Handle TD clearing for multiple streams case
When multiple streams are in use, multiple TDs might be in flight when
an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for
each, to ensure everything is reset properly and the caches cleared.
Change the logic so that any N>1 TDs found active for different streams
are deferred until after the first one is processed, calling
xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to
queue another command until we are done with all of them. Also change
the error/"should never happen" paths to ensure we at least clear any
affected TDs, even if we can't issue a command to clear the hardware
cache, and complain loudly with an xhci_warn() if this ever happens.
This problem case dates back to commit e9df17eb1408 ("USB: xhci: Correct
assumptions about number of rings per endpoint.") early on in the XHCI
driver's life, when stream support was first added.
It was then identified but not fixed nor made into a warning in commit
674f8438c121 ("xhci: split handling halted endpoints into two steps"),
which added a FIXME comment for the problem case (without materially
changing the behavior as far as I can tell, though the new logic made
the problem more obvious).
Then later, in commit 94f339147fc3 ("xhci: Fix failure to give back some
cached cancelled URBs."), it was acknowledged again.
[Mathias: commit 94f339147fc3 ("xhci: Fix failure to give back some cached
cancelled URBs.") was a targeted regression fix to the previously mentioned
patch. Users reported issues with usb stuck after unmounting/disconnecting
UAS devices. This rolled back the TD clearing of multiple streams to its
original state.]
Apparently the commit author was aware of the problem (yet still chose
to submit it): It was still mentioned as a FIXME, an xhci_dbg() was
added to log the problem condition, and the remaining issue was mentioned
in the commit description. The choice of making the log type xhci_dbg()
for what is, at this point, a completely unhandled and known broken
condition is puzzling and unfortunate, as it guarantees that no actual
users would see the log in production, thereby making it nigh
undebuggable (indeed, even if you turn on DEBUG, the message doesn't
really hint at there being a problem at all).
It took me *months* of random xHC crashes to finally find a reliable
repro and be able to do a deep dive debug session, which could all have
been avoided had this unhandled, broken condition been actually reported
with a warning, as it should have been as a bug intentionally left in
unfixed (never mind that it shouldn't have been left in at all).
> Another fix to solve clearing the caches of all stream rings with
> cancelled TDs is needed, but not as urgent.
3 years after that statement and 14 years after the original bug was
introduced, I think it's finally time to fix it. And maybe next time
let's not leave bugs unfixed (that are actually worse than the original
bug), and let's actually get people to review kernel commits please.
Fixes xHC crashes and IOMMU faults with UAS devices when handling
errors/faults. Easiest repro is to use `hdparm` to mark an early sector
(e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop.
At least in the case of JMicron controllers, the read errors end up
having to cancel two TDs (for two queued requests to different streams)
and the one that didn't get cleared properly ends up faulting the xHC
entirely when it tries to access DMA pages that have since been unmapped,
referred to by the stale TDs. This normally happens quickly (after two
or three loops). After this fix, I left the `cat` in a loop running
overnight and experienced no xHC failures, with all read errors
recovered properly. Repro'd and tested on an Apple M1 Mac Mini
(dwc3 host).
On systems without an IOMMU, this bug would instead silently corrupt
freed memory, making this a
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
drm/i915/dpt: Make DPT object unshrinkable
In some scenarios, the DPT object gets shrunk but
the actual framebuffer did not and thus its still
there on the DPT's vm->bound_list. Then it tries to
rewrite the PTEs via a stale CPU mapping. This causes panic.
[vsyrjala: Add TODO comment]
(cherry picked from commit 51064d471c53dcc8eddd2333c3f1c1d9131ba36c) |
| In the Linux kernel, the following vulnerability has been resolved:
net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state
Pass the already obtained vlan group pointer to br_mst_vlan_set_state()
instead of dereferencing it again. Each caller has already correctly
dereferenced it for their context. This change is required for the
following suspicious RCU dereference fix. No functional changes
intended. |
| In the Linux kernel, the following vulnerability has been resolved:
net: bridge: mst: fix suspicious rcu usage in br_mst_set_state
I converted br_mst_set_state to RCU to avoid a vlan use-after-free
but forgot to change the vlan group dereference helper. Switch to vlan
group RCU deref helper to fix the suspicious rcu usage warning. |
| In the Linux kernel, the following vulnerability has been resolved:
bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send()
In case of token is released due to token->state == BNXT_HWRM_DEFERRED,
released token (set to NULL) is used in log messages. This issue is
expected to be prevented by HWRM_ERR_CODE_PF_UNAVAILABLE error code. But
this error code is returned by recent firmware. So some firmware may not
return it. This may lead to NULL pointer dereference.
Adjust this issue by adding token pointer check.
Found by Linux Verification Center (linuxtesting.org) with SVACE. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found
When reading EDID fails and driver reports no modes available, the DRM
core adds an artificial 1024x786 mode to the connector. Unfortunately
some variants of the Exynos HDMI (like the one in Exynos4 SoCs) are not
able to drive such mode, so report a safe 640x480 mode instead of nothing
in case of the EDID reading failure.
This fixes the following issue observed on Trats2 board since commit
13d5b040363c ("drm/exynos: do not return negative values from .get_modes()"):
[drm] Exynos DRM: using 11c00000.fimd device for DMA mapping operations
exynos-drm exynos-drm: bound 11c00000.fimd (ops fimd_component_ops)
exynos-drm exynos-drm: bound 12c10000.mixer (ops mixer_component_ops)
exynos-dsi 11c80000.dsi: [drm:samsung_dsim_host_attach] Attached s6e8aa0 device (lanes:4 bpp:24 mode-flags:0x10b)
exynos-drm exynos-drm: bound 11c80000.dsi (ops exynos_dsi_component_ops)
exynos-drm exynos-drm: bound 12d00000.hdmi (ops hdmi_component_ops)
[drm] Initialized exynos 1.1.0 20180330 for exynos-drm on minor 1
exynos-hdmi 12d00000.hdmi: [drm:hdmiphy_enable.part.0] *ERROR* PLL could not reach steady state
panel-samsung-s6e8aa0 11c80000.dsi.0: ID: 0xa2, 0x20, 0x8c
exynos-mixer 12c10000.mixer: timeout waiting for VSYNC
------------[ cut here ]------------
WARNING: CPU: 1 PID: 11 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x2b0/0x2b8
[CRTC:70:crtc-1] vblank wait timed out
Modules linked in:
CPU: 1 PID: 11 Comm: kworker/u16:0 Not tainted 6.9.0-rc5-next-20240424 #14913
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events_unbound deferred_probe_work_func
Call trace:
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x68/0x88
dump_stack_lvl from __warn+0x7c/0x1c4
__warn from warn_slowpath_fmt+0x11c/0x1a8
warn_slowpath_fmt from drm_atomic_helper_wait_for_vblanks.part.0+0x2b0/0x2b8
drm_atomic_helper_wait_for_vblanks.part.0 from drm_atomic_helper_commit_tail_rpm+0x7c/0x8c
drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x184
commit_tail from drm_atomic_helper_commit+0x168/0x190
drm_atomic_helper_commit from drm_atomic_commit+0xb4/0xe0
drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x27c
drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1cc
drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
drm_client_modeset_commit from __drm_fb_helper_restore_fbdev_mode_unlocked+0x9c/0xc4
__drm_fb_helper_restore_fbdev_mode_unlocked from drm_fb_helper_set_par+0x2c/0x3c
drm_fb_helper_set_par from fbcon_init+0x3d8/0x550
fbcon_init from visual_init+0xc0/0x108
visual_init from do_bind_con_driver+0x1b8/0x3a4
do_bind_con_driver from do_take_over_console+0x140/0x1ec
do_take_over_console from do_fbcon_takeover+0x70/0xd0
do_fbcon_takeover from fbcon_fb_registered+0x19c/0x1ac
fbcon_fb_registered from register_framebuffer+0x190/0x21c
register_framebuffer from __drm_fb_helper_initial_config_and_unlock+0x350/0x574
__drm_fb_helper_initial_config_and_unlock from exynos_drm_fbdev_client_hotplug+0x6c/0xb0
exynos_drm_fbdev_client_hotplug from drm_client_register+0x58/0x94
drm_client_register from exynos_drm_bind+0x160/0x190
exynos_drm_bind from try_to_bring_up_aggregate_device+0x200/0x2d8
try_to_bring_up_aggregate_device from __component_add+0xb0/0x170
__component_add from mixer_probe+0x74/0xcc
mixer_probe from platform_probe+0x5c/0xb8
platform_probe from really_probe+0xe0/0x3d8
really_probe from __driver_probe_device+0x9c/0x1e4
__driver_probe_device from driver_probe_device+0x30/0xc0
driver_probe_device from __device_attach_driver+0xa8/0x120
__device_attach_driver from bus_for_each_drv+0x80/0xcc
bus_for_each_drv from __device_attach+0xac/0x1fc
__device_attach from bus_probe_device+0x8c/0x90
bus_probe_device from deferred_probe_work_func+0
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
__kernel_map_pages() is a debug function which clears the valid bit in page
table entry for deallocated pages to detect illegal memory accesses to
freed pages.
This function set/clear the valid bit using __set_memory(). __set_memory()
acquires init_mm's semaphore, and this operation may sleep. This is
problematic, because __kernel_map_pages() can be called in atomic context,
and thus is illegal to sleep. An example warning that this causes:
BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1578
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd
preempt_count: 2, expected: 0
CPU: 0 PID: 2 Comm: kthreadd Not tainted 6.9.0-g1d4c6d784ef6 #37
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
[<ffffffff800060dc>] dump_backtrace+0x1c/0x24
[<ffffffff8091ef6e>] show_stack+0x2c/0x38
[<ffffffff8092baf8>] dump_stack_lvl+0x5a/0x72
[<ffffffff8092bb24>] dump_stack+0x14/0x1c
[<ffffffff8003b7ac>] __might_resched+0x104/0x10e
[<ffffffff8003b7f4>] __might_sleep+0x3e/0x62
[<ffffffff8093276a>] down_write+0x20/0x72
[<ffffffff8000cf00>] __set_memory+0x82/0x2fa
[<ffffffff8000d324>] __kernel_map_pages+0x5a/0xd4
[<ffffffff80196cca>] __alloc_pages_bulk+0x3b2/0x43a
[<ffffffff8018ee82>] __vmalloc_node_range+0x196/0x6ba
[<ffffffff80011904>] copy_process+0x72c/0x17ec
[<ffffffff80012ab4>] kernel_clone+0x60/0x2fe
[<ffffffff80012f62>] kernel_thread+0x82/0xa0
[<ffffffff8003552c>] kthreadd+0x14a/0x1be
[<ffffffff809357de>] ret_from_fork+0xe/0x1c
Rewrite this function with apply_to_existing_page_range(). It is fine to
not have any locking, because __kernel_map_pages() works with pages being
allocated/deallocated and those pages are not changed by anyone else in the
meantime. |