| CVE | Vendors | Products | Updated | CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
veth: ensure skb entering GRO are not cloned.
After commit d3256efd8e8b ("veth: allow enabling NAPI even without XDP"),
if GRO is enabled on a veth device and TSO is disabled on the peer
device, TCP skbs will go through the NAPI callback. If there is no XDP
program attached, the veth code does not perform any share check, and
shared/cloned skbs could enter the GRO engine.
Ignat reported a BUG triggered later on due to the above condition:
[ 53.970529][ C1] kernel BUG at net/core/skbuff.c:3574!
[ 53.981755][ C1] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[ 53.982634][ C1] CPU: 1 PID: 19 Comm: ksoftirqd/1 Not tainted 5.16.0-rc5+ #25
[ 53.982634][ C1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 53.982634][ C1] RIP: 0010:skb_shift+0x13ef/0x23b0
[ 53.982634][ C1] Code: ea 03 0f b6 04 02 48 89 fa 83 e2 07 38 d0
7f 08 84 c0 0f 85 41 0c 00 00 41 80 7f 02 00 4d 8d b5 d0 00 00 00 0f
85 74 f5 ff ff <0f> 0b 4d 8d 77 20 be 04 00 00 00 4c 89 44 24 78 4c 89
f7 4c 89 8c
[ 53.982634][ C1] RSP: 0018:ffff8881008f7008 EFLAGS: 00010246
[ 53.982634][ C1] RAX: 0000000000000000 RBX: ffff8881180b4c80 RCX: 0000000000000000
[ 53.982634][ C1] RDX: 0000000000000002 RSI: ffff8881180b4d3c RDI: ffff88810bc9cac2
[ 53.982634][ C1] RBP: ffff8881008f70b8 R08: ffff8881180b4cf4 R09: ffff8881180b4cf0
[ 53.982634][ C1] R10: ffffed1022999e5c R11: 0000000000000002 R12: 0000000000000590
[ 53.982634][ C1] R13: ffff88810f940c80 R14: ffff88810f940d50 R15: ffff88810bc9cac0
[ 53.982634][ C1] FS: 0000000000000000(0000) GS:ffff888235880000(0000) knlGS:0000000000000000
[ 53.982634][ C1] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 53.982634][ C1] CR2: 00007ff5f9b86680 CR3: 0000000108ce8004 CR4: 0000000000170ee0
[ 53.982634][ C1] Call Trace:
[ 53.982634][ C1] <TASK>
[ 53.982634][ C1] tcp_sacktag_walk+0xaba/0x18e0
[ 53.982634][ C1] tcp_sacktag_write_queue+0xe7b/0x3460
[ 53.982634][ C1] tcp_ack+0x2666/0x54b0
[ 53.982634][ C1] tcp_rcv_established+0x4d9/0x20f0
[ 53.982634][ C1] tcp_v4_do_rcv+0x551/0x810
[ 53.982634][ C1] tcp_v4_rcv+0x22ed/0x2ed0
[ 53.982634][ C1] ip_protocol_deliver_rcu+0x96/0xaf0
[ 53.982634][ C1] ip_local_deliver_finish+0x1e0/0x2f0
[ 53.982634][ C1] ip_sublist_rcv_finish+0x211/0x440
[ 53.982634][ C1] ip_list_rcv_finish.constprop.0+0x424/0x660
[ 53.982634][ C1] ip_list_rcv+0x2c8/0x410
[ 53.982634][ C1] __netif_receive_skb_list_core+0x65c/0x910
[ 53.982634][ C1] netif_receive_skb_list_internal+0x5f9/0xcb0
[ 53.982634][ C1] napi_complete_done+0x188/0x6e0
[ 53.982634][ C1] gro_cell_poll+0x10c/0x1d0
[ 53.982634][ C1] __napi_poll+0xa1/0x530
[ 53.982634][ C1] net_rx_action+0x567/0x1270
[ 53.982634][ C1] __do_softirq+0x28a/0x9ba
[ 53.982634][ C1] run_ksoftirqd+0x32/0x60
[ 53.982634][ C1] smpboot_thread_fn+0x559/0x8c0
[ 53.982634][ C1] kthread+0x3b9/0x490
[ 53.982634][ C1] ret_from_fork+0x22/0x30
[ 53.982634][ C1] </TASK>
Address the issue by skipping the GRO stage for shared or cloned skbs.
To reduce the chance of out-of-order (OoO) delivery, try to unclone the
skbs before giving up.
v1 -> v2:
- avoid skb_copy and fall back to netif_receive_skb - Eric |
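The receive-path decision described in this entry can be sketched as a small Python model. This is not kernel code: the `Skb` class, its fields, and the `receive` function are hypothetical stand-ins for `sk_buff` state and the veth receive path, and the returned strings only label which path would be taken.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    """Toy stand-in for a kernel sk_buff (all fields are hypothetical)."""
    users: int = 1            # references held on the skb itself
    cloned: bool = False      # data buffer is shared with another skb
    can_unclone: bool = True  # whether a private copy can be obtained


def receive(skb: Skb) -> str:
    """Return the receive path the model picks for this skb."""
    if skb.users > 1:
        # A shared skb must never be handed to GRO, which modifies it.
        return "netif_receive_skb"
    if skb.cloned:
        # Try to unclone first so the packet can still be aggregated
        # and stays in order with its neighbours.
        if not skb.can_unclone:
            return "netif_receive_skb"
        skb.cloned = False
    return "gro"
```

In this model only private, uncloned skbs reach the `"gro"` path; everything else falls back to plain delivery.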
| In the Linux kernel, the following vulnerability has been resolved:
KVM: VMX: Always clear vmx->fail on emulation_required
Revert a relatively recent change that set vmx->fail if the vCPU is in L2
and emulation_required is true, as that behavior is completely bogus.
Setting vmx->fail and synthesizing a VM-Exit is contradictory and wrong:
(a) it's impossible to have both a VM-Fail and VM-Exit
(b) vmcs.EXIT_REASON is not modified on VM-Fail
(c) emulation_required refers to guest state and guest state checks are
always VM-Exits, not VM-Fails.
For KVM specifically, emulation_required is handled before nested exits
in __vmx_handle_exit(), thus setting vmx->fail has no immediate effect,
i.e. KVM calls into handle_invalid_guest_state() and vmx->fail is ignored.
Setting vmx->fail can ultimately result in a WARN in nested_vmx_vmexit()
firing when tearing down the VM, as KVM never expects vmx->fail to be set
when L2 is active; KVM always reflects those errors into L1.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 21158 at arch/x86/kvm/vmx/nested.c:4548
nested_vmx_vmexit+0x16bd/0x17e0
arch/x86/kvm/vmx/nested.c:4547
Modules linked in:
CPU: 0 PID: 21158 Comm: syz-executor.1 Not tainted 5.16.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:nested_vmx_vmexit+0x16bd/0x17e0 arch/x86/kvm/vmx/nested.c:4547
Code: <0f> 0b e9 2e f8 ff ff e8 57 b3 5d 00 0f 0b e9 00 f1 ff ff 89 e9 80
Call Trace:
vmx_leave_nested arch/x86/kvm/vmx/nested.c:6220 [inline]
nested_vmx_free_vcpu+0x83/0xc0 arch/x86/kvm/vmx/nested.c:330
vmx_free_vcpu+0x11f/0x2a0 arch/x86/kvm/vmx/vmx.c:6799
kvm_arch_vcpu_destroy+0x6b/0x240 arch/x86/kvm/x86.c:10989
kvm_vcpu_destroy+0x29/0x90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:441
kvm_free_vcpus arch/x86/kvm/x86.c:11426 [inline]
kvm_arch_destroy_vm+0x3ef/0x6b0 arch/x86/kvm/x86.c:11545
kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1189 [inline]
kvm_put_kvm+0x751/0xe40 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1220
kvm_vcpu_release+0x53/0x60 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3489
__fput+0x3fc/0x870 fs/file_table.c:280
task_work_run+0x146/0x1c0 kernel/task_work.c:164
exit_task_work include/linux/task_work.h:32 [inline]
do_exit+0x705/0x24f0 kernel/exit.c:832
do_group_exit+0x168/0x2d0 kernel/exit.c:929
get_signal+0x1740/0x2120 kernel/signal.c:2852
arch_do_signal_or_restart+0x9c/0x730 arch/x86/kernel/signal.c:868
handle_signal_work kernel/entry/common.c:148 [inline]
exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
exit_to_user_mode_prepare+0x191/0x220 kernel/entry/common.c:207
__syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
syscall_exit_to_user_mode+0x2e/0x70 kernel/entry/common.c:300
do_syscall_64+0x53/0xd0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae |
| In the Linux kernel, the following vulnerability has been resolved:
mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page()
Hulk Robot reported a panic in put_page_testzero() when testing
madvise() with MADV_SOFT_OFFLINE. The BUG() is triggered when retrying
get_any_page(). This is because we keep the MF_COUNT_INCREASED flag on the
second try even though the refcount has not been increased.
page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
------------[ cut here ]------------
kernel BUG at include/linux/mm.h:737!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 5 PID: 2135 Comm: sshd Tainted: G B 5.16.0-rc6-dirty #373
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: release_pages+0x53f/0x840
Call Trace:
free_pages_and_swap_cache+0x64/0x80
tlb_flush_mmu+0x6f/0x220
unmap_page_range+0xe6c/0x12c0
unmap_single_vma+0x90/0x170
unmap_vmas+0xc4/0x180
exit_mmap+0xde/0x3a0
mmput+0xa3/0x250
do_exit+0x564/0x1470
do_group_exit+0x3b/0x100
__do_sys_exit_group+0x13/0x20
__x64_sys_exit_group+0x16/0x20
do_syscall_64+0x34/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Modules linked in:
---[ end trace e99579b570fe0649 ]---
RIP: 0010:release_pages+0x53f/0x840 |
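The refcount imbalance behind this entry can be illustrated with a toy Python model. It is a sketch under stated assumptions, not the kernel's code path: the `Page` class, the simplified `get_any_page`, and the `soft_offline` driver are hypothetical, and the bit value chosen for `MF_COUNT_INCREASED` is illustrative.

```python
MF_COUNT_INCREASED = 1 << 0  # illustrative bit value


class Page:
    def __init__(self, refcount):
        self.refcount = refcount


def get_any_page(page, flags):
    """Take a reference unless the caller claims it already holds one."""
    if not (flags & MF_COUNT_INCREASED):
        page.refcount += 1


def soft_offline(page, flags, clear_flag_on_retry):
    """Model one failed attempt followed by a retry; return the final refcount."""
    get_any_page(page, flags)
    # The first attempt fails and the caller's reference is dropped.
    page.refcount -= 1
    if clear_flag_on_retry:
        # The fix: the caller no longer holds a reference, so say so.
        flags &= ~MF_COUNT_INCREASED
    get_any_page(page, flags)
    # The caller finally drops the reference it believes it holds.
    page.refcount -= 1
    return page.refcount
```

Starting from a refcount of 2 (one reference for another owner, one for the caller), the model ends at 0 when the flag is kept, which leaves the other owner with a dangling page, and at 1 when the flag is cleared before the retry.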
| In the Linux kernel, the following vulnerability has been resolved:
soc/tegra: regulators: Fix locking up when voltage-spread is out of range
Fix a voltage coupler lockup that happens when voltage-spread is out of
range due to a bug in the code. The max-spread requirement must be taken
into account when the CPU regulator doesn't have consumers. This problem is
observed on the Tegra30 Ouya game console once system-wide DVFS is enabled
in the device tree. |
| In the Linux kernel, the following vulnerability has been resolved:
udp: skip L4 aggregation for UDP tunnel packets
If NETIF_F_GRO_FRAGLIST or NETIF_F_GRO_UDP_FWD are enabled, and there
are UDP tunnels available in the system, udp_gro_receive() could end up
doing L4 aggregation (either SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST) at
the outer UDP tunnel level for packets effectively carrying a UDP
tunnel header.
That could cause inner protocol corruption. If e.g. the relevant
packets carry a vxlan header, different vxlan ids will be ignored/
aggregated to the same GSO packet. Inner headers will be ignored, too,
so that e.g. TCP over vxlan push packets will be held in the GRO
engine till the next flush, etc.
Just skip the SKB_GSO_UDP_L4 and SKB_GSO_FRAGLIST code path if the
current packet could land in a UDP tunnel, and let udp_gro_receive()
do GRO via udp_sk(sk)->gro_receive.
The check implemented in this patch is broader than what is strictly
needed, as the existing UDP tunnel could be e.g. configured on top of
a different device: we could end up skipping GRO entirely for some packets.
However, that is a very narrow corner case, and covering it would add quite
a bit of complexity.
v1 -> v2:
- hopefully clarify the commit message |
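The inner-protocol corruption described in this entry can be illustrated with a short Python sketch. It is a toy model, not the kernel's GRO engine: the packet dictionaries, the `aggregate` helper, and the addresses are hypothetical. Grouping by the outer UDP flow alone merges packets that belong to different vxlan ids, while a tunnel-aware key keeps them apart.

```python
from collections import defaultdict


def aggregate(packets, key):
    """Group payloads by the given key, as a stand-in for GRO aggregation."""
    groups = defaultdict(list)
    for pkt in packets:
        groups[key(pkt)].append(pkt["payload"])
    return dict(groups)


# Two packets on the same outer UDP flow but with different vxlan ids.
packets = [
    {"flow": "10.0.0.1:4789", "vni": 100, "payload": b"a"},
    {"flow": "10.0.0.1:4789", "vni": 200, "payload": b"b"},
]

# L4 aggregation at the outer level ignores the tunnel header.
outer_only = aggregate(packets, key=lambda p: p["flow"])
# A tunnel-aware receive path also looks at the inner header.
tunnel_aware = aggregate(packets, key=lambda p: (p["flow"], p["vni"]))
```

Here `outer_only` contains a single group holding both payloads, whereas `tunnel_aware` keeps one group per vxlan id.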
| In the Linux kernel, the following vulnerability has been resolved:
iommu/vt-d: Remove WO permissions on second-level paging entries
When the first-level page table is used for IOVA translation, it only
supports Read-Only and Read-Write permissions. The Write-Only permission
is not supported, as the PRESENT bit (implying Read permission) should
always be set. When using second level, we still give separate permissions
that allow Write-Only, which is inconsistent and awkward. We want
consistent behavior: after moving to first level, we don't want mappings
that work sometimes to break when second level is used for the same
mappings. Hence remove this configuration. |
| In the Linux kernel, the following vulnerability has been resolved:
mt76: connac: fix kernel warning adding monitor interface
Fix the following kernel warning, triggered in the
mt76_connac_mcu_uni_add_dev routine when adding a monitor interface.
[ 507.984882] ------------[ cut here ]------------
[ 507.989515] WARNING: CPU: 1 PID: 3017 at mt76_connac_mcu_uni_add_dev+0x178/0x190 [mt76_connac_lib]
[ 508.059379] CPU: 1 PID: 3017 Comm: ifconfig Not tainted 5.4.98 #0
[ 508.065461] Hardware name: MT7622_MT7531 RFB (DT)
[ 508.070156] pstate: 80000005 (Nzcv daif -PAN -UAO)
[ 508.074939] pc : mt76_connac_mcu_uni_add_dev+0x178/0x190 [mt76_connac_lib]
[ 508.081806] lr : mt7921_eeprom_init+0x1288/0x1cb8 [mt7921e]
[ 508.087367] sp : ffffffc013a33930
[ 508.090671] x29: ffffffc013a33930 x28: ffffff801e628ac0
[ 508.095973] x27: ffffff801c7f1200 x26: ffffff801c7eb008
[ 508.101275] x25: ffffff801c7eaef0 x24: ffffff801d025610
[ 508.106577] x23: ffffff801d022990 x22: ffffff801d024de8
[ 508.111879] x21: ffffff801d0226a0 x20: ffffff801c7eaee8
[ 508.117181] x19: ffffff801d0226a0 x18: 000000005d00b000
[ 508.122482] x17: 00000000ffffffff x16: 0000000000000000
[ 508.127785] x15: 0000000000000080 x14: ffffff801d704000
[ 508.133087] x13: 0000000000000040 x12: 0000000000000002
[ 508.138389] x11: 000000000000000c x10: 0000000000000000
[ 508.143691] x9 : 0000000000000020 x8 : 0000000000000001
[ 508.148992] x7 : 0000000000000000 x6 : 0000000000000000
[ 508.154294] x5 : ffffff801c7eaee8 x4 : 0000000000000006
[ 508.159596] x3 : 0000000000000001 x2 : 0000000000000000
[ 508.164898] x1 : ffffff801c7eac08 x0 : ffffff801d0226a0
[ 508.170200] Call trace:
[ 508.172640] mt76_connac_mcu_uni_add_dev+0x178/0x190 [mt76_connac_lib]
[ 508.179159] mt7921_eeprom_init+0x1288/0x1cb8 [mt7921e]
[ 508.184394] drv_add_interface+0x34/0x88 [mac80211]
[ 508.189271] ieee80211_add_virtual_monitor+0xe0/0xb48 [mac80211]
[ 508.195277] ieee80211_do_open+0x86c/0x918 [mac80211]
[ 508.200328] ieee80211_do_open+0x900/0x918 [mac80211]
[ 508.205372] __dev_open+0xcc/0x150
[ 508.208763] __dev_change_flags+0x134/0x198
[ 508.212937] dev_change_flags+0x20/0x60
[ 508.216764] devinet_ioctl+0x3e8/0x748
[ 508.220503] inet_ioctl+0x1e4/0x350
[ 508.223983] sock_do_ioctl+0x48/0x2a0
[ 508.227635] sock_ioctl+0x310/0x4f8
[ 508.231116] do_vfs_ioctl+0xa4/0xac0
[ 508.234681] ksys_ioctl+0x44/0x90
[ 508.237985] __arm64_sys_ioctl+0x1c/0x48
[ 508.241901] el0_svc_common.constprop.1+0x7c/0x100
[ 508.246681] el0_svc_handler+0x18/0x20
[ 508.250421] el0_svc+0x8/0x1c8
[ 508.253465] ---[ end trace c7b90fee13d72c39 ]---
[ 508.261278] ------------[ cut here ]------------ |
| In the Linux kernel, the following vulnerability has been resolved:
mt76: mt7915: fix txrate reporting
Properly check rate_info to fix unexpected reporting.
[ 1215.161863] Call trace:
[ 1215.164307] cfg80211_calculate_bitrate+0x124/0x200 [cfg80211]
[ 1215.170139] ieee80211s_update_metric+0x80/0xc0 [mac80211]
[ 1215.175624] ieee80211_tx_status_ext+0x508/0x838 [mac80211]
[ 1215.181190] mt7915_mcu_get_rx_rate+0x28c/0x8d0 [mt7915e]
[ 1215.186580] mt7915_mac_tx_free+0x324/0x7c0 [mt7915e]
[ 1215.191623] mt7915_queue_rx_skb+0xa8/0xd0 [mt7915e]
[ 1215.196582] mt76_dma_cleanup+0x7b0/0x11d0 [mt76]
[ 1215.201276] __napi_poll+0x38/0xf8
[ 1215.204668] napi_workfn+0x40/0x80
[ 1215.208062] process_one_work+0x1fc/0x390
[ 1215.212062] worker_thread+0x48/0x4d0
[ 1215.215715] kthread+0x120/0x128
[ 1215.218935] ret_from_fork+0x10/0x1c |
| In the Linux kernel, the following vulnerability has been resolved:
mt76: mt7921: fix kernel crash when the firmware fails to download
Fix kernel crash when the firmware is missing or fails to download.
[ 9.444758] kernel BUG at drivers/pci/msi.c:375!
[ 9.449363] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 9.501033] pstate: a0400009 (NzCv daif +PAN -UAO)
[ 9.505814] pc : free_msi_irqs+0x180/0x184
[ 9.509897] lr : free_msi_irqs+0x40/0x184
[ 9.513893] sp : ffffffc015193870
[ 9.517194] x29: ffffffc015193870 x28: 00000000f0e94fa2
[ 9.522492] x27: 0000000000000acd x26: 000000000000009a
[ 9.527790] x25: ffffffc0152cee58 x24: ffffffdbb383e0d8
[ 9.533087] x23: ffffffdbb38628d0 x22: 0000000000040200
[ 9.538384] x21: ffffff8cf7de7318 x20: ffffff8cd65a2480
[ 9.543681] x19: ffffff8cf7de7000 x18: 0000000000000000
[ 9.548979] x17: ffffff8cf9ca03b4 x16: ffffffdc13ad9a34
[ 9.554277] x15: 0000000000000000 x14: 0000000000080800
[ 9.559575] x13: ffffff8cd65a2980 x12: 0000000000000000
[ 9.564873] x11: ffffff8cfa45d820 x10: ffffff8cfa45d6d0
[ 9.570171] x9 : 0000000000000040 x8 : ffffff8ccef1b780
[ 9.575469] x7 : aaaaaaaaaaaaaaaa x6 : 0000000000000000
[ 9.580766] x5 : ffffffdc13824900 x4 : ffffff8ccefe0000
[ 9.586063] x3 : 0000000000000000 x2 : 0000000000000000
[ 9.591362] x1 : 0000000000000125 x0 : ffffff8ccefe0000
[ 9.596660] Call trace:
[ 9.599095] free_msi_irqs+0x180/0x184
[ 9.602831] pci_disable_msi+0x100/0x130
[ 9.606740] pci_free_irq_vectors+0x24/0x30
[ 9.610915] mt7921_pci_probe+0xbc/0x250 [mt7921e]
[ 9.615693] pci_device_probe+0xd4/0x14c
[ 9.619604] really_probe+0x134/0x2ec
[ 9.623252] driver_probe_device+0x64/0xfc
[ 9.627335] device_driver_attach+0x4c/0x6c
[ 9.631506] __driver_attach+0xac/0xc0
[ 9.635243] bus_for_each_dev+0x8c/0xd4
[ 9.639066] driver_attach+0x2c/0x38
[ 9.642628] bus_add_driver+0xfc/0x1d0
[ 9.646365] driver_register+0x64/0xf8
[ 9.650101] __pci_register_driver+0x6c/0x7c
[ 9.654360] init_module+0x28/0xfdc [mt7921e]
[ 9.658704] do_one_initcall+0x13c/0x2d0
[ 9.662615] do_init_module+0x58/0x1e8
[ 9.666351] load_module+0xd80/0xeb4
[ 9.669912] __arm64_sys_finit_module+0xa8/0xe0
[ 9.674430] el0_svc_common+0xa4/0x16c
[ 9.678168] el0_svc_compat_handler+0x2c/0x40
[ 9.682511] el0_svc_compat+0x8/0x10
[ 9.686076] Code: a94257f6 f9400bf7 a8c47bfd d65f03c0 (d4210000)
[ 9.692155] ---[ end trace 7621f966afbf0a29 ]---
[ 9.697385] Kernel panic - not syncing: Fatal exception
[ 9.702599] SMP: stopping secondary CPUs
[ 9.706549] Kernel Offset: 0x1c03600000 from 0xffffffc010000000
[ 9.712456] PHYS_OFFSET: 0xfffffff440000000
[ 9.716625] CPU features: 0x080026,2a80aa18
[ 9.720795] Memory Limit: none |
| In the Linux kernel, the following vulnerability has been resolved:
powerpc/64: Fix the definition of the fixmap area
Currently, the fixmap area is defined at the top of
the address space or just below KASAN.
This definition is not valid for PPC64.
For PPC64, use the top of the I/O space.
Because of circular dependencies, it is not possible to include
asm/fixmap.h in asm/book3s/64/pgtable.h, so define a fixed-size
area at the top of the I/O space for fixmap and ensure at build
time that the size is big enough. |
| In the Linux kernel, the following vulnerability has been resolved:
m68k: mvme147,mvme16x: Don't wipe PCC timer config bits
Don't clear the timer 1 configuration bits when clearing the interrupt flag
and counter overflow. As Michael reported, "This results in no timer
interrupts being delivered after the first. Initialization then hangs
in calibrate_delay as the jiffies counter is not updated."
On mvme16x, enable the timer after requesting the irq, consistent with
mvme147. |
| In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to avoid touching checkpointed data in get_victim()
In CP disabling mode, there are two issues when using LFS or SSR | AT_SSR
mode to select a victim:
1. LFS is set to find a source section during GC. The victim should have
no checkpointed data, since after GC the section could not be set free for
reuse.
Previously, we only checked valid ckpt blocks in the current segment rather
than in the whole section; fix it.
2. SSR | AT_SSR are set to find a target segment for writes, which can be
fully filled by checkpointed and newly written blocks. We should never
select such a segment, otherwise it can cause a panic or data corruption
during allocation. A potential case is described below:
a) target segment has 'n' (n < 512) ckpt valid blocks
b) GC migrates 'n' valid blocks to other segment (segment is still
in dirty list)
c) GC migrates '512 - n' blocks to target segment (segment has 'n'
cp_vblocks and '512 - n' vblocks)
d) GC then selects the target segment via the {AT,}SSR allocator, but there
is no free space in the target segment. |
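Steps a) to d) can be replayed with a small Python model. It is a sketch under assumptions, not f2fs code: the set-based bitmaps and the `free_slots` and `run_scenario` helpers are hypothetical, and the model assumes that a slot holding checkpointed data cannot be reused while checkpointing is disabled.

```python
BLOCKS_PER_SEG = 512  # segment size used in the scenario above


def free_slots(ckpt_valid, cur_valid):
    """Count slots that hold neither checkpointed nor currently valid data."""
    return BLOCKS_PER_SEG - len(ckpt_valid | cur_valid)


def run_scenario(n):
    """Replay steps a) to d) and return the free slots left in the segment."""
    # a) the target segment starts with n checkpointed, valid blocks
    ckpt_valid = set(range(n))
    cur_valid = set(range(n))
    # b) GC migrates those n blocks away; the checkpointed copies remain
    cur_valid.clear()
    # c) GC migrates 512 - n blocks into the remaining slots
    cur_valid |= set(range(n, BLOCKS_PER_SEG))
    # d) the segment still looks like a candidate, but nothing is free
    return free_slots(ckpt_valid, cur_valid)
```

For any `n` between 1 and 511 the model ends with zero free slots, which is why such a segment must not be picked as an allocation target.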
| In the Linux kernel, the following vulnerability has been resolved:
arm64: entry: always set GIC_PRIO_PSR_I_SET during entry
Zenghui reports that booting a kernel with "irqchip.gicv3_pseudo_nmi=1"
on the command line hits a warning during kernel entry, due to the way
we manipulate the PMR.
Early in the entry sequence, we call lockdep_hardirqs_off() to inform
lockdep that interrupts have been masked (as the HW sets DAIF when
entering an exception). Architecturally PMR_EL1 is not affected by
exception entry, and we don't set GIC_PRIO_PSR_I_SET in the PMR early in
the exception entry sequence, so early in exception entry the PMR can
indicate that interrupts are unmasked even though they are masked by
DAIF.
If DEBUG_LOCKDEP is selected, lockdep_hardirqs_off() will check that
interrupts are masked, before we set GIC_PRIO_PSR_I_SET in any of the
exception entry paths, and hence lockdep_hardirqs_off() will WARN() that
something is amiss.
We can avoid this by consistently setting GIC_PRIO_PSR_I_SET during
exception entry so that kernel code sees a consistent environment. We
must also update local_daif_inherit() to undo this, as it currently only
touches DAIF. For other paths, local_daif_restore() will update both
DAIF and the PMR. With this done, we can remove the existing special
cases which set this later in the entry code.
We always use (GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET) for consistency with
local_daif_save(), as this will warn if it ever encounters
(GIC_PRIO_IRQOFF | GIC_PRIO_PSR_I_SET), and never sets this itself. This
matches the gic_prio_kentry_setup that we have to retain for
ret_to_user.
The original splat from Zenghui's report was:
| DEBUG_LOCKS_WARN_ON(!irqs_disabled())
| WARNING: CPU: 3 PID: 125 at kernel/locking/lockdep.c:4258 lockdep_hardirqs_off+0xd4/0xe8
| Modules linked in:
| CPU: 3 PID: 125 Comm: modprobe Tainted: G W 5.12.0-rc8+ #463
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO BTYPE=--)
| pc : lockdep_hardirqs_off+0xd4/0xe8
| lr : lockdep_hardirqs_off+0xd4/0xe8
| sp : ffff80002a39bad0
| pmr_save: 000000e0
| x29: ffff80002a39bad0 x28: ffff0000de214bc0
| x27: ffff0000de1c0400 x26: 000000000049b328
| x25: 0000000000406f30 x24: ffff0000de1c00a0
| x23: 0000000020400005 x22: ffff8000105f747c
| x21: 0000000096000044 x20: 0000000000498ef9
| x19: ffff80002a39bc88 x18: ffffffffffffffff
| x17: 0000000000000000 x16: ffff800011c61eb0
| x15: ffff800011700a88 x14: 0720072007200720
| x13: 0720072007200720 x12: 0720072007200720
| x11: 0720072007200720 x10: 0720072007200720
| x9 : ffff80002a39bad0 x8 : ffff80002a39bad0
| x7 : ffff8000119f0800 x6 : c0000000ffff7fff
| x5 : ffff8000119f07a8 x4 : 0000000000000001
| x3 : 9bcdab23f2432800 x2 : ffff800011730538
| x1 : 9bcdab23f2432800 x0 : 0000000000000000
| Call trace:
| lockdep_hardirqs_off+0xd4/0xe8
| enter_from_kernel_mode.isra.5+0x7c/0xa8
| el1_abort+0x24/0x100
| el1_sync_handler+0x80/0xd0
| el1_sync+0x6c/0x100
| __arch_clear_user+0xc/0x90
| load_elf_binary+0x9fc/0x1450
| bprm_execve+0x404/0x880
| kernel_execve+0x180/0x188
| call_usermodehelper_exec_async+0xdc/0x158
| ret_from_fork+0x10/0x18 |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: nVMX: Always make an attempt to map eVMCS after migration
When enlightened VMCS is in use and nested state is migrated with
vmx_get_nested_state()/vmx_set_nested_state(), KVM can't map the evmcs
page right away: the evmcs gpa is not in 'struct kvm_vmx_nested_state_hdr'
and we can't read it from the VP assist page because userspace may decide
to restore HV_X64_MSR_VP_ASSIST_PAGE after restoring nested state
(and QEMU, for example, does exactly that). To make sure eVMCS is
mapped, vmx_set_nested_state() raises a KVM_REQ_GET_NESTED_STATE_PAGES
request.
Commit f2c7ef3ba955 ("KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES
on nested vmexit") added KVM_REQ_GET_NESTED_STATE_PAGES clearing to
nested_vmx_vmexit() to make sure MSR permission bitmap is not switched
when an immediate exit from L2 to L1 happens right after migration (caused
by a pending event, for example). Unfortunately, in the exact same
situation we still need to have eVMCS mapped so
nested_sync_vmcs12_to_shadow() reflects changes in VMCS12 to eVMCS.
As a band-aid, restore nested_get_evmcs_page() when clearing
KVM_REQ_GET_NESTED_STATE_PAGES in nested_vmx_vmexit(). The 'fix' is far
from being ideal as we can't easily propagate possible failures and even if
we could, this is most likely already too late to do so. The whole
'KVM_REQ_GET_NESTED_STATE_PAGES' idea for mapping eVMCS after migration
seems to be fragile as we diverge too much from the 'native' path when
vmptr loading happens on vmx_set_nested_state(). |
| In the Linux kernel, the following vulnerability has been resolved:
drm/i915: Fix crash in auto_retire
The retire logic uses the 2 lower bits of the pointer to the retire
function to store flags. However, the auto_retire function is not
guaranteed to be aligned to a multiple of 4, which causes crashes as
we jump to the wrong address, for example like this:
2021-04-24T18:03:53.804300Z WARNING kernel: [ 516.876901] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
2021-04-24T18:03:53.804310Z WARNING kernel: [ 516.876906] CPU: 7 PID: 146 Comm: kworker/u16:6 Tainted: G U 5.4.105-13595-g3cd84167b2df #1
2021-04-24T18:03:53.804311Z WARNING kernel: [ 516.876907] Hardware name: Google Volteer2/Volteer2, BIOS Google_Volteer2.13672.76.0 02/22/2021
2021-04-24T18:03:53.804312Z WARNING kernel: [ 516.876911] Workqueue: events_unbound active_work
2021-04-24T18:03:53.804313Z WARNING kernel: [ 516.876914] RIP: 0010:auto_retire+0x1/0x20
2021-04-24T18:03:53.804314Z WARNING kernel: [ 516.876916] Code: e8 01 f2 ff ff eb 02 31 db 48 89 d8 5b 5d c3 0f 1f 44 00 00 55 48 89 e5 f0 ff 87 c8 00 00 00 0f 88 ab 47 4a 00 31 c0 5d c3 0f <1f> 44 00 00 55 48 89 e5 f0 ff 8f c8 00 00 00 0f 88 9a 47 4a 00 74
2021-04-24T18:03:53.804319Z WARNING kernel: [ 516.876918] RSP: 0018:ffff9b4d809fbe38 EFLAGS: 00010286
2021-04-24T18:03:53.804320Z WARNING kernel: [ 516.876919] RAX: 0000000000000007 RBX: ffff927915079600 RCX: 0000000000000007
2021-04-24T18:03:53.804320Z WARNING kernel: [ 516.876921] RDX: ffff9b4d809fbe40 RSI: 0000000000000286 RDI: ffff927915079600
2021-04-24T18:03:53.804321Z WARNING kernel: [ 516.876922] RBP: ffff9b4d809fbe68 R08: 8080808080808080 R09: fefefefefefefeff
2021-04-24T18:03:53.804321Z WARNING kernel: [ 516.876924] R10: 0000000000000010 R11: ffffffff92e44bd8 R12: ffff9279150796a0
2021-04-24T18:03:53.804322Z WARNING kernel: [ 516.876925] R13: ffff92791c368180 R14: ffff927915079640 R15: 000000001c867605
2021-04-24T18:03:53.804323Z WARNING kernel: [ 516.876926] FS: 0000000000000000(0000) GS:ffff92791ffc0000(0000) knlGS:0000000000000000
2021-04-24T18:03:53.804323Z WARNING kernel: [ 516.876928] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2021-04-24T18:03:53.804324Z WARNING kernel: [ 516.876929] CR2: 0000239514955000 CR3: 00000007f82da001 CR4: 0000000000760ee0
2021-04-24T18:03:53.804325Z WARNING kernel: [ 516.876930] PKRU: 55555554
2021-04-24T18:03:53.804325Z WARNING kernel: [ 516.876931] Call Trace:
2021-04-24T18:03:53.804326Z WARNING kernel: [ 516.876935] __active_retire+0x77/0xcf
2021-04-24T18:03:53.804326Z WARNING kernel: [ 516.876939] process_one_work+0x1da/0x394
2021-04-24T18:03:53.804327Z WARNING kernel: [ 516.876941] worker_thread+0x216/0x375
2021-04-24T18:03:53.804327Z WARNING kernel: [ 516.876944] kthread+0x147/0x156
2021-04-24T18:03:53.804335Z WARNING kernel: [ 516.876946] ? pr_cont_work+0x58/0x58
2021-04-24T18:03:53.804335Z WARNING kernel: [ 516.876948] ? kthread_blkcg+0x2e/0x2e
2021-04-24T18:03:53.804336Z WARNING kernel: [ 516.876950] ret_from_fork+0x1f/0x40
2021-04-24T18:03:53.804336Z WARNING kernel: [ 516.876952] Modules linked in: cdc_mbim cdc_ncm cdc_wdm xt_cgroup rfcomm cmac algif_hash algif_skcipher af_alg xt_MASQUERADE uinput snd_soc_rt5682_sdw snd_soc_rt5682 snd_soc_max98373_sdw snd_soc_max98373 snd_soc_rl6231 regmap_sdw snd_soc_sof_sdw snd_soc_hdac_hdmi snd_soc_dmic snd_hda_codec_hdmi snd_sof_pci snd_sof_intel_hda_common intel_ipu6_psys snd_sof_xtensa_dsp soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof snd_soc_hdac_hda snd_soc_acpi_intel_match snd_soc_acpi snd_hda_ext_core soundwire_bus snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core intel_ipu6_isys videobuf2_dma_contig videobuf2_v4l2 videobuf2_common videobuf2_memops mei_hdcp intel_ipu6 ov2740 ov8856 at24 sx9310 dw9768 v4l2_fwnode cros_ec_typec intel_pmc_mux roles acpi_als typec fuse iio_trig_sysfs cros_ec_light_prox cros_ec_lid_angle cros_ec_sensors cros
---truncated--- |
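The alignment problem described in this entry can be shown with plain integer arithmetic. The Python sketch below is a toy model, not the i915 code: `pack`, `unpack_addr`, and the example addresses are hypothetical. Storing flags in the two low bits of an address is only lossless when those bits are zero to begin with, that is, when the address is a multiple of 4.

```python
FLAG_MASK = 0x3  # the two low bits used to carry flags


def pack(fn_addr, flags):
    """Store flags in the low bits of a function address."""
    return fn_addr | (flags & FLAG_MASK)


def unpack_addr(packed):
    """Recover the address by masking the flag bits off."""
    return packed & ~FLAG_MASK


aligned = 0x1000    # multiple of 4: survives the round trip
unaligned = 0x1001  # hypothetical address that is not 4-byte aligned

aligned_ok = unpack_addr(pack(aligned, 0x1)) == aligned
unaligned_ok = unpack_addr(pack(unaligned, 0x0)) == unaligned
```

In this model `aligned_ok` is true and `unaligned_ok` is false: masking the flag bits off the unaligned address yields a different address, so a call through it would land in the wrong place.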
| In the Linux kernel, the following vulnerability has been resolved:
bus: mhi: pci_generic: Remove WQ_MEM_RECLAIM flag from state workqueue
A recent change created a dedicated workqueue for the state-change work
with the WQ_HIGHPRI (no strong reason for that) and WQ_MEM_RECLAIM flags,
but the state-change work (mhi_pm_st_worker) does not guarantee forward
progress under memory pressure, and will even wait on various memory
allocations when e.g. creating devices or loading firmware. The
work is therefore not part of a memory reclaim path.
Moreover, this causes a warning in check_flush_dependency() since we end
up in code that flushes a non-reclaim workqueue:
[ 40.969601] workqueue: WQ_MEM_RECLAIM mhi_hiprio_wq:mhi_pm_st_worker [mhi] is flushing !WQ_MEM_RECLAIM events_highpri:flush_backlog
[ 40.969612] WARNING: CPU: 4 PID: 158 at kernel/workqueue.c:2607 check_flush_dependency+0x11c/0x140
[ 40.969733] Call Trace:
[ 40.969740] __flush_work+0x97/0x1d0
[ 40.969745] ? wake_up_process+0x15/0x20
[ 40.969749] ? insert_work+0x70/0x80
[ 40.969750] ? __queue_work+0x14a/0x3e0
[ 40.969753] flush_work+0x10/0x20
[ 40.969756] rollback_registered_many+0x1c9/0x510
[ 40.969759] unregister_netdevice_queue+0x94/0x120
[ 40.969761] unregister_netdev+0x1d/0x30
[ 40.969765] mhi_net_remove+0x1a/0x40 [mhi_net]
[ 40.969770] mhi_driver_remove+0x124/0x250 [mhi]
[ 40.969776] device_release_driver_internal+0xf0/0x1d0
[ 40.969778] device_release_driver+0x12/0x20
[ 40.969782] bus_remove_device+0xe1/0x150
[ 40.969786] device_del+0x17b/0x3e0
[ 40.969791] mhi_destroy_device+0x9a/0x100 [mhi]
[ 40.969796] ? mhi_unmap_single_use_bb+0x50/0x50 [mhi]
[ 40.969799] device_for_each_child+0x5e/0xa0
[ 40.969804] mhi_pm_st_worker+0x921/0xf50 [mhi] |
| In the Linux kernel, the following vulnerability has been resolved:
irqchip/gic-v3: Do not enable irqs when handling spurious interrupts
We triggered the following error while running our 4.19 kernel
with the pseudo-NMI patches backported to it:
[ 14.816231] ------------[ cut here ]------------
[ 14.816231] kernel BUG at irq.c:99!
[ 14.816232] Internal error: Oops - BUG: 0 [#1] SMP
[ 14.816232] Process swapper/0 (pid: 0, stack limit = 0x(____ptrval____))
[ 14.816233] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.19.95.aarch64 #14
[ 14.816233] Hardware name: evb (DT)
[ 14.816234] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[ 14.816234] pc : asm_nmi_enter+0x94/0x98
[ 14.816235] lr : asm_nmi_enter+0x18/0x98
[ 14.816235] sp : ffff000008003c50
[ 14.816235] pmr_save: 00000070
[ 14.816237] x29: ffff000008003c50 x28: ffff0000095f56c0
[ 14.816238] x27: 0000000000000000 x26: ffff000008004000
[ 14.816239] x25: 00000000015e0000 x24: ffff8008fb916000
[ 14.816240] x23: 0000000020400005 x22: ffff0000080817cc
[ 14.816241] x21: ffff000008003da0 x20: 0000000000000060
[ 14.816242] x19: 00000000000003ff x18: ffffffffffffffff
[ 14.816243] x17: 0000000000000008 x16: 003d090000000000
[ 14.816244] x15: ffff0000095ea6c8 x14: ffff8008fff5ab40
[ 14.816244] x13: ffff8008fff58b9d x12: 0000000000000000
[ 14.816245] x11: ffff000008c8a200 x10: 000000008e31fca5
[ 14.816246] x9 : ffff000008c8a208 x8 : 000000000000000f
[ 14.816247] x7 : 0000000000000004 x6 : ffff8008fff58b9e
[ 14.816248] x5 : 0000000000000000 x4 : 0000000080000000
[ 14.816249] x3 : 0000000000000000 x2 : 0000000080000000
[ 14.816250] x1 : 0000000000120000 x0 : ffff0000095f56c0
[ 14.816251] Call trace:
[ 14.816251] asm_nmi_enter+0x94/0x98
[ 14.816251] el1_irq+0x8c/0x180 (IRQ C)
[ 14.816252] gic_handle_irq+0xbc/0x2e4
[ 14.816252] el1_irq+0xcc/0x180 (IRQ B)
[ 14.816253] arch_timer_handler_virt+0x38/0x58
[ 14.816253] handle_percpu_devid_irq+0x90/0x240
[ 14.816253] generic_handle_irq+0x34/0x50
[ 14.816254] __handle_domain_irq+0x68/0xc0
[ 14.816254] gic_handle_irq+0xf8/0x2e4
[ 14.816255] el1_irq+0xcc/0x180 (IRQ A)
[ 14.816255] arch_cpu_idle+0x34/0x1c8
[ 14.816255] default_idle_call+0x24/0x44
[ 14.816256] do_idle+0x1d0/0x2c8
[ 14.816256] cpu_startup_entry+0x28/0x30
[ 14.816256] rest_init+0xb8/0xc8
[ 14.816257] start_kernel+0x4c8/0x4f4
[ 14.816257] Code: 940587f1 d5384100 b9401001 36a7fd01 (d4210000)
[ 14.816258] Modules linked in: start_dp(O) smeth(O)
[ 15.103092] ---[ end trace 701753956cb14aa8 ]---
[ 15.103093] Kernel panic - not syncing: Fatal exception in interrupt
[ 15.103099] SMP: stopping secondary CPUs
[ 15.103100] Kernel Offset: disabled
[ 15.103100] CPU features: 0x36,a2400218
[ 15.103100] Memory Limit: none
which is caused by a 'BUG_ON(in_nmi())' in nmi_enter().
From the call trace, we can find three interrupts (noted A, B, C above):
interrupt (A) is preempted by (B), which is further interrupted by (C).
Subsequent investigations show that (B) results in nmi_enter() being
called, but that it actually is a spurious interrupt. Furthermore,
interrupts are reenabled in the context of (B), and (C) fires with
NMI priority. We end up with a nested NMI situation, something
we definitely do not want to (and cannot) handle.
The bug here is that spurious interrupts should never result in any
state change, and we should just return to the interrupted context.
Moving the handling of spurious interrupts as early as possible in
the GICv3 handler fixes this issue.
[maz: rewrote commit message, corrected Fixes: tag] |
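The ordering fix described in this entry can be sketched as a toy Python handler. It is a heavily simplified model, not the GICv3 driver: `gic_handle_irq` here is a hypothetical function that only returns the list of state-changing steps it would perform. The one hardware fact it relies on is that GICv3 reserves INTIDs 1020 to 1023 as special values, with 1023 reporting a spurious interrupt.

```python
SPECIAL_INTIDS = range(1020, 1024)  # GICv3 special INTIDs, 1023 is spurious


def gic_handle_irq(intid, nmi_priority):
    """Return the state-changing steps the model handler performs."""
    if intid in SPECIAL_INTIDS:
        # Spurious: return to the interrupted context with no side effects,
        # before any NMI accounting or interrupt unmasking can happen.
        return []
    if nmi_priority:
        return ["nmi_enter", f"handle {intid}", "nmi_exit"]
    return [f"handle {intid}"]
```

Because the special-INTID check comes first, a spurious interrupt taken at NMI priority never reaches `nmi_enter` in this model.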
| In the Linux kernel, the following vulnerability has been resolved:
riscv/kprobe: fix kernel panic when invoking sys_read traced by kprobe
After installing a kprobe at sys_read, executing sys_read ends up hitting
a BUG_ON() in __find_get_block. The BUG message looks like the following:
[ 65.708663] ------------[ cut here ]------------
[ 65.709987] kernel BUG at fs/buffer.c:1251!
[ 65.711283] Kernel BUG [#1]
[ 65.712032] Modules linked in:
[ 65.712925] CPU: 0 PID: 51 Comm: sh Not tainted 5.12.0-rc4 #1
[ 65.714407] Hardware name: riscv-virtio,qemu (DT)
[ 65.715696] epc : __find_get_block+0x218/0x2c8
[ 65.716835] ra : __getblk_gfp+0x1c/0x4a
[ 65.717831] epc : ffffffe00019f11e ra : ffffffe00019f56a sp : ffffffe002437930
[ 65.719553] gp : ffffffe000f06030 tp : ffffffe0015abc00 t0 : ffffffe00191e038
[ 65.721290] t1 : ffffffe00191e038 t2 : 000000000000000a s0 : ffffffe002437960
[ 65.723051] s1 : ffffffe00160ad00 a0 : ffffffe00160ad00 a1 : 000000000000012a
[ 65.724772] a2 : 0000000000000400 a3 : 0000000000000008 a4 : 0000000000000040
[ 65.726545] a5 : 0000000000000000 a6 : ffffffe00191e000 a7 : 0000000000000000
[ 65.728308] s2 : 000000000000012a s3 : 0000000000000400 s4 : 0000000000000008
[ 65.730049] s5 : 000000000000006c s6 : ffffffe00240f800 s7 : ffffffe000f080a8
[ 65.731802] s8 : 0000000000000001 s9 : 000000000000012a s10: 0000000000000008
[ 65.733516] s11: 0000000000000008 t3 : 00000000000003ff t4 : 000000000000000f
[ 65.734434] t5 : 00000000000003ff t6 : 0000000000040000
[ 65.734613] status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003
[ 65.734901] Call Trace:
[ 65.735076] [<ffffffe00019f11e>] __find_get_block+0x218/0x2c8
[ 65.735417] [<ffffffe00020017a>] __ext4_get_inode_loc+0xb2/0x2f6
[ 65.735618] [<ffffffe000201b6c>] ext4_get_inode_loc+0x3a/0x8a
[ 65.735802] [<ffffffe000203380>] ext4_reserve_inode_write+0x2e/0x8c
[ 65.735999] [<ffffffe00020357a>] __ext4_mark_inode_dirty+0x4c/0x18e
[ 65.736208] [<ffffffe000206bb0>] ext4_dirty_inode+0x46/0x66
[ 65.736387] [<ffffffe000192914>] __mark_inode_dirty+0x12c/0x3da
[ 65.736576] [<ffffffe000180dd2>] touch_atime+0x146/0x150
[ 65.736748] [<ffffffe00010d762>] filemap_read+0x234/0x246
[ 65.736920] [<ffffffe00010d834>] generic_file_read_iter+0xc0/0x114
[ 65.737114] [<ffffffe0001f5d7a>] ext4_file_read_iter+0x42/0xea
[ 65.737310] [<ffffffe000163f2c>] new_sync_read+0xe2/0x15a
[ 65.737483] [<ffffffe000165814>] vfs_read+0xca/0xf2
[ 65.737641] [<ffffffe000165bae>] ksys_read+0x5e/0xc8
[ 65.737816] [<ffffffe000165c26>] sys_read+0xe/0x16
[ 65.737973] [<ffffffe000003972>] ret_from_syscall+0x0/0x2
[ 65.738858] ---[ end trace fe93f985456c935d ]---
A simple reproducer looks like:
echo 'p:myprobe sys_read fd=%a0 buf=%a1 count=%a2' > /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
cat /sys/kernel/debug/tracing/trace
Here's what happens to hit that BUG_ON():
1) After installing a kprobe at the entry of sys_read, the first instruction
is replaced by an 'ebreak' instruction on the riscv64 platform.
2) Once the kernel reaches the 'ebreak' instruction at the entry of sys_read,
it traps into the riscv breakpoint handler, where it sets up the coming
single-step of the original instruction. This includes backing up
'sstatus' in pt_regs, then disabling interrupts during single-stepping
by clearing the 'SIE' bit of 'sstatus' in pt_regs.
3) The kernel then returns to the instruction slot, which contains two
instructions: the original instruction at the entry of sys_read, and an 'ebreak'.
Here it triggers an 'Instruction page fault' exception (the value of 'scause'
is '0xc') if the PF has not been filled into the page table for that slot yet.
4) The kernel traps again, this time into the page fault exception handler,
where it chooses a different policy according to the state of the running
kprobe. Because after 2) the state is KPROBE_HIT_SS, the kernel resets the current kp
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
ACPI: GTDT: Don't corrupt interrupt mappings on watchdog probe failure
When failing the driver probe because of invalid firmware properties,
the GTDT driver unmaps the interrupt that it mapped earlier.
However, it never checks whether the mapping of the interrupt actually
succeeded. Even more, should the firmware report an illegal interrupt
number that overlaps with the GIC SGI range, this can result in an
IPI being unmapped, and subsequent fireworks (as reported by Dann
Frazier).
Rework the driver to have a slightly saner behaviour and actually
check whether the interrupt has been mapped before unmapping things. |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: always panic when errors=panic is specified
Before commit 014c9caa29d3 ("ext4: make ext4_abort() use
__ext4_error()"), the following series of commands would trigger a
panic:
1. mount /dev/sda -o ro,errors=panic test
2. mount /dev/sda -o remount,abort test
After commit 014c9caa29d3, remounting a file system using the test
mount option "abort" will no longer trigger a panic. This commit will
restore the behaviour immediately before commit 014c9caa29d3.
(However, note that the Linux kernel's behavior has not been
consistent; some previous kernel versions, including 5.4 and 4.19,
similarly did not panic after using the mount option "abort".)
This also makes a change to long-standing behaviour; namely, the
following series of commands will now cause a panic, when previously
it did not:
1. mount /dev/sda -o ro,errors=panic test
2. echo test > /sys/fs/ext4/sda/trigger_fs_error
However, this makes ext4's behaviour much more consistent, so this is
a good thing. |