| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
RDMA/rxe: Fix seg fault in rxe_comp_queue_pkt
In rxe_comp_queue_pkt() an incoming response packet skb is enqueued to the
resp_pkts queue and then a decision is made whether to run the completer
task inline or schedule it. Finally the skb is dereferenced to bump a 'hw'
performance counter. This is wrong because if the completer task is
already running in a separate thread it may have already processed the skb
and freed it which can cause a seg fault. This has been observed
infrequently in testing at high scale.
This patch fixes this by changing the order of enqueuing the packet until
after the counter is accessed. |
| In the Linux kernel, the following vulnerability has been resolved:
net: bridge: xmit: make sure we have at least eth header len bytes
syzbot triggered an uninit value[1] error in bridge device's xmit path
by sending a short (less than ETH_HLEN bytes) skb. To fix it check if
we can actually pull that amount instead of assuming.
Tested with dropwatch:
drop at: br_dev_xmit+0xb93/0x12d0 [bridge] (0xffffffffc06739b3)
origin: software
timestamp: Mon May 13 11:31:53 2024 778214037 nsec
protocol: 0x88a8
length: 2
original length: 2
drop reason: PKT_TOO_SMALL
[1]
BUG: KMSAN: uninit-value in br_dev_xmit+0x61d/0x1cb0 net/bridge/br_device.c:65
br_dev_xmit+0x61d/0x1cb0 net/bridge/br_device.c:65
__netdev_start_xmit include/linux/netdevice.h:4903 [inline]
netdev_start_xmit include/linux/netdevice.h:4917 [inline]
xmit_one net/core/dev.c:3531 [inline]
dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547
__dev_queue_xmit+0x34db/0x5350 net/core/dev.c:4341
dev_queue_xmit include/linux/netdevice.h:3091 [inline]
__bpf_tx_skb net/core/filter.c:2136 [inline]
__bpf_redirect_common net/core/filter.c:2180 [inline]
__bpf_redirect+0x14a6/0x1620 net/core/filter.c:2187
____bpf_clone_redirect net/core/filter.c:2460 [inline]
bpf_clone_redirect+0x328/0x470 net/core/filter.c:2432
___bpf_prog_run+0x13fe/0xe0f0 kernel/bpf/core.c:1997
__bpf_prog_run512+0xb5/0xe0 kernel/bpf/core.c:2238
bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
__bpf_prog_run include/linux/filter.h:657 [inline]
bpf_prog_run include/linux/filter.h:664 [inline]
bpf_test_run+0x499/0xc30 net/bpf/test_run.c:425
bpf_prog_test_run_skb+0x14ea/0x1f20 net/bpf/test_run.c:1058
bpf_prog_test_run+0x6b7/0xad0 kernel/bpf/syscall.c:4269
__sys_bpf+0x6aa/0xd90 kernel/bpf/syscall.c:5678
__do_sys_bpf kernel/bpf/syscall.c:5767 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5765 [inline]
__x64_sys_bpf+0xa0/0xe0 kernel/bpf/syscall.c:5765
x64_sys_call+0x96b/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:322
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f |
| In the Linux kernel, the following vulnerability has been resolved:
gpiolib: cdev: Fix use after free in lineinfo_changed_notify
The use-after-free issue occurs as follows: when the GPIO chip device file
is being closed by invoking gpio_chrdev_release(), watched_lines is freed
by bitmap_free(), but the unregistration of lineinfo_changed_nb notifier
chain failed due to waiting write rwsem. Additionally, one of the GPIO
chip's lines is also in the release process and holds the notifier chain's
read rwsem. Consequently, a race condition leads to the use-after-free of
watched_lines.
Here is the typical stack when issue happened:
[free]
gpio_chrdev_release()
--> bitmap_free(cdev->watched_lines) <-- freed
--> blocking_notifier_chain_unregister()
--> down_write(&nh->rwsem) <-- waiting rwsem
--> __down_write_common()
--> rwsem_down_write_slowpath()
--> schedule_preempt_disabled()
--> schedule()
[use]
st54spi_gpio_dev_release()
--> gpio_free()
--> gpiod_free()
--> gpiod_free_commit()
--> gpiod_line_state_notify()
--> blocking_notifier_call_chain()
--> down_read(&nh->rwsem); <-- held rwsem
--> notifier_call_chain()
--> lineinfo_changed_notify()
--> test_bit(xxxx, cdev->watched_lines) <-- use after free
The side effect of the use-after-free issue is that a GPIO line event is
being generated for userspace where it shouldn't. However, since the chrdev
is being closed, userspace won't have the chance to read that event anyway.
To fix the issue, call the bitmap_free() function after the unregistration
of lineinfo_changed_nb notifier chain. |
| In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: L2CAP: Fix not validating setsockopt user input
Check user input length before copying data. |
| In the Linux kernel, the following vulnerability has been resolved:
inet: inet_defrag: prevent sk release while still in use
ip_local_out() and other functions can pass skb->sk as function argument.
If the skb is a fragment and reassembly happens before such function call
returns, the sk must not be released.
This affects skb fragments reassembled via netfilter or similar
modules, e.g. openvswitch or ct_act.c, when run as part of tx pipeline.
Eric Dumazet made an initial analysis of this bug. Quoting Eric:
Calling ip_defrag() in output path is also implying skb_orphan(),
which is buggy because output path relies on sk not disappearing.
A relevant old patch about the issue was :
8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")
[..]
net/ipv4/ip_output.c depends on skb->sk being set, and probably to an
inet socket, not an arbitrary one.
If we orphan the packet in ipvlan, then downstream things like FQ
packet scheduler will not work properly.
We need to change ip_defrag() to only use skb_orphan() when really
needed, ie whenever frag_list is going to be used.
Eric suggested to stash sk in fragment queue and made an initial patch.
However there is a problem with this:
If skb is refragmented again right after, ip_do_fragment() will copy
head->sk to the new fragments, and sets up destructor to sock_wfree.
IOW, we have no choice but to fix up sk_wmem accouting to reflect the
fully reassembled skb, else wmem will underflow.
This change moves the orphan down into the core, to last possible moment.
As ip_defrag_offset is aliased with sk_buff->sk member, we must move the
offset into the FRAG_CB, else skb->sk gets clobbered.
This allows to delay the orphaning long enough to learn if the skb has
to be queued or if the skb is completing the reasm queue.
In the former case, things work as before, skb is orphaned. This is
safe because skb gets queued/stolen and won't continue past reasm engine.
In the latter case, we will steal the skb->sk reference, reattach it to
the head skb, and fix up wmem accouting when inet_frag inflates truesize. |
| In the Linux kernel, the following vulnerability has been resolved:
mlxsw: spectrum_acl_tcam: Fix NULL pointer dereference in error path
When calling mlxsw_sp_acl_tcam_region_destroy() from an error path after
failing to attach the region to an ACL group, we hit a NULL pointer
dereference upon 'region->group->tcam' [1].
Fix by retrieving the 'tcam' pointer using mlxsw_sp_acl_to_tcam().
[1]
BUG: kernel NULL pointer dereference, address: 0000000000000000
[...]
RIP: 0010:mlxsw_sp_acl_tcam_region_destroy+0xa0/0xd0
[...]
Call Trace:
mlxsw_sp_acl_tcam_vchunk_get+0x88b/0xa20
mlxsw_sp_acl_tcam_ventry_add+0x25/0xe0
mlxsw_sp_acl_rule_add+0x47/0x240
mlxsw_sp_flower_replace+0x1a9/0x1d0
tc_setup_cb_add+0xdc/0x1c0
fl_hw_replace_filter+0x146/0x1f0
fl_change+0xc17/0x1360
tc_new_tfilter+0x472/0xb90
rtnetlink_rcv_msg+0x313/0x3b0
netlink_rcv_skb+0x58/0x100
netlink_unicast+0x244/0x390
netlink_sendmsg+0x1e4/0x440
____sys_sendmsg+0x164/0x260
___sys_sendmsg+0x9a/0xe0
__sys_sendmsg+0x7a/0xc0
do_syscall_64+0x40/0xe0
entry_SYSCALL_64_after_hwframe+0x63/0x6b |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: mac80211: fix potential key use-after-free
When ieee80211_key_link() is called by ieee80211_gtk_rekey_add()
but returns 0 due to KRACK protection (identical key reinstall),
ieee80211_gtk_rekey_add() will still return a pointer into the
key, in a potential use-after-free. This normally doesn't happen
since it's only called by iwlwifi in case of WoWLAN rekey offload
which has its own KRACK protection, but still better to fix, do
that by returning an error code and converting that to success on
the cfg80211 boundary only, leaving the error for bad callers of
ieee80211_gtk_rekey_add(). |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: fix OOB read when checking dotdot dir
Mounting a corrupted filesystem with directory which contains '.' dir
entry with rec_len == block size results in out-of-bounds read (later
on, when the corrupted directory is removed).
ext4_empty_dir() assumes every ext4 directory contains at least '.'
and '..' as directory entries in the first data block. It first loads
the '.' dir entry, performs sanity checks by calling ext4_check_dir_entry()
and then uses its rec_len member to compute the location of '..' dir
entry (in ext4_next_entry). It assumes the '..' dir entry fits into the
same data block.
If the rec_len of '.' is precisely one block (4KB), it slips through the
sanity checks (it is considered the last directory entry in the data
block) and leaves "struct ext4_dir_entry_2 *de" point exactly past the
memory slot allocated to the data block. The following call to
ext4_check_dir_entry() on new value of de then dereferences this pointer
which results in out-of-bounds mem access.
Fix this by extending __ext4_check_dir_entry() to check for '.' dir
entries that reach the end of data block. Make sure to ignore the phony
dir entries for checksum (by checking name_len for non-zero).
Note: This is reported by KASAN as use-after-free in case another
structure was recently freed from the slot past the bound, but it is
really an OOB read.
This issue was found by syzkaller tool.
Call Trace:
[ 38.594108] BUG: KASAN: slab-use-after-free in __ext4_check_dir_entry+0x67e/0x710
[ 38.594649] Read of size 2 at addr ffff88802b41a004 by task syz-executor/5375
[ 38.595158]
[ 38.595288] CPU: 0 UID: 0 PID: 5375 Comm: syz-executor Not tainted 6.14.0-rc7 #1
[ 38.595298] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 38.595304] Call Trace:
[ 38.595308] <TASK>
[ 38.595311] dump_stack_lvl+0xa7/0xd0
[ 38.595325] print_address_description.constprop.0+0x2c/0x3f0
[ 38.595339] ? __ext4_check_dir_entry+0x67e/0x710
[ 38.595349] print_report+0xaa/0x250
[ 38.595359] ? __ext4_check_dir_entry+0x67e/0x710
[ 38.595368] ? kasan_addr_to_slab+0x9/0x90
[ 38.595378] kasan_report+0xab/0xe0
[ 38.595389] ? __ext4_check_dir_entry+0x67e/0x710
[ 38.595400] __ext4_check_dir_entry+0x67e/0x710
[ 38.595410] ext4_empty_dir+0x465/0x990
[ 38.595421] ? __pfx_ext4_empty_dir+0x10/0x10
[ 38.595432] ext4_rmdir.part.0+0x29a/0xd10
[ 38.595441] ? __dquot_initialize+0x2a7/0xbf0
[ 38.595455] ? __pfx_ext4_rmdir.part.0+0x10/0x10
[ 38.595464] ? __pfx___dquot_initialize+0x10/0x10
[ 38.595478] ? down_write+0xdb/0x140
[ 38.595487] ? __pfx_down_write+0x10/0x10
[ 38.595497] ext4_rmdir+0xee/0x140
[ 38.595506] vfs_rmdir+0x209/0x670
[ 38.595517] ? lookup_one_qstr_excl+0x3b/0x190
[ 38.595529] do_rmdir+0x363/0x3c0
[ 38.595537] ? __pfx_do_rmdir+0x10/0x10
[ 38.595544] ? strncpy_from_user+0x1ff/0x2e0
[ 38.595561] __x64_sys_unlinkat+0xf0/0x130
[ 38.595570] do_syscall_64+0x5b/0x180
[ 38.595583] entry_SYSCALL_64_after_hwframe+0x76/0x7e |
| In the Linux kernel, the following vulnerability has been resolved:
md: fix mddev uaf while iterating all_mddevs list
While iterating all_mddevs list from md_notify_reboot() and md_exit(),
list_for_each_entry_safe is used, and this can race with deletint the
next mddev, causing UAF:
t1:
spin_lock
//list_for_each_entry_safe(mddev, n, ...)
mddev_get(mddev1)
// assume mddev2 is the next entry
spin_unlock
t2:
//remove mddev2
...
mddev_free
spin_lock
list_del
spin_unlock
kfree(mddev2)
mddev_put(mddev1)
spin_lock
//continue dereference mddev2->all_mddevs
The old helper for_each_mddev() actually grab the reference of mddev2
while holding the lock, to prevent from being freed. This problem can be
fixed the same way, however, the code will be complex.
Hence switch to use list_for_each_entry, in this case mddev_put() can free
the mddev1 and it's not safe as well. Refer to md_seq_show(), also factor
out a helper mddev_put_locked() to fix this problem. |
| In the Linux kernel, the following vulnerability has been resolved:
net: fix geneve_opt length integer overflow
struct geneve_opt uses 5 bit length for each single option, which
means every vary size option should be smaller than 128 bytes.
However, all current related Netlink policies cannot promise this
length condition and the attacker can exploit a exact 128-byte size
option to *fake* a zero length option and confuse the parsing logic,
further achieve heap out-of-bounds read.
One example crash log is like below:
[ 3.905425] ==================================================================
[ 3.905925] BUG: KASAN: slab-out-of-bounds in nla_put+0xa9/0xe0
[ 3.906255] Read of size 124 at addr ffff888005f291cc by task poc/177
[ 3.906646]
[ 3.906775] CPU: 0 PID: 177 Comm: poc-oob-read Not tainted 6.1.132 #1
[ 3.907131] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 3.907784] Call Trace:
[ 3.907925] <TASK>
[ 3.908048] dump_stack_lvl+0x44/0x5c
[ 3.908258] print_report+0x184/0x4be
[ 3.909151] kasan_report+0xc5/0x100
[ 3.909539] kasan_check_range+0xf3/0x1a0
[ 3.909794] memcpy+0x1f/0x60
[ 3.909968] nla_put+0xa9/0xe0
[ 3.910147] tunnel_key_dump+0x945/0xba0
[ 3.911536] tcf_action_dump_1+0x1c1/0x340
[ 3.912436] tcf_action_dump+0x101/0x180
[ 3.912689] tcf_exts_dump+0x164/0x1e0
[ 3.912905] fw_dump+0x18b/0x2d0
[ 3.913483] tcf_fill_node+0x2ee/0x460
[ 3.914778] tfilter_notify+0xf4/0x180
[ 3.915208] tc_new_tfilter+0xd51/0x10d0
[ 3.918615] rtnetlink_rcv_msg+0x4a2/0x560
[ 3.919118] netlink_rcv_skb+0xcd/0x200
[ 3.919787] netlink_unicast+0x395/0x530
[ 3.921032] netlink_sendmsg+0x3d0/0x6d0
[ 3.921987] __sock_sendmsg+0x99/0xa0
[ 3.922220] __sys_sendto+0x1b7/0x240
[ 3.922682] __x64_sys_sendto+0x72/0x90
[ 3.922906] do_syscall_64+0x5e/0x90
[ 3.923814] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 3.924122] RIP: 0033:0x7e83eab84407
[ 3.924331] Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 faf
[ 3.925330] RSP: 002b:00007ffff505e370 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[ 3.925752] RAX: ffffffffffffffda RBX: 00007e83eaafa740 RCX: 00007e83eab84407
[ 3.926173] RDX: 00000000000001a8 RSI: 00007ffff505e3c0 RDI: 0000000000000003
[ 3.926587] RBP: 00007ffff505f460 R08: 00007e83eace1000 R09: 000000000000000c
[ 3.926977] R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffff505f3c0
[ 3.927367] R13: 00007ffff505f5c8 R14: 00007e83ead1b000 R15: 00005d4fbbe6dcb8
Fix these issues by enforing correct length condition in related
policies. |
| In the Linux kernel, the following vulnerability has been resolved:
net: atm: fix use after free in lec_send()
The ->send() operation frees skb so save the length before calling
->send() to avoid a use after free. |
| In the Linux kernel, the following vulnerability has been resolved:
proc: fix UAF in proc_get_inode()
Fix race between rmmod and /proc/XXX's inode instantiation.
The bug is that pde->proc_ops don't belong to /proc, it belongs to a
module, therefore dereferencing it after /proc entry has been registered
is a bug unless use_pde/unuse_pde() pair has been used.
use_pde/unuse_pde can be avoided (2 atomic ops!) because pde->proc_ops
never changes so information necessary for inode instantiation can be
saved _before_ proc_register() in PDE itself and used later, avoiding
pde->proc_ops->... dereference.
rmmod lookup
sys_delete_module
proc_lookup_de
pde_get(de);
proc_get_inode(dir->i_sb, de);
mod->exit()
proc_remove
remove_proc_subtree
proc_entry_rundown(de);
free_module(mod);
if (S_ISREG(inode->i_mode))
if (de->proc_ops->proc_read_iter)
--> As module is already freed, will trigger UAF
BUG: unable to handle page fault for address: fffffbfff80a702b
PGD 817fc4067 P4D 817fc4067 PUD 817fc0067 PMD 102ef4067 PTE 0
Oops: Oops: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 26 UID: 0 PID: 2667 Comm: ls Tainted: G
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
RIP: 0010:proc_get_inode+0x302/0x6e0
RSP: 0018:ffff88811c837998 EFLAGS: 00010a06
RAX: dffffc0000000000 RBX: ffffffffc0538140 RCX: 0000000000000007
RDX: 1ffffffff80a702b RSI: 0000000000000001 RDI: ffffffffc0538158
RBP: ffff8881299a6000 R08: 0000000067bbe1e5 R09: 1ffff11023906f20
R10: ffffffffb560ca07 R11: ffffffffb2b43a58 R12: ffff888105bb78f0
R13: ffff888100518048 R14: ffff8881299a6004 R15: 0000000000000001
FS: 00007f95b9686840(0000) GS:ffff8883af100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff80a702b CR3: 0000000117dd2000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
proc_lookup_de+0x11f/0x2e0
__lookup_slow+0x188/0x350
walk_component+0x2ab/0x4f0
path_lookupat+0x120/0x660
filename_lookup+0x1ce/0x560
vfs_statx+0xac/0x150
__do_sys_newstat+0x96/0x110
do_syscall_64+0x5f/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e
[adobriyan@gmail.com: don't do 2 atomic ops on the common path] |
| In the Linux kernel, the following vulnerability has been resolved:
xsk: fix an integer overflow in xp_create_and_assign_umem()
Since the i and pool->chunk_size variables are of type 'u32',
their product can wrap around and then be cast to 'u64'.
This can lead to two different XDP buffers pointing to the same
memory area.
Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with SVACE. |
| In the Linux kernel, the following vulnerability has been resolved:
x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes
Currently, load_microcode_amd() iterates over all NUMA nodes, retrieves their
CPU masks and unconditionally accesses per-CPU data for the first CPU of each
mask.
According to Documentation/admin-guide/mm/numaperf.rst:
"Some memory may share the same node as a CPU, and others are provided as
memory only nodes."
Therefore, some node CPU masks may be empty and wouldn't have a "first CPU".
On a machine with far memory (and therefore CPU-less NUMA nodes):
- cpumask_of_node(nid) is 0
- cpumask_first(0) is CONFIG_NR_CPUS
- cpu_data(CONFIG_NR_CPUS) accesses the cpu_info per-CPU array at an
index that is 1 out of bounds
This does not have any security implications since flashing microcode is
a privileged operation but I believe this has reliability implications by
potentially corrupting memory while flashing a microcode update.
When booting with CONFIG_UBSAN_BOUNDS=y on an AMD machine that flashes
a microcode update. I get the following splat:
UBSAN: array-index-out-of-bounds in arch/x86/kernel/cpu/microcode/amd.c:X:Y
index 512 is out of range for type 'unsigned long[512]'
[...]
Call Trace:
dump_stack
__ubsan_handle_out_of_bounds
load_microcode_amd
request_microcode_amd
reload_store
kernfs_fop_write_iter
vfs_write
ksys_write
do_syscall_64
entry_SYSCALL_64_after_hwframe
Change the loop to go over only NUMA nodes which have CPUs before determining
whether the first CPU on the respective node needs microcode update.
[ bp: Massage commit message, fix typo. ] |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: cfg80211: cancel wiphy_work before freeing wiphy
A wiphy_work can be queued from the moment the wiphy is allocated and
initialized (i.e. wiphy_new_nm). When a wiphy_work is queued, the
rdev::wiphy_work is getting queued.
If wiphy_free is called before the rdev::wiphy_work had a chance to run,
the wiphy memory will be freed, and then when it eventally gets to run
it'll use invalid memory.
Fix this by canceling the work before freeing the wiphy. |
| In the Linux kernel, the following vulnerability has been resolved:
cifs: Fix integer overflow while processing acregmax mount option
User-provided mount parameter acregmax of type u32 is intended to have
an upper limit, but before it is validated, the value is converted from
seconds to jiffies which can lead to an integer overflow.
Found by Linux Verification Center (linuxtesting.org) with SVACE. |
| In the Linux kernel, the following vulnerability has been resolved:
cifs: Fix integer overflow while processing acdirmax mount option
User-provided mount parameter acdirmax of type u32 is intended to have
an upper limit, but before it is validated, the value is converted from
seconds to jiffies which can lead to an integer overflow.
Found by Linux Verification Center (linuxtesting.org) with SVACE. |
| In the Linux kernel, the following vulnerability has been resolved:
cifs: Fix integer overflow while processing closetimeo mount option
User-provided mount parameter closetimeo of type u32 is intended to have
an upper limit, but before it is validated, the value is converted from
seconds to jiffies which can lead to an integer overflow.
Found by Linux Verification Center (linuxtesting.org) with SVACE. |
| In the Linux kernel, the following vulnerability has been resolved:
net: gso: fix ownership in __udp_gso_segment
In __udp_gso_segment the skb destructor is removed before segmenting the
skb but the socket reference is kept as-is. This is an issue if the
original skb is later orphaned as we can hit the following bug:
kernel BUG at ./include/linux/skbuff.h:3312! (skb_orphan)
RIP: 0010:ip_rcv_core+0x8b2/0xca0
Call Trace:
ip_rcv+0xab/0x6e0
__netif_receive_skb_one_core+0x168/0x1b0
process_backlog+0x384/0x1100
__napi_poll.constprop.0+0xa1/0x370
net_rx_action+0x925/0xe50
The above can happen following a sequence of events when using
OpenVSwitch, when an OVS_ACTION_ATTR_USERSPACE action precedes an
OVS_ACTION_ATTR_OUTPUT action:
1. OVS_ACTION_ATTR_USERSPACE is handled (in do_execute_actions): the skb
goes through queue_gso_packets and then __udp_gso_segment, where its
destructor is removed.
2. The segments' data are copied and sent to userspace.
3. OVS_ACTION_ATTR_OUTPUT is handled (in do_execute_actions) and the
same original skb is sent to its path.
4. If it later hits skb_orphan, we hit the bug.
Fix this by also removing the reference to the socket in
__udp_gso_segment. |
| In the Linux kernel, the following vulnerability has been resolved:
vlan: enforce underlying device type
Currently, VLAN devices can be created on top of non-ethernet devices.
Besides the fact that it doesn't make much sense, this also causes a
bug which leaks the address of a kernel function to usermode.
When creating a VLAN device, we initialize GARP (garp_init_applicant)
and MRP (mrp_init_applicant) for the underlying device.
As part of the initialization process, we add the multicast address of
each applicant to the underlying device, by calling dev_mc_add.
__dev_mc_add uses dev->addr_len to determine the length of the new
multicast address.
This causes an out-of-bounds read if dev->addr_len is greater than 6,
since the multicast addresses provided by GARP and MRP are only 6
bytes long.
This behaviour can be reproduced using the following commands:
ip tunnel add gretest mode ip6gre local ::1 remote ::2 dev lo
ip l set up dev gretest
ip link add link gretest name vlantest type vlan id 100
Then, the following command will display the address of garp_pdu_rcv:
ip maddr show | grep 01:80:c2:00:00:21
Fix the bug by enforcing the type of the underlying device during VLAN
device initialization. |