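【Title description】Briefly describe the problem: in what scenario, what operation was performed, and what problem occurred (use positive wording where possible)
During long-term stability testing of an OS release with the 4.18 kernel, a use-after-free (UAF) occurred in perf_event_detach_bpf_prog. Investigation shows that 4.19 needs the same fix.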
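【Environment】
Hardware information:
1) For bare-metal scenarios, provide the hardware information of the affected machine;
2) For virtual machine scenarios, provide the VM XML file or configuration
The problem occurs in both virtual machine and bare-metal scenarios
Software information:
1) OS version and branch
2) Kernel information
3) Version of the component in which the problem was found
openEuler-1.0-LTS
If a special network setup is used, provide a network topology diagram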
【Steps to reproduce】
Detailed steps
Long-term stability test
Frequency (always reproducible, or intermittent)
Intermittent
【Expected result】
Describe the expected result; it can be obtained by comparing new and old versions
The system runs normally
【Actual result】
Describe the faulty result
UAF in perf_event_detach_bpf_prog on the 4.18 kernel
【Attachments】
For example system message logs / component logs, dump information, screenshots
Problem log from the 4.18 kernel:
Apr 14 12:22:27 euleros-pxe kernel: [ 8071.732140] Using feature eBPF/event.
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.044984] ==================================================================
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.052325] BUG: KASAN: use-after-free in perf_event_detach_bpf_prog+0xb6/0x1c0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.059735] Read of size 8 at addr ffff88b1cefdddd8 by task bpftrace/1416068
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.066847]
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068421] CPU: 56 PID: 1416068 Comm: bpftrace Kdump: loaded Not tainted 4.18.0-147.5.2.8.h878.kasan.eulerosv2r10.x86_64 #1
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068425] Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 1.57 08/11/2015
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068427] Call Trace:
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068442] dump_stack+0xc2/0x12e
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068452] print_address_description+0x70/0x360
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068461] ? vprintk_func+0x5e/0x100
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068467] kasan_report+0x1b2/0x330
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068475] ? perf_event_detach_bpf_prog+0xb6/0x1c0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068480] ? perf_event_detach_bpf_prog+0xb6/0x1c0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068486] perf_event_detach_bpf_prog+0xb6/0x1c0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068492] ? perf_event_attach_bpf_prog+0x210/0x210
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068500] ? __queue_delayed_work+0xf0/0x170
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068507] _free_event+0x18f/0x8d0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068512] put_event+0x31/0x40
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068518] perf_event_release_kernel+0x3a9/0x620
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068525] ? __perf_event_exit_context+0xb0/0xb0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068534] ? propagate_protected_usage+0x26/0x1e0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068539] ? perf_event_release_kernel+0x620/0x620
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068544] perf_release+0x21/0x30
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068551] __fput+0x180/0x430
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068558] task_work_run+0x101/0x140
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068568] exit_to_usermode_loop+0x157/0x160
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068574] do_syscall_64+0x28e/0x2d0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068583] entry_SYSCALL_64_after_hwframe+0x65/0xca
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068588] RIP: 0033:0x7f1cfac7d9ab
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068597] Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 c3 ae f8 ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 0c e8 11 af f8 ff 8b 44
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068601] RSP: 002b:00007ffc4ebf1180 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068607] RAX: 0000000000000000 RBX: 000000000000005f RCX: 00007f1cfac7d9ab
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068610] RDX: 0000000000000000 RSI: 0000000000002401 RDI: 000000000000005f
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068614] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000564874c4e314
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068617] R10: 00000000ffffffff R11: 0000000000000293 R12: 0000564874c50ec0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068621] R13: 0000564874c4e314 R14: 00007ffc4ebf12c0 R15: 0000564874dde300
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.068624]
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.070197] Allocated by task 1416068:
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074037] kasan_kmalloc+0xa0/0xd0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074049] __kmalloc+0x11f/0x290
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074061] alloc_trace_uprobe+0x171/0x310
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074071] create_local_trace_uprobe+0x121/0x2a0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074080] perf_uprobe_init+0xfa/0x180
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074091] perf_uprobe_event_init+0x78/0xc0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074100] perf_try_init_event+0x80/0x270
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074111] perf_event_alloc.part.19+0xc9f/0x1550
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074121] perf_event_alloc+0x67/0x90
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074130] __do_sys_perf_event_open+0x216/0x16f0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074139] do_syscall_64+0x98/0x2d0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074149] entry_SYSCALL_64_after_hwframe+0x65/0xca
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.074151]
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.075730] Freed by task 1416068:
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079214] __kasan_slab_free+0x130/0x180
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079220] kfree+0xa5/0x1e0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079225] perf_remove_from_context+0xa3/0x1d0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079229] perf_event_release_kernel+0xfe/0x620
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079234] perf_release+0x21/0x30
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079239] __fput+0x180/0x430
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079245] task_work_run+0x101/0x140
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079250] exit_to_usermode_loop+0x157/0x160
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079256] do_syscall_64+0x28e/0x2d0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079262] entry_SYSCALL_64_after_hwframe+0x65/0xca
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.079263]
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.080835] The buggy address belongs to the object at ffff88b1cefddc80#012 which belongs to the cache kmalloc-512 of size 512
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.093423] The buggy address is located 344 bytes inside of#012 512-byte region [ffff88b1cefddc80, ffff88b1cefdde80)
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.105231] The buggy address belongs to the page:
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.110100] page:ffffea00c73bf600 count:1 mapcount:0 mapping:ffff888100010c80 index:0x0 compound_mapcount: 0
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.120008] flags: 0x57ffffc0008100(slab|head)
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.124531] raw: 0057ffffc0008100 ffffea00caec7408 ffffea00ca4f7608 ffff888100010c80
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.132367] raw: 0000000000000000 0000000000330033 00000001ffffffff 0000000000000000
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.140201] page dumped because: kasan: bad access detected
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.145848]
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.147425] Memory state around the buggy address:
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.152305] ffff88b1cefddc80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.159621] ffff88b1cefddd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.166927] >ffff88b1cefddd80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.174230] ^
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.180393] ffff88b1cefdde00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.187700] ffff88b1cefdde80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Apr 14 12:22:27 euleros-pxe kernel: [ 8072.195003] ==================================================================
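The addresses in the KASAN report above can be cross-checked with simple arithmetic. The sketch below is only an illustration: the helper names are made up for this note, and it assumes generic KASAN, where one shadow byte covers 8 bytes of memory and the shadow value fb marks a freed kmalloc object.

```c
#include <assert.h>
#include <stdint.h>

/* Generic KASAN maps 8 bytes of memory to one shadow byte (shift of 3). */
#define SHADOW_SCALE_SHIFT 3

/* Hypothetical helper: byte offset of the faulting address inside its slab object. */
static uint64_t offset_in_object(uint64_t addr, uint64_t obj_start)
{
    assert(addr >= obj_start);
    return addr - obj_start;
}

/* Hypothetical helper: index of the shadow byte within one row of the
 * "Memory state around the buggy address" dump (16 shadow bytes per row). */
static unsigned shadow_index_in_row(uint64_t addr, uint64_t row_start)
{
    assert(addr >= row_start);
    return (unsigned)((addr - row_start) >> SHADOW_SCALE_SHIFT);
}
```

With the values from the log, `offset_in_object(0xffff88b1cefdddd8, 0xffff88b1cefddc80)` gives 344, which matches the "344 bytes inside of 512-byte region" line, and `shadow_index_in_row(0xffff88b1cefdddd8, 0xffff88b1cefddd80)` gives 11, i.e. the twelfth shadow byte of the row marked with `>`, which reads fb.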
Hi yangjihong2021, welcome to the openEuler Community.
I'm the Bot here serving you. You can find the instructions on how to interact with me at Here.
If you have any questions, please contact the SIG: Kernel, and any of the maintainers: @yangyingliang , @pi3orama , @gatieme , @jiaoff , @qiuuuuu , @zhengzengkai , @LiuYongQiang0816 , @xiexiuqi
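The following is the analysis and conclusion for the 4.18 kernel problem; investigation shows that 4.19 has the same problem:
1. Where the memory is allocated: through the call chain
->__do_sys_perf_event_open
-> perf_event_alloc
-> perf_try_init_event
-> perf_uprobe_event_init
-> perf_uprobe_init
a trace_event_call is allocated and attached to the perf_event.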
2. Where the memory is freed and then accessed: through the call chain
-> perf_release
-> perf_event_release_kernel
-> perf_remove_from_context => the memory is freed here
-> put_event => the previously freed memory is accessed here
The code that frees the memory is:
static void perf_remove_from_context(struct perf_event *event, unsigned long flags)
{
	......
	/*
	 * This is as passable as any hw.target handling out there;
	 * hw.target implies task context, therefore, no migration.
	 * Which together with DETACH_GROUP means that this is the
	 * final remove_from_context of a task event.
	 */
	if (event->hw.target && (flags & DETACH_GROUP)) {
		/*
		 * Now, the problem with, say uprobes, is that they
		 * use hw.target for context in their ->destroy()
		 * callbacks. Supposedly, they may need to poke at
		 * its contents, so better call it while we still
		 * have the task.
		 */
		if (event->destroy) {
			/* calls destroy_local_trace_uprobe -> free_trace_uprobe, which frees the tp_event memory */
			event->destroy(event);
			event->destroy = NULL;
		}
		put_task_struct(event->hw.target);
		event->hw.target = NULL;
	}
	......
}
Trigger condition: the perf_event is attached to a task.
The code that accesses the memory is:
	/*
	 * Now that we hold ctx::mutex and child_mutex, revalidate our
	 * state, if child is still the first entry, it didn't get freed
	 * and we can continue doing so.
	 */
	tmp = list_first_entry_or_null(&event->child_list,
				       struct perf_event, child_list);
	if (tmp == child) {
		perf_remove_from_context(child, DETACH_GROUP);
		list_move(&child->child_list, &free_list);
		/*
		 * This matches the refcount bump in inherit_event();
		 * this can't be the last reference.
		 */
		put_event(event);
	}
put_event call chain:
put_event
-> _free_event
-> perf_event_free_bpf_prog
-> perf_event_detach_bpf_prog
-> old_array = event->tp_event->prog_array
This reads event->tp_event, whose memory was already freed earlier, which triggers the use-after-free.
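In summary, the problem is triggered by creating a BPF prog and attaching it to a process that later forks child processes. Because of inheritance, the child's and the parent's events reference the same tp_event. The parent runs perf_remove_from_context first, which wrongly calls perf_event->destroy and frees the memory; when the child's event is released later, it accesses that memory, resulting in a use-after-free.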
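The ordering problem can be modeled in user space. The sketch below is a deliberately simplified, single-event model and not kernel code: all type and function names are hypothetical stand-ins, and a `freed` flag replaces the real kfree() so that the model itself never touches freed memory. It only shows why calling the destroy callback at remove-from-context time makes the later detach step read a stale tp_event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for trace_event_call (event->tp_event). */
struct tp_event_model {
    bool freed;       /* set instead of calling free() */
    int prog_array;   /* stands in for tp_event->prog_array */
};

/* Hypothetical stand-in for perf_event. */
struct event_model {
    struct tp_event_model *tp_event;
    bool has_target;                        /* models event->hw.target != NULL */
    void (*destroy)(struct event_model *);
};

/* Models destroy_local_trace_uprobe() -> free_trace_uprobe(). */
static void destroy_uprobe(struct event_model *ev)
{
    ev->tp_event->freed = true;
}

/* early_destroy = true models the problematic patch quoted above. */
static void remove_from_context(struct event_model *ev, bool early_destroy)
{
    if (early_destroy && ev->has_target && ev->destroy) {
        ev->destroy(ev);
        ev->destroy = NULL;
    }
}

/* Models _free_event(): detach the BPF prog first, then run destroy.
 * Returns true if the detach step would have read freed memory. */
static bool free_event(struct event_model *ev)
{
    assert(ev->tp_event != NULL);
    bool stale = ev->tp_event->freed;   /* read of tp_event->prog_array */
    if (!stale)
        ev->tp_event->prog_array = 0;   /* detach the program */
    if (ev->destroy) {
        ev->destroy(ev);
        ev->destroy = NULL;
    }
    return stale;
}

/* Models the release path: remove from context, then drop the event. */
static bool release_event(bool early_destroy)
{
    struct tp_event_model tp = { .freed = false, .prog_array = 1 };
    struct event_model ev = {
        .tp_event = &tp, .has_target = true, .destroy = destroy_uprobe,
    };

    remove_from_context(&ev, early_destroy);
    return free_event(&ev);
}
```

In this model `release_event(true)` reports a stale access, while `release_event(false)`, where destroy runs only inside `free_event`, does not.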
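The problem was introduced by commit 946d812a9b17f (issue ticket DTS2019030511786). That commit was meant to fix a race between close and clone, but it merged a patch that was not accepted upstream. For the detailed upstream discussion, see: https://lkml.org/lkml/2019/6/28/856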
The patch was not accepted upstream because it introduces another use-after-free: other code may still be using the event at that point, so calling event->destroy there is unsafe.
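Upstream later provided a proper fix:
_free_event only drops the reference count, and perf_event_free_task waits until no child process references the event any longer before deleting the ctx.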
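4.19 is the same as 4.18: it merged both the commit that introduced the problem (0380474221530) and the commit that finally fixed it (eb41044bbece4), so 0380474221530 needs to be reverted.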
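The idea behind the upstream fix described above can be sketched as a reference-count model. This is a single-threaded illustration under stated assumptions, not the kernel implementation: `ctx_model`, `ctx_get`, `ctx_put` and `free_task_try` are hypothetical names, and where the kernel would sleep until the count drops, the model simply returns false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a perf event context; not the kernel type. */
struct ctx_model {
    int refcount;   /* 1 for the owning task + 1 per event still using it */
    bool freed;
};

/* An event (for example an inherited child event) takes a reference. */
static void ctx_get(struct ctx_model *ctx)
{
    assert(!ctx->freed);
    ctx->refcount++;
}

/* Event-free path: only drop the reference; free on the last put. */
static void ctx_put(struct ctx_model *ctx)
{
    assert(!ctx->freed && ctx->refcount > 0);
    if (--ctx->refcount == 0)
        ctx->freed = true;
}

/* Task-exit path: tear the context down only once the task holds the
 * last reference. Returns false when the caller would have to wait. */
static bool free_task_try(struct ctx_model *ctx)
{
    if (ctx->refcount != 1)
        return false;   /* the kernel would sleep here until woken */
    ctx_put(ctx);
    return true;
}
```

As long as any event still holds a reference, `free_task_try` refuses to tear the context down, so nothing an event may still dereference is freed underneath it.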