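【Environment】
Environment: ARM physical machine
OS version: 20.03-SP2-round4
Kernel: 4.19.90-2105.6.0.0090.oe1.aarch64
【Steps to reproduce】
1. Install from the ISO with the minimal installation mode selected
2. Run the long-term stability test cases
【Expected result】
The test cases run without errors
【Actual result】
After the long-term stability test cases have run for a while, the machine crashes and a core file is generated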
[31077.812327] vcan: Virtual CAN interface driver
[31077.812329] vcan: enabled echo on driver level.
[31077.839576] LTP: starting dma_thread_diotest6 (dma_thread_diotest -a 3072)
[31077.852754] LTP: starting msgrcv07
[31077.872154] LTP: starting msgrcv08
[31077.905499] LTP: starting timer_settime02
[31077.913540] LTP: starting msgsnd01
[31077.922597] LTP: starting msgsnd02
[31077.930673] LTP: starting msgsnd05
[31077.955532] LTP: starting msgsnd06
[31077.976741] EINJ: Error INJection is initialized.
[31078.002799] LTP: starting semctl01
[31078.009272] LTP: starting membarrier01
[31078.136464] Unable to handle kernel NULL pointer dereference at virtual address 000000000000002c
[31078.151187] Mem abort info:
[31078.156945] ESR = 0x96000007
[31078.162892] Exception class = DABT (current EL), IL = 32 bits
[31078.171859] SET = 0, FnV = 0
[31078.177885] EA = 0, S1PTW = 0
[31078.183930] Data abort info:
[31078.189730] ISV = 0, ISS = 0x00000007
[31078.196529] CM = 0, WnR = 0
[31078.202450] user pgtable: 64k pages, 48-bit VAs, pgdp = 0000000007cf6fb5
[31078.212388] [000000000000002c] pgd=0000005fceab0003, pud=0000005fceab0003, pmd=0000005f98410003, pte=0000000000000000
[31078.229818] Internal error: Oops: 96000007 [#1] SMP
[31078.238192] Modules linked in: vcan can_raw can authenc ccm usb_storage usbatm atm sit tunnel4 ip_tunnel nbd vhost_net tap uhid uinput vhost
vsock vmw_vsock_virtio_transport_common vhost vsock tun scsi_debug cuse nls_koi8_u nls_cp932 vfio_iommu_type1 vfio unix_diag hdma_mgmt ccp vet
h vrf sm4_generic cmac ansi_cprng vmac sha3_generic seed cts aes_ce_ccm fcrypt pcbc anubis khazad tea michael_mic cast5_generic blowfish_generi
c blowfish_common des_generic sctp ts_kmp nf_log_arp nf_log_ipv6 nf_log_ipv4 nf_log_common brd fuse overlay salsa20_generic camellia_generic ca
st6_generic cast_common serpent_generic twofish_generic twofish_common xts lrw tgr192 wp512 rmd320 rmd256 rmd160 rmd128 md4 sha512_generic aes
neon_blk loop jprob(OE) binfmt_misc ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4
[31078.364569] xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security
iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ebtable_filter ebt
ables ip6table_filter ip6_tables iptable_filter vfat fat ipmi_ssif dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c raid1 ses e
nclosure aes_ce_blk crypto_simd cryptd aes_ce_cipher ghash_ce hns_roce_hw_v2 ofpart sha2_ce cmdlinepart sha256_arm64 hns_roce sha1_ce sg ib_cor
e sbsa_gwdt ipmi_si hi_sfc ipmi_devintf mtd spi_dw_mmio ipmi_msghandler sch_fq_codel ip_tables ext4 mbcache jbd2 sr_mod cdrom sd_mod realtek hi
si_sas_v3_hw hisi_sas_main libsas ahci hclge scsi_transport_sas libahci hns3 hinic libata hnae3
[31078.517251] megaraid_sas i2c_designware_platform i2c_designware_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: einj]
[31078.545923] Process membarrier01 (pid: 3995063, stack limit = 0x00000000c5abde19)
[31078.570778] CPU: 35 PID: 3995063 Comm: membarrier01 Kdump: loaded Tainted: G WC OE 4.19.90-2105.6.0.0090.oe1.aarch64 #1
[31078.600187] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDDA, BIOS 1.06 10/29/2019
[31078.605325] LTP: starting semctl02
[31078.626327] pstate: 60400009 (nZCv daif +PAN -UAO)
[31078.626334] pc : membarrier_global_expedited+0xe8/0x1a8
[31078.626337] lr : membarrier_global_expedited+0xe8/0x1a8
[31078.678926] sp : ffff00016ca6fd50
[31078.690382] x29: ffff00016ca6fd50 x28: ffff802d6005f000
[31078.703582] x27: 0000000000000000 x26: 0000000000000000
[31078.716619] x25: ffff00016ca6fda8 x24: ffff000008fcd500
[31078.729474] x23: ffff0000092d5000 x22: ffff000008fb0018
[31078.742120] x21: ffff0000092d3000 x20: ffff0000092d57f0
[31078.754541] x19: 0000000000000020 x18: 0000000000000000
[31078.757624] LTP: starting semctl03
[31078.766910] x17: 0000000000000000 x16: 0000000000000000
[31078.766912] x15: 0000000000000000 x14: 0000000000000000
[31078.766913] x13: 0000000000000000 x12: 0000000000000000
[31078.766914] x11: 0000000000000000 x10: 0000000000000000
[31078.766915] x9 : 0000000000000000 x8 : 0000000000000000
[31078.766915] x7 : 0000000000000000 x6 : ffff00016ca6fd48
[31078.766918] x5 : ffff00016ca6fd48 x4 : ffffffffffffffff
[31078.839878] LTP: starting semctl04
[31078.845022] x3 : 0000000000000000 x2 : 62512f9d263e2b00
[31078.845024] x1 : 0000000000000000 x0 : 0000000000000000
[31078.845026] Call trace:
[31078.845028] membarrier_global_expedited+0xe8/0x1a8
[31078.845031] __arm64_sys_membarrier+0xac/0x1f0
[31078.882494] LTP: starting semctl05
[31078.884100] el0_svc_common+0x78/0x178
[31078.901063] LTP: starting semctl06
[31078.909058] el0_svc_handler+0x38/0x78
[31078.909060] el0_svc+0x8/0x1f8
[31078.909063] Code: b949d401 361ffdc1 91264000 97fe707d (b9402c00)
[31078.955012] SMP: stopping secondary CPUs
[31078.964813] Starting crashdump kernel...
[31078.972460] Bye!
crash /usr/lib/debug/usr/lib/modules/4.19.90-2105.6.0.0090.oe1.aarch64/vmlinux ./vmcore
dis -x membarrier_global_expedited
0xffff000008160dc8 <membarrier_global_expedited+0xc0>: b.eq 0xffff000008160d9c <membarrier_global_expedited+0x94>
0xffff000008160dcc <membarrier_global_expedited+0xc4>: adrp x0, 0xffff0000092d3000 <page_wait_table+0x14c0>
0xffff000008160dd0 <membarrier_global_expedited+0xc8>: add x0, x0, #0x7c8
0xffff000008160dd4 <membarrier_global_expedited+0xcc>: mov x1, x24
0xffff000008160dd8 <membarrier_global_expedited+0xd0>: ldr x0, [x0,w19,sxtw #3]
0xffff000008160ddc <membarrier_global_expedited+0xd4>: add x0, x0, x1
0xffff000008160de0 <membarrier_global_expedited+0xd8>: ldr w1, [x0,#2516]
0xffff000008160de4 <membarrier_global_expedited+0xdc>: tbz w1, #3, 0xffff000008160d9c <membarrier_global_expedited+0x94>
0xffff000008160de8 <membarrier_global_expedited+0xe0>: add x0, x0, #0x990
0xffff000008160dec <membarrier_global_expedited+0xe4>: bl 0xffff0000080fcfe0 <task_rcu_dereference>
0xffff000008160df0 <membarrier_global_expedited+0xe8>: ldr w0, [x0,#44]
0xffff000008160df4 <membarrier_global_expedited+0xec>: tbnz w0, #21, 0xffff000008160d9c <membarrier_global_expedited+0x94>
0xffff000008160df8 <membarrier_global_expedited+0xf0>: cmp w19, #0x0
0xffff000008160dfc <membarrier_global_expedited+0xf4>: add w0, w19, #0x3f
0xffff000008160e00 <membarrier_global_expedited+0xf8>: csel w0, w0, w19, lt
0xffff000008160e04 <membarrier_global_expedited+0xfc>: negs w2, w19
In membarrier_global_expedited, the task pointer p returned by task_rcu_dereference() is NULL, so the direct access to p->flags dereferences a NULL pointer.
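The oops and the disassembly above can be cross-checked with a little arithmetic. The faulting instruction at membarrier_global_expedited+0xe8 is ldr w0, [x0,#44], and x0 is 0 in the register dump, so the access lands at 0 + 44 = 0x2c, which is the reported fault address. The sketch below is a userspace model, not kernel code: struct mock_task, MOCK_PF_KTHREAD and fault_address() are hypothetical names, and only the offset 44 and the tested bit 21 are taken from the disassembly. It assumes that the tbnz on bit 21 is the PF_KTHREAD test on p->flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the task structure. Only the offset of
 * `flags` (44, from "ldr w0, [x0,#44]") is taken from the disassembly;
 * the padding is filler, not the real layout.
 */
struct mock_task {
    unsigned char pad[44];
    unsigned int flags;
};

/* Bit 21, as tested by "tbnz w0, #21" (assumed to be PF_KTHREAD). */
#define MOCK_PF_KTHREAD (1u << 21)

/* Address the load would touch for a given base pointer value. */
uintptr_t fault_address(uintptr_t base)
{
    return base + offsetof(struct mock_task, flags);
}
```

With a NULL base, fault_address(0) evaluates to 0x2c, which agrees with "NULL pointer dereference at virtual address 000000000000002c" in the log.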
Where the problem first appeared:
https://lore.kernel.org/patchwork/patch/508120/
https://lore.kernel.org/patchwork/patch/508946/
https://lore.kernel.org/patchwork/patch/508980/
https://lore.kernel.org/patchwork/patch/509526/
bac7857319bc sched/fair: Use task_rcu_dereference()
150593bf8693 sched/api: Introduce task_rcu_dereference() and try_get_task_struct()
In response to this problem, Red Hat posted a patch in 2014 that introduces task_rcu_dereference():
https://lkml.org/lkml/2014/10/22/833
https://lore.kernel.org/patchwork/cover/510962/
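task_rcu_dereference() returns a non-NULL pointer only if the task cannot be freed before rcu_read_unlock().
Otherwise it returns NULL, which means the task has already been freed or is in the process of being freed, and would be freed before the unlock.
This patch was not merged until 2016. The earlier patch was then changed so that rq->curr is protected by task_rcu_dereference() instead of raw_spin_lock_irq.
A later patch, commit 227a4aadc75b sched/membarrier: Fix p->mm->membarrier_state racy load,
removed the NULL check on p. This is the problem: with task_rcu_dereference(), p must still be checked for NULL.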
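As a rough illustration of why the check matters, here is a simplified userspace model of a per-CPU loop that reads each CPU's current task through a helper that may return NULL. It is a sketch, not the kernel code: mock_task, mock_task_rcu_dereference(), MOCK_PF_KTHREAD and count_ipi_targets() are hypothetical names, and the real loop in membarrier_global_expedited performs further checks that are omitted here.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_PF_KTHREAD (1u << 21)   /* stand-in for the kernel-thread flag */

struct mock_task {
    unsigned int flags;
};

/*
 * Stand-in for task_rcu_dereference(): in this model an empty slot
 * (NULL) represents a task that is being freed and must not be touched.
 */
static struct mock_task *mock_task_rcu_dereference(struct mock_task **slot)
{
    return *slot;
}

/* Count the CPUs whose current task would need a barrier IPI. */
int count_ipi_targets(struct mock_task *curr[], int ncpus)
{
    int targets = 0;

    for (int cpu = 0; cpu < ncpus; cpu++) {
        struct mock_task *p = mock_task_rcu_dereference(&curr[cpu]);

        if (!p)                          /* the check that was removed */
            continue;
        if (p->flags & MOCK_PF_KTHREAD)  /* kernel threads are skipped */
            continue;
        targets++;
    }
    return targets;
}
```

Without the if (!p) line, the first slot holding NULL would be dereferenced at p->flags, which is the failure pattern seen in this issue.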
Then, on 2019/08/20, a bug report pointed out that task_rcu_dereference() itself is problematic.
https://lkml.org/lkml/2019/8/30/574
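The following patches were then merged. They guarantee that a task is freed only after a grace period has elapsed since it left the runqueue or exited, so rcu_dereference() can be used safely in place of task_rcu_dereference().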
https://lore.kernel.org/patchwork/patch/1127742/
5311a98fef7d tasks, sched/core: RCUify the assignment of rq->curr
154abafc68bf tasks, sched/core: With a grace period after finish_task_switch(), remove unnecessary code
0ff7b2cfbae3 tasks, sched/core: Ensure tasks are available for a grace period after leaving the runqueue
3fbd7ee285b2 tasks: Add a count of task RCU users
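From this point on, rcu_dereference() can be used safely and the NULL check is no longer needed.
When membarrier_global_expedited was introduced, p was checked for NULL.
Then 08946eccabb9 sched/membarrier: Fix p->mm->membarrier_state racy load was merged,
which removed the original NULL check on p.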
08946eccabb9 sched/membarrier: Fix p->mm->membarrier_state racy load
cfd49aa06b94 sched: Clean up active_mm reference counting
987805770a3f sched/membarrier: Remove redundant check
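This is what introduced the problem.
mainline does not have this problem for the following reason.
mainline merged the patches from 2.4 first, so by then rq->curr was read with rcu_dereference() and could not be NULL. Merging the patch from 3.2 afterwards therefore caused no problem.
The openEuler version merged the patch from 3.2 first, which removed the NULL check on p, while the code still used task_rcu_dereference(), so p can be NULL.
Fix options:
1. Add a NULL check on p. This works around the problem described in this issue but leads to other problems, such as siginfo being undefined. See https://lkml.org/lkml/2019/8/30/574
2. Merge the task rcu users patches, which guarantee that an exiting task is freed only after a grace period has elapsed since it was dequeued. rcu_dereference() can then replace task_rcu_dereference(), and the NULL check is no longer needed.
2021/06/16 18:00 CCB conclusion: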