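[Title] Briefly describe the problem: in what scenario, what operation was performed, and what went wrong (prefer positive phrasing)
The PMU-based nmi_watchdog cannot be enabled properly
[Environment]
Hardware information:
1) For bare-metal scenarios, provide information about the affected hardware;
2) For virtual machine scenarios, provide the VM XML file or configuration information
Software information:
1) OS version and branch
2) Kernel information
3) Version of the component in which the problem was found
If a special network setup is used, please provide a network topology diagram
[Steps to reproduce]
Detailed steps
Frequency (always reproducible, or intermittent)
[Expected result]
Describe the expected result; it can be obtained by comparing the new and old versions
[Actual result]
Describe the faulty result
[Attachments]
For example: system message logs/component logs, dump information, screenshots, etc.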
Hi stkid, welcome to the openEuler Community.
I'm the Bot here serving you. You can find the instructions on how to interact with me at
https://gitee.com/openeuler/community/blob/master/en/sig-infrastructure/command.md.
If you have any questions, please contact the SIG: Kernel, and any of the maintainers: @Xie XiuQi, @YangYingliang, @成坚 (CHENG Jian).
Commit 60565144df0a ("init: only move down lockup_detector_init() when sdei_watchdog is enabled")
was meant to fix a 'BUG'. However, on ARM64 armv8_pmu_driver_init() is called in do_basic_setup(),
so the perf event cannot be created if lockup_detector_init() is moved back before it. So revert
that patch first, and then fix the original issue.
When CONFIG_DEBUG_PREEMPT and CONFIG_PREEMPT are both enabled, the PMU-based
nmi_watchdog initialization triggers a 'BUG':
[ 3.341853] BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
[ 3.344392] caller is debug_smp_processor_id+0x17/0x20
[ 3.344395] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #398
[ 3.344397] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[ 3.344399] Call Trace:
[ 3.344410] dump_stack+0x60/0x76
[ 3.344412] check_preemption_disabled+0xba/0xc0
[ 3.344415] debug_smp_processor_id+0x17/0x20
[ 3.344422] hardlockup_detector_event_create+0xf/0x60
[ 3.344427] hardlockup_detector_perf_init+0xf/0x41
[ 3.344430] watchdog_nmi_probe+0xe/0x10
[ 3.344432] lockup_detector_init+0x22/0x5b
[ 3.344437] kernel_init_freeable+0x20c/0x245
[ 3.344439] ? rest_init+0xd0/0xd0
[ 3.344441] kernel_init+0xe/0x110
[ 3.344446] ret_from_fork+0x22/0x30
This issue was introduced by commit a79050434b45, which moved
lockup_detector_init() down to after do_basic_setup(), and therefore after
sched_init_smp() as well.
hardlockup_detector_event_create
  |- hardlockup_detector_perf_init (unsafe)
  |   |- watchdog_nmi_probe
  |       |- lockup_detector_init
  |- hardlockup_detector_perf_enable
      |- watchdog_nmi_enable
          |- watchdog_enable
              |- lockup_detector_online_cpu
              |- softlockup_start_fn
                  |- softlockup_start_all
                      |- lockup_detector_reconfigure
                          |- lockup_detector_setup
                              |- lockup_detector_init
Analysis of the calling contexts shows that smp_processor_id() is unsafe
only in hardlockup_detector_perf_init(), because the 'kernel_init' thread
is preemptible after sched_init_smp().
That call is only a test of whether the PMU-based nmi_watchdog can be
enabled. The real enabling happens later in softlockup_start_fn(), which
ensures that watchdog_enable() is called on every core. So it is safe to
disable preemption around the test to fix this 'BUG'.
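
The reasoning above can be illustrated with a small userspace model. This is only a sketch and not the actual patch: every `model_*` name is hypothetical, the preemption counter is a plain integer, and the check is a simplification of what CONFIG_DEBUG_PREEMPT does (the real check also accepts other safe contexts, such as interrupts being disabled). It shows why reading the CPU id from a preemptible caller is reported, and why wrapping the probe in a disable/enable pair silences the report.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace model only: none of these are the real kernel implementations. */
static int model_preempt_count; /* > 0 means preemption is disabled */
static int model_bug_reports;   /* number of times the debug check fired */

static void model_preempt_disable(void)
{
    model_preempt_count++;
}

static void model_preempt_enable(void)
{
    /* Enable must always pair with an earlier disable. */
    assert(model_preempt_count > 0);
    model_preempt_count--;
}

/* Simplified stand-in for the debug check behind smp_processor_id():
 * complain when the caller could be migrated to another CPU. */
static int model_smp_processor_id(void)
{
    if (model_preempt_count == 0) {
        model_bug_reports++;
        fprintf(stderr, "BUG: using smp_processor_id() in preemptible code\n");
    }
    return 0; /* the model has a single CPU */
}

/* Stand-in for hardlockup_detector_event_create(): it reads the CPU id. */
static int model_event_create(void)
{
    int cpu = model_smp_processor_id();

    (void)cpu; /* a real implementation would create a per-CPU event here */
    return 0;
}

/* Probe as called from a preemptible thread, before the fix. */
static int model_perf_init_unfixed(void)
{
    return model_event_create();
}

/* Probe with preemption disabled for the duration of the test. */
static int model_perf_init_fixed(void)
{
    int ret;

    model_preempt_disable();
    ret = model_event_create();
    model_preempt_enable();
    return ret;
}
```

Calling `model_perf_init_unfixed()` increments `model_bug_reports`, while `model_perf_init_fixed()` leaves it unchanged and returns the counter to zero. In the kernel, the corresponding primitives would be preempt_disable() and preempt_enable(); where exactly the pair is placed in the real fix is not shown in this issue.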