Version information:
[root@localhost 127.0.0.1-2023-11-22-10:49:24]# uname -a
Linux localhost.localdomain 5.10.0 #1 SMP Mon Nov 20 21:42:47 CST 2023 aarch64 aarch64 aarch64 GNU/Linux
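So far four builds of Open vSwitch have been installed, and starting any of them hangs the host:
1. The dpu-openEuler-20.03-LTS-SP1 branch of ovs, built against the openEuler22.03SP1 OLK kernel, with the patch applied.
2. The dpu-openEuler-20.03-LTS-SP1 branch of ovs, built against the openEuler22.03SP1 OLK kernel, without the patch. This confirms the patch is not the cause.
3. The openEuler-22.03-LTS-SP3 branch of ovs, built against the openEuler22.03SP1 OLK kernel. The latest version also fails on startup.
4. Installed directly from the openEuler22.03SP1 repository with yum install openvswitch. The stock version also fails on startup.
Starting Open vSwitch hangs the host. The following log was saved under /var/crash/: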
[46311.270509] capability: warning: `yum' uses 32-bit capabilities (legacy support in use)
[46433.638742] systemd-rc-local-generator[73122]: /etc/rc.d/rc.local is not marked executable, skipping.
[46448.504124] Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
[46448.512493] Modules linked in: binfmt_misc virtio_net net_failover failover virtio_pci virtio_pci_modern_dev virtio_ring virtio rfkill sch_fq_codel rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi ipmi_ssif hibmc_drm drm_vram_helper hns_roce_hw_v2 drm_ttm_helper acpi_ipmi ttm ib_uverbs ipmi_si ib_core hisi_uncore_ddrc_pmu hisi_uncore_l3c_pmu hisi_uncore_hha_pmu sg ipmi_devintf ipmi_msghandler hisi_uncore_pmu vfat fat hisdk3(OE) hiovs3(OE) hinic3(OE) hiudk3(OE) fuse ext4 mbcache jbd2 sd_mod t10_pi realtek hclge hisi_sas_v3_hw hisi_sas_main libsas ahci ghash_ce hinic libahci scsi_transport_sas sha2_ce sha256_arm64 hns3 sha1_ce libata sbsa_gwdt nfit libnvdimm host_edma_drv hnae3 i2c_designware_platform i2c_designware_core dm_mirror dm_region_hash dm_log dm_mod aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher
[46448.593479] CPU: 6 PID: 73228 Comm: ovsdb-server Kdump: loaded Tainted: G OE 5.10.0 #1
[46448.603301] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD, BIOS 1.86 01/10/2022
[46448.612170] pstate: a0400089 (NzCv daIf +PAN -UAO -TCO BTYPE=--)
[46448.618876] pc : armv8pmu_branch_reset+0xc/0x20
[46448.624099] lr : armpmu_add+0x64/0x120
[46448.628538] sp : ffff80003e0238c0
[46448.632544] x29: ffff80003e0238c0 x28: 0000000000000000
[46448.638997] x27: ffffae1293eed370 x26: ffff3c598e1f6800
[46448.645436] x25: 00000000fffffff5 x24: ffff1c5980696800
[46448.651868] x23: ffff1c5980696800 x22: ffff3c598e1f6800
[46448.658286] x21: 0000000000000001 x20: ffff1c4a0eb5bb00
[46448.664694] x19: ffff1c597fc83100 x18: 0000000000000000
[46448.671096] x17: 0000000000000000 x16: 0000000000000000
[46448.677487] x15: 0000000000000000 x14: 0000000000000000
[46448.683861] x13: 0000000000000000 x12: 0000000000000000
[46448.690223] x11: 00000000ffffffff x10: 00000000ffffffff
[46448.696574] x9 : ffffae12946e4a94 x8 : 0000000000000006
[46448.702913] x7 : ffff1c4a0eb5bb50 x6 : 0000000000000001
[46448.709243] x5 : 0000000000000003 x4 : ffff6e46ea973000
[46448.715559] x3 : ffff80003e0238c0 x2 : ffff6e46ea973000
[46448.721867] x1 : 0000000000000000 x0 : ffffae12946edfa0
[46448.728168] Call trace:
[46448.731603] armv8pmu_branch_reset+0xc/0x20
[46448.736763] event_sched_in+0xc8/0x1a0
[46448.741474] merge_sched_in+0x16c/0x41c
[46448.746259] visit_groups_merge.constprop.0.isra.0+0x19c/0x460
[46448.753027] ctx_sched_in+0x16c/0x180
[46448.757628] perf_event_sched_in+0x70/0xb0
[46448.762654] ctx_resched+0x64/0xb0
[46448.766973] __perf_event_enable+0x240/0x2f4
[46448.772150] event_function+0x80/0xf0
[46448.776716] remote_function+0x64/0x80
[46448.781361] generic_exec_single+0x100/0x170
[46448.786516] smp_call_function_single+0x150/0x19c
[46448.792106] event_function_call+0xb0/0x25c
[46448.797178] _perf_event_enable+0x9c/0x16c
[46448.802165] perf_event_for_each_child+0x3c/0x90
[46448.807671] _perf_ioctl+0x1ec/0x500
[46448.812142] perf_ioctl+0x4c/0x80
[46448.816352] __arm64_sys_ioctl+0xb0/0xf4
[46448.821167] invoke_syscall+0x50/0x11c
[46448.825807] el0_svc_common.constprop.0+0x158/0x164
[46448.831575] do_el0_svc+0x2c/0xac
[46448.835796] el0_svc+0x20/0x30
[46448.839758] el0_sync_handler+0xb0/0xb4
[46448.844499] el0_sync+0x160/0x180
[46448.848717] Code: d503201f aa1e03e9 d503201f d503233f (d509729f)
[46448.855760] SMP: stopping secondary CPUs
[46448.861877] Starting crashdump kernel...
[46448.866727] Bye!
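As a cross-check on the log above, the faulting instruction word (the parenthesized value in the Code: line) can be decoded by hand. The following is a minimal sketch, not a general disassembler. The helper names faulting_word and decode_sys are made up for this illustration, and the field layout is the standard A64 system-instruction encoding.

```python
import re

# Last "Code:" line of the oops; the word in parentheses is the instruction at pc.
CODE_LINE = "Code: d503201f aa1e03e9 d503201f d503233f (d509729f)"


def faulting_word(code_line: str) -> int:
    """Return the parenthesized instruction word from an arm64 oops 'Code:' line."""
    match = re.search(r"\(([0-9a-fA-F]{8})\)", code_line)
    if match is None:
        raise ValueError("no faulting instruction marked in Code: line")
    return int(match.group(1), 16)


def decode_sys(word: int) -> dict:
    """Split an A64 system-instruction word into its encoding fields."""
    # bits[31:22] == 0b1101010100 identifies the system-instruction class.
    if (word >> 22) != 0b1101010100:
        raise ValueError("not in the A64 system instruction class")
    return {
        "L": (word >> 21) & 0x1,
        "op0": (word >> 19) & 0x3,
        "op1": (word >> 16) & 0x7,
        "CRn": (word >> 12) & 0xF,
        "CRm": (word >> 8) & 0xF,
        "op2": (word >> 5) & 0x7,
        "Rt": word & 0x1F,
    }


fields = decode_sys(faulting_word(CODE_LINE))
print(fields)
```

The decoded fields (op0=1, op1=1, CRn=7, CRm=2, op2=4, Rt=31) appear to correspond to the BRB IALL instruction of the Arm Branch Record Buffer Extension. That would be consistent with an "Undefined instruction" fault inside armv8pmu_branch_reset on a CPU without BRBE. Treat this reading as an assumption to confirm against the Arm Architecture Reference Manual.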
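1. Root cause: the BRBE instruction is not supported on OLK-5.10, so executing it at startup hangs the host.
2. Verify whether disabling the CONFIG_BRBE macro resolves the hang.
3. The author who submitted the patch has been notified to fix the problem.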
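To check step 2 on a given machine, one option is to inspect the kernel build config for BRBE-related symbols. The sketch below makes two assumptions: the config is available at /boot/config-<release> (which depends on the distribution), and the option name contains "BRBE" (the comment above says CONFIG_BRBE, while the linked PR title says ARM64_BRBE). The function name brbe_options is hypothetical.

```python
import os
import re


def brbe_options(config_text: str) -> dict:
    """Map every BRBE-related Kconfig symbol in a kernel .config to its state."""
    states = {}
    for line in config_text.splitlines():
        # Enabled options look like: CONFIG_FOO=y
        enabled = re.match(r"^(CONFIG_\w*BRBE\w*)=(\S+)$", line)
        # Disabled options look like: # CONFIG_FOO is not set
        disabled = re.match(r"^# (CONFIG_\w*BRBE\w*) is not set$", line)
        if enabled:
            states[enabled.group(1)] = enabled.group(2)
        elif disabled:
            states[disabled.group(1)] = "n"
    return states


if __name__ == "__main__":
    # Assumed location of the running kernel's build config.
    path = f"/boot/config-{os.uname().release}"
    try:
        with open(path) as fh:
            print(brbe_options(fh.read()) or "no BRBE option found")
    except OSError as err:
        print(f"cannot read {path}: {err}")
```

If the result shows a BRBE symbol set to "y" on hardware without the extension, rebuilding with that option disabled is the experiment described in step 2.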
!3069: drivers: perf: Not enabled ARM64_BRBE by default
The fix has been merged and verified locally. Please check.