openEuler / gazelle

Startup error: gazelle_network_init:226 init_protocol_stack failed

Status: Done
Type: Bug
Created: 2022-05-21 15:58

【Title】
After installing gazelle on openEuler, launching netperf with gazelle fails with the error: init_protocol_stack failed

【Environment】
Hardware:
CPU: 2 x Kunpeng 920 5231K
NIC 1: TM210 4 x GE
NIC 2: SP570 4 x 25GE
Software:
1) openEuler 22.03 LTS
2) Linux localhost.localdomain 5.10.0-60.9.0.40.oe1.aarch64
3) gazelle-1.0.1-2.oe1.aarch64
4) Two NICs, one 10-gigabit and one gigabit. The gigabit NIC has a bridge configured.

【Steps to reproduce】
Detailed steps

yum install dpdk
yum install libconfig
yum install numactl
yum install libboundscheck
yum install libpcap
yum install gazelle

modprobe vfio-pci
dpdk-devbind -b vfio-pci enp131s0

echo 2000 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 0 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
echo 0 > /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
echo 0 > /sys/devices/system/node/node3/hugepages/hugepages-2048kB/nr_hugepages

mkdir -p /mnt/hugepages
mkdir -p /mnt/hugepages-2M
chmod -R 700 /mnt/hugepages
chmod -R 700 /mnt/hugepages-2M
mount -t hugetlbfs nodev /mnt/hugepages
mount -t hugetlbfs nodev /mnt/hugepages-2M

GAZELLE_BIND_PROCNAME=netserver LD_PRELOAD=/usr/lib64/liblstack.so netserver -4 -L 192.168.1.131 -p 9999

Configuration file: /etc/gazelle/lstack.conf

dpdk_args=["--socket-mem", "2480,0,0,0", "--huge-dir", "/mnt/hugepages-2M", "--proc-type", "primary", "--legacy-mem", "--map-perfect"]

use_ltran=0
kni_switch=0

low_power_mode=0

num_cpus="0,1,2,3,4,5,6,7"
num_weakup="8,9,10,11,12,13,14,15"

numa_bind=0

host_addr="192.168.1.131"
mask_addr="255.255.255.0"
gateway_addr="192.168.1.1"
devices="7c:1c:f1:4f:16:59"

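Frequency (always reproducible or intermittent)
Always reproducible

【Expected result】
The application starts normally.

【Actual result】
Fails with the error: init_protocol_stack failed

【Attachments】

Starting netserver with Gazelle fails with the following output: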

[root@localhost ~]# GAZELLE_BIND_PROCNAME=netserver LD_PRELOAD=/usr/lib64/liblstack.so netserver -4 -L 192.168.1.131 -p 9999
dpdk argv: --socket-mem 2048,0,0,0 --huge-dir /mnt/hugepages-2M --proc-type primary --legacy-mem --map-perfect
EAL: Detected CPU lcores: 64
EAL: Detected NUMA nodes: 4
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No free 2048 kB hugepages reported on node 1
EAL: No free 2048 kB hugepages reported on node 2
EAL: No free 2048 kB hugepages reported on node 3
EAL: No available 32768 kB hugepages reported
EAL: No available 64 kB hugepages reported
EAL: 15 hugepages of size 1073741824 reserved, but no mounted hugetlbfs found for that size
EAL: VFIO support initialized
EAL: Using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_hinic (19e5:1822) device: 0000:83:00.0 (socket 2)
EAL: Releasing PCI mapped resource for 0000:83:00.0
EAL: Calling pci_unmap_resource for 0000:83:00.0 at 0x180600000
EAL: Calling pci_unmap_resource for 0000:83:00.0 at 0x180620000
EAL: Calling pci_unmap_resource for 0000:83:00.0 at 0x180628000
EAL: Requested device 0000:83:00.0 cannot be used
TELEMETRY: Error with accept, telemetry thread quitting
TELEMETRY: No legacy callbacks, legacy socket not created
LSTACK: gazelle_network_init:212 create control_easy_thread success
LSTACK: ethdev_port_id:307 No NIC is matched
LSTACK: init_protocol_stack:212 dpdk_ethdev_init failed
EAL: Error - exiting with code: 1
  Cause: gazelle_network_init:226 init_protocol_stack failed

Comments (6)

TOTORO created the bug

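The final failure is at LSTACK: ethdev_port_id:307 No NIC is matched, meaning that no matching user-space NIC was found.

1. Run dpdk-devbind -s and check whether any NIC is listed as using a DPDK driver.

2. Check whether dpdk-devbind -b vfio-pci enp131s0 reports any error.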
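
The binding state can also be read straight from sysfs, independently of the dpdk-devbind output. The following is a minimal sketch, assuming the PCI address 0000:83:00.0 taken from the EAL log in this issue; the function name `pci_bound_driver` and the `SYSFS_ROOT` override are hypothetical helpers for illustration and are not part of DPDK or Gazelle.

```shell
# Hypothetical helper: print the kernel driver a PCI device is currently bound to.
# SYSFS_ROOT defaults to /sys; overriding it is only meant for dry runs.
pci_bound_driver() {
    dev="$1"
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$dev/driver"
    if [ -L "$link" ]; then
        # "driver" is a symlink to the driver directory, e.g. .../drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        # No symlink means no driver is bound to the device
        echo "none"
    fi
}
```

If the bind step worked, `pci_bound_driver 0000:83:00.0` would be expected to report vfio-pci (or igb_uio, depending on the driver chosen).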

First of all, thank you for your reply :smile:

  1. Running dpdk-devbind -s does show one NIC under "Network devices using DPDK-compatible driver".

  2. Running dpdk-devbind -b vfio-pci enp131s0 reported no error.
    I rebound enp131s0 and still get the same error. The steps were:
# Unbind from DPDK
dpdk-devbind -u 0000:83:00.0
# Bind to the kernel hinic driver (writes "vendor device" IDs to new_id)
lspci -ns 0000:83:00.0 |awk -F':| ' '{print $5" "$6}' > /sys/bus/pci/drivers/hinic/new_id
# Rebind to DPDK
dpdk-devbind -b vfio-pci enp131s0

The same error :sob:

I'll try switching to igb_uio.

Switching the driver to igb_uio gives the same error. I don't know what is going wrong; starting ltran also reports No NIC is matched.

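The NIC is on NUMA node 2 (the EAL log reports device 0000:83:00.0 on socket 2), so the memory and CPUs must also be taken from NUMA node 2.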
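
As a rough illustration of what this means for the memory setting: the DPDK `--socket-mem` option takes one value per NUMA node, so the memory has to move from the first position to the third. The helper below is a hypothetical sketch (the name `socket_mem_arg` is not part of DPDK or Gazelle), and the 2048 MB figure is simply carried over from the log in this issue.

```shell
# Hypothetical helper: build a --socket-mem value that puts all memory on one node.
# Arguments: target NUMA node, total number of NUMA nodes, memory in MB.
socket_mem_arg() {
    target="$1"; nodes="$2"; mb="$3"
    out=""
    i=0
    while [ "$i" -lt "$nodes" ]; do
        if [ "$i" -eq "$target" ]; then val="$mb"; else val=0; fi
        # Join the per-node values with commas, in node order
        out="${out:+$out,}$val"
        i=$((i + 1))
    done
    echo "$out"
}

socket_mem_arg 2 4 2048   # prints 0,0,2048,0
```

The resulting string would replace the "2480,0,0,0" value in dpdk_args, assuming the rest of the configuration is adjusted to node 2 as well.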

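Troubleshooting approach:

  1. Determine which CPU (NUMA node) the NIC is attached to; suppose it is CPU2.
  2. The core-binding settings in lstack.conf must use cores that belong to CPU2.
  3. The NUMA settings in lstack.conf must bind to CPU2's NUMA node.
  4. Hugepages must be allocated on CPU2's NUMA node.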
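
The first step, and the inputs needed for the other three, can be read from sysfs. This is a minimal sketch under stated assumptions: the helper names and the `SYSFS_ROOT` override are hypothetical, the paths are the standard Linux sysfs locations, and 0000:83:00.0 is the PCI address from the log in this issue.

```shell
# Hypothetical helpers; SYSFS_ROOT defaults to /sys and exists only to allow dry runs.

# Step 1: NUMA node the NIC is attached to (-1 means the kernel reports no affinity).
nic_numa_node() {
    cat "${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/numa_node"
}

# Steps 2 and 3: cores that belong to that node, candidates for num_cpus / num_weakup.
node_cpulist() {
    cat "${SYSFS_ROOT:-/sys}/devices/system/node/node$1/cpulist"
}

# Step 4: file that controls the 2 MB hugepage reservation on that node.
node_hugepage_file() {
    echo "${SYSFS_ROOT:-/sys}/devices/system/node/node$1/hugepages/hugepages-2048kB/nr_hugepages"
}
```

For example, `node=$(nic_numa_node 0000:83:00.0)` followed by `node_cpulist "$node"` lists the cores to choose from, and `echo 2000 > "$(node_hugepage_file "$node")"` would reserve the same number of pages as in the original report, but on the NIC's own node.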