src-openEuler / gazelle

[22.03SP2] [Intermittent] Client coredumps after running traffic through gazelle for a long time (poll_rpc_msg)

Status: Completed
Type: Defect
Created: 2023-11-22 11:09

[Environment]
openeulerversion=openEuler-22.03-LTS-SP2
compiletime=2023-06-29-19-26-48
gccversion=10.3.1-37.oe2203sp2
kernelversion=5.10.0-153.12.0.92.oe2203sp2
openjdkversion=1.8.0.372.b07-1.oe2203sp2
[root@openEuler ~]# rpm -q gazelle
gazelle-1.0.2-15.aarch64
[root@openEuler ~]# rpm -q dpdk
dpdk-21.11-50.oe2203sp2.aarch64

[Steps to reproduce] (describe the specific steps)
Start the server in kernel mode: ./benchmark_ker -sMode dn -pSize 0 -mSize 4096 -pdSize 2 -cTimes 0 -uSocket 0 -mSq 0 -pol 0 -md5Check 0

[root@openEuler gazelle]# cat config.ini
[MicroBenchmark]
#publicd
DebugMode=0
TestMode=1 #1 ONLY_TX  2 BOTH_TX_RX
MsgHeadLen=30

#cn/dn
#CnHostName=124.88.97.87
CnHostName=124.88.97.88
CnPort=3113

#CnHostName=124.88.97.87
Dn1HostName=192.168.133.169
Dn2HostName=192.168.133.169
Dn1Port=41111
Dn2Port=51111
ThreadPoolSize=30
#client
ThreadNums=30
ReportDuring=15

Start the client in user mode: ./benchmark_usr -sMode client -mSize 4096 -tNums 3 -cNums 30 --flow_mode high -uSocket 0 -md5Check 0 -cNb 1

[root@openEuler gazelle]# cat config.ini
[MicroBenchmark]
#publicd
DebugMode=0
TestMode=1 #1 ONLY_TX  2 BOTH_TX_RX
MsgHeadLen=30

#cn/dn
#CnHostName=124.88.97.87
CnHostName=124.88.97.87
CnPort=3114

#CnHostName=124.88.97.87
Dn1HostName=192.168.133.169
Dn2HostName=192.168.133.169
Dn1Port=41111
Dn2Port=51111
ThreadPoolSize=30
#client
ThreadNums=30
ReportDuring=15

dpdk_args=["--socket-mem", "2400,0,0,0", "--huge-dir", "/mnt/hugepages-lstack", "--proc-type", "primary", "--legacy-mem", "--map-perfect","--file-prefix","lstack_file2","-a","0000:0c:00.0"]
use_ltran=0
kni_switch=0
low_power_mode=0
num_cpus="2-4"
#num_wakeup="2-4"
app_bind_numa=1
host_addr="124.88.70.176"
mask_addr="255.255.0.0"
gateway_addr="124.88.0.1"
devices="24:a5:2c:d1:ed:4f"

send_connect_number=8
read_connect_number=8
rpc_number=8
nic_read_number=128
tcp_conn_count=1500
mbuf_count_per_conn=505
stack_thread_mode="run-to-completion"
unix_prefix="02"
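
The stack settings above are flat `key=value` lines whose values happen to be valid JSON literals (numbers, quoted strings, and a list for `dpdk_args`). A minimal sketch for loading them when comparing reproduction environments follows. The helper names `parse_settings` and `expand_cpus` are hypothetical, and treating `num_cpus="2-4"` as an inclusive CPU range is an assumption, not something this report states.

```python
import json

def parse_settings(text):
    """Parse flat key=value lines; '#' lines and blank lines are skipped."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Values in the sample are JSON literals: 0, "2-4", ["-a", "..."]
        settings[key.strip()] = json.loads(value.strip())
    return settings

def expand_cpus(spec):
    """Expand a spec such as "2-4" or "2-4,6" (assumed inclusive ranges)."""
    cpus = []
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        cpus.extend(range(int(lo), int(hi or lo) + 1))
    return cpus

SAMPLE = """
dpdk_args=["--socket-mem", "2400,0,0,0", "--file-prefix", "lstack_file2", "-a", "0000:0c:00.0"]
use_ltran=0
num_cpus="2-4"
#num_wakeup="2-4"
tcp_conn_count=1500
mbuf_count_per_conn=505
"""

if __name__ == "__main__":
    cfg = parse_settings(SAMPLE)
    print(expand_cpus(cfg["num_cpus"]), cfg["tcp_conn_count"])
```

Commented-out keys such as `#num_wakeup` are ignored, so only active settings appear in the result.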

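[Actual result] (describe the faulty result and its impact)
Connections cannot be established normally, and the client eventually coredumps.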

Core was generated by `./benchmark_usr -sMode client -mSize 4096 -tNums 3 -cNums 30 --flow_mode high -'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000ffffb17f3788 in do_memp_malloc_pool (desc=0xffffa7ff8f70) at core/memp.c:272
272	core/memp.c: No such file or directory.
[Current thread is 1 (Thread 0xffffa7ff7ea0 (LWP 775188))]
(gdb) bt
#0  0x0000ffffb17f3788 in do_memp_malloc_pool (desc=0xffffa7ff8f70) at core/memp.c:272
#1  0x0000ffffb17f5488 in memp_malloc (type=type@entry=MEMP_TCP_PCB) at core/memp.c:354
#2  0x0000ffffb17f96d0 in tcp_alloc (prio=prio@entry=64 '@') at core/tcp.c:2043
#3  0x0000ffffb17f99e8 in tcp_new_ip_type (type=0 '\000') at core/tcp.c:2176
#4  0x0000ffffb17ec448 in pcb_new (msg=0xffffa7ff7428) at api/api_msg.c:685
#5  lwip_netconn_do_newconn (m=m@entry=0xffffa7ff7428) at api/api_msg.c:714
#6  0x0000ffffb17f18f4 in tcpip_send_msg_wait_sem (fn=0xffffb17ec3a0 <lwip_netconn_do_newconn>, apimsg=apimsg@entry=0xffffa7ff7428, 
    sem=sem@entry=0x119dd02c0) at api/tcpip.c:459
#7  0x0000ffffb17ea5b8 in netconn_apimsg (apimsg=0xffffa7ff7428, fn=<optimized out>) at api/api_lib.c:131
#8  netconn_new_with_proto_and_callback (t=<optimized out>, proto=proto@entry=0 '\000', callback=<optimized out>) at api/api_lib.c:161
#9  0x0000ffffb17efa88 in lwip_socket (domain=<optimized out>, type=1, protocol=protocol@entry=0) at api/sockets.c:1930
#10 0x0000ffffb1817f74 in do_lwip_socket (domain=<optimized out>, type=<optimized out>, protocol=<optimized out>)
    at core/lstack_lwip.c:1192
#11 0x0000ffffb181a670 in stack_socket (msg=0x109a78600) at core/lstack_protocol_stack.c:687
#12 0x0000ffffb181bfbc in poll_rpc_msg (stack=stack@entry=0xffffa0000b70, max_num=7) at core/lstack_thread_rpc.c:107
#13 0x0000ffffb1819ea0 in stack_polling (wakeup_tick=wakeup_tick@entry=120097671) at core/lstack_protocol_stack.c:448
#14 0x0000ffffb181a020 in gazelle_stack_thread (arg=<optimized out>) at core/lstack_protocol_stack.c:513
#15 0x0000ffffb15d1630 in start_thread (arg=0x0) at pthread_create.c:443
#16 0x0000ffffb1637b9c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79
(gdb) info thread
  Id   Target Id                          Frame 
* 1    Thread 0xffffa7ff7ea0 (LWP 775188) 0x0000ffffb17f3788 in do_memp_malloc_pool (desc=0xffffa7ff8f70) at core/memp.c:272
  2    Thread 0xffffb0152ea0 (LWP 775180) 0x0000ffffb1637d3c in __GI_epoll_pwait (epfd=<optimized out>, events=0xffffb0152548, 
    maxevents=1, timeout=-1, set=0x0) at ../sysdeps/unix/sysv/linux/epoll_pwait.c:40
  3    Thread 0xffffac8e2ea0 (LWP 775187) 0x0000ffffb17ebb34 in lwip_netconn_do_writemore (conn=0x10ee25df8, 
    delayed=delayed@entry=0 '\000') at api/api_msg.c:1801
  4    Thread 0xffffaf132ea0 (LWP 775182) 0x0000ffffb1639ac4 in __recvmsg_syscall (flags=0, msg=0xffffaf132218, fd=<optimized out>)
    at ../sysdeps/unix/sysv/linux/recvmsg.c:27
  5    Thread 0xffffb015b020 (LWP 775179) 0x0000ffffb1603dd4 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry=0, 
    flags=flags@entry=0, req=req@entry=0xffffc938d098, rem=rem@entry=0xffffc938d098) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
  6    Thread 0xffffaf942ea0 (LWP 775181) 0x0000ffffb1637d3c in __GI_epoll_pwait (epfd=<optimized out>, events=0xffffaf9424e0, 
    maxevents=3, timeout=-1, set=0x0) at ../sysdeps/unix/sysv/linux/epoll_pwait.c:40
  7    Thread 0xffffa77e7ea0 (LWP 775189) 0x0000ffffb17f18b8 in tcpip_send_msg_wait_sem (fn=0xffffb17ecf70 <lwip_netconn_do_write>, 
    apimsg=apimsg@entry=0xffffa77e73a8, sem=0x119bf7450) at api/tcpip.c:456
  8    Thread 0xffff8fff7ea0 (LWP 775194) __pthread_spin_lock (lock=lock@entry=0x109a78600) at pthread_spin_lock.c:71
  9    Thread 0xffffa5fb7ea0 (LWP 775192) 0x0000ffffb1637d3c in __GI_epoll_pwait (epfd=<optimized out>, events=0xffff98000d78, 
    maxevents=512, timeout=-1, set=0x0) at ../sysdeps/unix/sysv/linux/epoll_pwait.c:40
  10   Thread 0xffffa67c7ea0 (LWP 775191) 0x0000ffffb1637d3c in __GI_epoll_pwait (epfd=<optimized out>, events=0xffffa0000d78, 
    maxevents=512, timeout=-1, set=0x0) at ../sysdeps/unix/sysv/linux/epoll_pwait.c:40
  11   Thread 0xffffa6fd7ea0 (LWP 775190) 0x0000ffffb1637d3c in __GI_epoll_pwait (epfd=<optimized out>, events=0xffffa8000d78, 
    maxevents=512, timeout=-1, set=0x0) at ../sysdeps/unix/sysv/linux/epoll_pwait.c:40
  12   Thread 0xffffa4a39ea0 (LWP 775193) futex_wait (private=0, expected=2, futex_word=0xffff88001c30)
    at ../sysdeps/nptl/futex-internal.h:146
  13   Thread 0xffff8f7e7ea0 (LWP 775195) futex_wait (private=0, expected=2, futex_word=0xffff84001c30)
    at ../sysdeps/nptl/futex-internal.h:146
  14   Thread 0xffffae922ea0 (LWP 775183) 0x0000ffffb1629b40 in __GI___libc_read (nbytes=1, buf=0xffffae9224c7, fd=<optimized out>)
    at ../sysdeps/unix/sysv/linux/read.c:26
  15   Thread 0xffffad902ea0 (LWP 775185) 0x0000ffffb1639600 in __libc_accept (fd=<optimized out>, addr=..., len=0x0)
    at ../sysdeps/unix/sysv/linux/accept.c:26
  16   Thread 0xffffad0f2ea0 (LWP 775186) 0x0000ffffb1639600 in __libc_accept (fd=<optimized out>, addr=..., len=0xffffad0f144c)
    at ../sysdeps/unix/sysv/linux/accept.c:26
  17   Thread 0xffffae112ea0 (LWP 775184) 0x0000ffffb1629b40 in __GI___libc_read (nbytes=1, buf=0xffffae1124c7, fd=<optimized out>)
    at ../sysdeps/unix/sysv/linux/read.c:26
(gdb) 
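
When comparing several core dumps of an intermittent crash like this one, it can help to reduce each backtrace to its call chain. The sketch below is one way to do that. The helper names `parse_frames` and `call_chain` are hypothetical, the sample is a subset of the frames shown above, and the regular expression only covers the two frame shapes that appear in this dump (with and without a leading address).

```python
import re

# Matches gdb frame lines such as
#   "#1  0x0000ffffb17f5488 in memp_malloc (type=...) at core/memp.c:354"
# and inlined frames without an address, such as
#   "#5  lwip_netconn_do_newconn (m=...) at api/api_msg.c:714".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([A-Za-z_]\w*)\s*\(")

def parse_frames(bt_text):
    """Return [(frame_number, function_name), ...] from gdb `bt` output."""
    frames = []
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((int(m.group(1)), m.group(2)))
    return frames

def call_chain(bt_text):
    """Join function names with the outermost caller first."""
    return " -> ".join(name for _, name in reversed(parse_frames(bt_text)))

SAMPLE_BT = """\
#0  0x0000ffffb17f3788 in do_memp_malloc_pool (desc=0xffffa7ff8f70) at core/memp.c:272
#1  0x0000ffffb17f5488 in memp_malloc (type=type@entry=MEMP_TCP_PCB) at core/memp.c:354
#2  0x0000ffffb17f96d0 in tcp_alloc (prio=prio@entry=64 '@') at core/tcp.c:2043
#5  lwip_netconn_do_newconn (m=m@entry=0xffffa7ff7428) at api/api_msg.c:714
#12 0x0000ffffb181bfbc in poll_rpc_msg (stack=stack@entry=0xffffa0000b70, max_num=7) at core/lstack_thread_rpc.c:107
"""

if __name__ == "__main__":
    print(call_chain(SAMPLE_BT))
```

Continuation lines of wrapped frames (those that start with whitespace rather than `#`) are skipped, which is sufficient for extracting function names.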

Comments (2)

chenshijuan3 created the defect

Hi chenshijuan3, welcome to the openEuler Community.
I'm the Bot here serving you. You can find the instructions on how to interact with me at Here.
If you have any questions, please contact the SIG: sig-high-performance-network, and any of the maintainers: @L.X. , @LemmyHuang , @sky , @李扬扬 , @吴昌盛 , @jinag12 , @lilijun , @李辉松 , @kircher

openeuler-ci-bot added the sig/sig-high-perform label
chenshijuan3 edited the title
chenshijuan3 edited the description
chenshijuan3 edited the description
chenshijuan3 edited the description
jinag12 edited the title
jinag12 edited the title
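Fixed by the PR below. Data was still being sent after the fd had been closed, which caused a use-after-free (UAF). The new version ran continuously overnight without a coredump.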
https://gitee.com/openeuler/gazelle/pulls/435
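
The comment above attributes the crash to a send that runs after the fd has been closed. The toy model below sketches why such a late write can surface later as a crash inside the allocator rather than at the send itself. It assumes a fixed-size pool whose free list is threaded through the free elements, loosely modeled on lwIP's memp pools. The classes `Pool` and `Conn` and the `guarded` flag are hypothetical illustrations, and this is not the change made in the linked PR.

```python
class Pool:
    """Fixed-size pool; a free slot stores the index of the next free slot."""

    def __init__(self, size):
        self.slots = [i + 1 for i in range(size)]
        self.slots[-1] = None          # None terminates the free list
        self.free_head = 0

    def malloc(self):
        idx = self.free_head
        if idx is None:
            return None                # pool exhausted
        if not (isinstance(idx, int) and 0 <= idx < len(self.slots)):
            # Stands in for the SIGSEGV a real allocator would hit when it
            # dereferences a garbage free-list pointer.
            raise MemoryError(f"corrupted free list head: {idx!r}")
        self.free_head = self.slots[idx]
        self.slots[idx] = "pcb"
        return idx

    def free(self, idx):
        self.slots[idx] = self.free_head
        self.free_head = idx


class Conn:
    def __init__(self, pool):
        self.pool = pool
        self.pcb = pool.malloc()
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            self.pool.free(self.pcb)

    def send(self, data, guarded=True):
        if guarded and self.closed:
            return False               # drop the late send
        self.pool.slots[self.pcb] = data   # writes into the (possibly freed) slot
        return True


def demo(guarded):
    pool = Pool(2)
    conn = Conn(pool)
    conn.close()
    conn.send(0xDEADBEEF, guarded=guarded)  # send arrives after close
    try:
        pool.malloc()   # pops the overwritten slot, loads the bad link
        pool.malloc()   # follows the bad link
    except MemoryError:
        return "crash"
    return "ok"


if __name__ == "__main__":
    print(demo(guarded=False), demo(guarded=True))
```

In the unguarded case the stale write replaces the free-list link, the first allocation succeeds, and the failure only appears on the following allocation. This is consistent with a fault in `do_memp_malloc_pool` on a thread that is merely creating a new socket.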

jinag12 changed the task status from To Do to Completed
