openEuler / kernel

Toggling a PHY during write I/O to an attached SAS disk occasionally makes the chip report two CQs for a single I/O

Status: Done
Type: Defect
Created: 2022-11-24 18:48

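[Title] Brief description of the problem: with a SAS disk attached, toggling the PHY while write I/O is running occasionally makes the chip report two CQs for a single I/O
[Environment]
Hardware:
1) Kunpeng 920
Software: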
[root@localhost 0000:b4:02.0]# cat /etc/euleros-latest
eulerversion=EulerOS_Server_V200R008C00SPC300B630
compiletime=2019-12-27-10-58-38
kernelversion=4.19.36-vhulk1907.1.0.h619
[root@localhost 0000:b4:02.0]# uname -a
Linux localhost.localdomain 4.19.36-vhulk1907.1.0.h619.eulerosv2r8.aarch64 #1 SMP Mon Jul 22 00:00:00 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux
[Steps to reproduce]
Procedure: while write I/O is being issued to the SAS disk, repeatedly unplug and re-plug the disk.
Occurrence rate: 10%
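One way to script the link toggling is sketched below. It assumes the PHY is switched through the `enable` attribute that the Linux SAS transport class exposes under `/sys/class/sas_phy/`. The report does not say how the PHY was toggled, so the attribute path, the cycle count, and the delays are all placeholders, and the write workload is assumed to be started separately.

```python
import time
from pathlib import Path


def toggle_phy(enable_path, cycles, down_s=1.0, up_s=2.0, sleep=time.sleep):
    """Write 0 then 1 to a SAS PHY `enable` attribute, `cycles` times.

    enable_path is an assumption, e.g. /sys/class/sas_phy/phy-<host>:<n>/enable.
    down_s and up_s are illustrative delays, not values from the report.
    """
    path = Path(enable_path)
    for _ in range(cycles):
        path.write_text("0")  # disable the PHY, the link goes down
        sleep(down_s)
        path.write_text("1")  # re-enable the PHY, the link comes back up
        sleep(up_s)
    return cycles
```

On a test machine this would be called with the real attribute path as root while a write workload is running against the disk. The `sleep` parameter exists only so the loop can be exercised against an ordinary file without waiting.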
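[Expected result]
I/O submissions and I/O completions correspond one to one: each I/O gets exactly one CQ.
[Actual result]
A single I/O gets two CQs, one after the other. In the current driver flow this does not corrupt memory when the hardware writes the second CQ, because after the first abnormal CQ the driver either does not release the resources or issues an abort command before releasing them. The behavior is still not what is expected.
[Solution]
When the driver receives any abnormal CQ, it sets the IPTT of that CQ to the aborted state. By design of the SAS IP, once the aborted state is set, the response frame that the disk sends back a second time is blocked until the IPTT is reused.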
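The effect of the proposed fix can be illustrated with a toy model. This is not driver code: the class, its methods, and the flag handling are all hypothetical and capture only the rule described above, namely that an IPTT marked aborted accepts no further response frame until it is reused.

```python
class ToyController:
    """Toy model of IPTT slots; every name here is hypothetical."""

    def __init__(self, mark_aborted_on_error):
        self.mark_aborted_on_error = mark_aborted_on_error
        self.slots = {}    # iptt -> {"aborted": bool}
        self.cq_log = []   # (iptt, is_error) for each CQ entry posted

    def issue(self, iptt):
        # Issuing an I/O (re)uses the IPTT, which clears the aborted state.
        self.slots[iptt] = {"aborted": False}

    def response_frame(self, iptt, is_error):
        """A response arrives for `iptt`. Return True if a CQ entry is posted."""
        slot = self.slots.get(iptt)
        if slot is None or slot["aborted"]:
            return False   # frame is blocked, no CQ entry is written
        self.cq_log.append((iptt, is_error))
        if is_error and self.mark_aborted_on_error:
            slot["aborted"] = True   # models the driver reacting to an abnormal CQ
        return True


def replay(mark_aborted_on_error):
    """Replay the reported sequence: one I/O, then two abnormal responses."""
    ctrl = ToyController(mark_aborted_on_error)
    ctrl.issue(123)
    ctrl.response_frame(123, is_error=True)   # first abnormal CQ
    ctrl.response_frame(123, is_error=True)   # second response from the disk
    return len(ctrl.cq_log)
```

In this model `replay(False)` yields two CQ entries for IPTT 123, matching the reported behavior, while `replay(True)` yields one. After a later `issue(123)`, responses are accepted again, which corresponds to the "until the IPTT is reused" condition.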
[Attachments]
Logs:
I/O with IPTT 123, first abnormal CQ received:
[80970.717501] hisi_sas_v3_hw 0000:74:02.0: phydown: phy1 phy_state=0xf9
[80970.717505] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=123 task=000000003d169657 dev id=845 sas_addr=0x5000cca0708597e1 CQ hdr: 0x203 0x34d007b 0x0 0x20400 Error info: 0x0 0x0 0x0 0x0
[80970.717507] hisi_sas_v3_hw 0000:74:02.0: Enter sas_task_abort().
[80970.717510] hisi_sas_v3_hw 0000:74:02.0: device id 845, IPTT 123 is complete
[80971.737888] hisi_sas_v3_hw 0000:74:02.0: phyup: phy1 link_rate=10
[80973.737626] hisi_sas_v3_hw 0000:74:02.0: slot complete: port 1 has removed
[80973.737632] hisi_sas_v3_hw 0000:74:02.0: slot complete: port 1 has removed
[80973.737646] hisi_sas_v3_hw 0000:74:02.0: slot complete: port 1 has removed
[80973.737649] scsi_io_completion_action: 53 callbacks suppressed

I/O with IPTT 123, second CQ received:
[80973.737860] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=123 task=000000003d169657 dev id=845 sas_addr=0x5000cca0708597e1 CQ hdr: 0x1503 0x34d007b 0x0 0x20000 Error info: 0x11800 0x0 0x0 0x40
[80973.737861] scsi 6:0:845:0: rejecting I/O to dead device
[80973.737861] scsi 6:0:845:0: rejecting I/O to dead device
[80973.737863] hisi_sas_v3_hw 0000:74:02.0: data underflow, rsp_code:0x70, sensekey:0xb, ASC:0x4b, ASCQ:0x6.
[80973.737864] hisi_sas_v3_hw 0000:74:02.0: Enter sas_task_abort().
[80973.737864] scsi 6:0:845:0: rejecting I/O to dead device
[80973.737868] scsi 6:0:845:0: rejecting I/O to dead device
[80973.737870] hisi_sas_v3_hw 0000:74:02.0: device id 845, IPTT 123 is complete
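Duplicate completions like the pair above can be found mechanically in a kernel log. The sketch below assumes the `erroneous completion iptt=... task=...` message format shown in these excerpts. The function name is made up for illustration, and the sample lines are shortened copies of the two log lines above.

```python
import re
from collections import Counter

# Matches the driver message format seen in the attached log.
PATTERN = re.compile(r"erroneous completion iptt=(\d+) task=([0-9a-f]+)")


def duplicate_completions(lines):
    """Return {(iptt, task): count} for every pair reported more than once."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return {key: n for key, n in counts.items() if n > 1}


SAMPLE = [
    "[80970.717501] hisi_sas_v3_hw 0000:74:02.0: phydown: phy1 phy_state=0xf9",
    "[80970.717505] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=123 "
    "task=000000003d169657 dev id=845",
    "[80973.737860] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=123 "
    "task=000000003d169657 dev id=845",
]
```

For `SAMPLE`, `duplicate_completions` returns `{(123, "000000003d169657"): 2}`. Keying on both the IPTT and the task value reduces false positives from an IPTT that has been reused for a different I/O, but it is only a heuristic: a hit identifies a candidate to inspect, not proof of a double CQ.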

Comments (1)

jamyyxg created the defect
