openEuler / kernel

[20.03-LTS-SP3] Network intermittently fails after the computer resumes from suspend (x86)

Status: Fixing
Type: Bug
Created: 2022-02-25 11:29

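[Problem description]
The network intermittently fails after the computer resumes from suspend. The symptoms after resume are:
1) If the system network is configured via DHCP, the system cannot obtain an IP address.
2) If the system uses a static IP address, the system cannot ping external hosts, and external hosts cannot ping the system.
3) When the fault occurs, bringing the NIC down and then up again restores normal network function.
4) With the changes from commits 933b5fc092f67cd4f7bdde31da20b2f211a071b8 and 2f554490fcd189fd7a49d5dac95742a8e89b43d0 removed, the test passes.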
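
As a stopgap for symptom 3, the down/up cycle could be run automatically after every resume. The following is only a sketch under stated assumptions, not a fix for the underlying problem: it assumes the script is installed as a systemd sleep hook (systemd runs executables in /usr/lib/systemd/system-sleep/ with the arguments `pre` or `post` followed by the sleep type), it uses `ip link set` in place of whatever down/up command the reporter used, and the interface name `eth0` is hypothetical.

```python
import subprocess
import sys


def bounce_commands(iface):
    """Return the two `ip link` commands that take iface down and back up."""
    base = ["ip", "link", "set", "dev", iface]
    return [base + ["down"], base + ["up"]]


def bounce(iface):
    """Run the down/up cycle; needs root privileges."""
    for cmd in bounce_commands(iface):
        subprocess.run(cmd, check=True)


def main(argv):
    # systemd invokes sleep hooks as: <hook> pre|post suspend|hibernate|...
    # Only act after the system has come back from suspend.
    if len(argv) >= 3 and argv[1] == "post" and argv[2] == "suspend":
        bounce("eth0")  # hypothetical interface name, adjust to the real NIC


if __name__ == "__main__":
    main(sys.argv)
```

Keeping the command construction in `bounce_commands` separate from execution makes it possible to check the commands without touching a real interface.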

[Hardware information]
Desktop: DELL OptiPlex 3050
NICs: RTL8168 integrated NIC + RTL8161 discrete NIC

[Software information]
1) OS version and branch: openEuler-20.03-LTS-SP3
2) openEuler-latest system information:
openeulerversion=openEuler-20.03-LTS-SP3
compiletime=2022-01-29-12-01-40
gccversion=7.3.0-20211227.44.oe1
kernelversion=4.19.90-2201.4.0.0135.oe1
openjdkversion=1.8.0.312.b07-10.oe1
3) Kernel version: 4.19.90-2202.3.0.0138.oe1.x86_64

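[Steps to reproduce]
1. Configure the system network to use DHCP.
2. Run systemctl suspend to suspend the system.
3. After the system has suspended successfully, wake it up.
4. Check the system IP with ifconfig; the system never obtains an IP address.
Occurrence rate: 20% (lower with the DELL OptiPlex 3050 integrated NIC; higher with the 8168 discrete NIC, up to 50%)
Note: when the fault occurs on a system that uses a static IP address, the system cannot ping external hosts, and external hosts cannot ping the system.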
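
Because the fault is intermittent, the steps above have to be repeated many times to measure how often it occurs. The sketch below shows one way to automate the loop. It is not the reporter's procedure: it assumes `rtcwake -m mem -s <seconds>` (util-linux) is used instead of `systemctl suspend` plus a manual wake-up, that `ip` from iproute2 is used instead of ifconfig, that the script runs as root, and that the settle time and interface name are placeholder values.

```python
import re
import subprocess
import time
from typing import List


def ipv4_addresses(ip_output: str) -> List[str]:
    """Extract IPv4 addresses from `ip -4 -o addr show` output."""
    return re.findall(r"\binet (\d+\.\d+\.\d+\.\d+)/\d+", ip_output)


def suspend_cycle(iface: str, sleep_s: int = 20, settle_s: int = 30) -> bool:
    """Suspend to RAM, wake via RTC alarm, and report whether iface has an IPv4 address."""
    # Assumption: rtcwake replaces `systemctl suspend` + manual wake-up.
    subprocess.run(["rtcwake", "-m", "mem", "-s", str(sleep_s)], check=True)
    # Assumption: 30 s is enough for the DHCP client to obtain a lease.
    time.sleep(settle_s)
    result = subprocess.run(
        ["ip", "-4", "-o", "addr", "show", "dev", iface],
        capture_output=True, text=True, check=True,
    )
    return bool(ipv4_addresses(result.stdout))


def failure_rate(iface: str, cycles: int) -> float:
    """Run several suspend/resume cycles and return the fraction that ended without an address."""
    failures = sum(not suspend_cycle(iface) for _ in range(cycles))
    return failures / cycles
```

For example, `failure_rate("eth0", 20)` would run 20 cycles on a hypothetical interface `eth0`. Once a cycle fails, the network stays down until the NIC is brought down and up again, so a real harness would need to restore the link between cycles. For the static-IP case a ping to the gateway would be a more meaningful check than the presence of an address.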

[Expected result]
The network works normally after resume from suspend.

[Actual result]
The network intermittently fails.

[Attachments]

  1. NIC PCI information
    # lspci -v
    01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
        Subsystem: Dell Device 07a3
        Flags: bus master, fast devsel, latency 0, IRQ 16
        I/O ports at e000 [size=256]
        Memory at f7104000 (64-bit, non-prefetchable) [size=4K]
        Memory at f7100000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 01
        Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel
        Capabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00
        Capabilities: [170] Latency Tolerance Reporting
        Capabilities: [178] L1 PM Substates
        Kernel driver in use: r8169
        Kernel modules: r8169

    02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Device 8161 (rev 15)
        Subsystem: Realtek Semiconductor Co., Ltd. Device 8168
        Flags: bus master, fast devsel, latency 0, IRQ 17
        I/O ports at d000 [size=256]
        Memory at f7004000 (64-bit, non-prefetchable) [size=4K]
        Memory at f7000000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 01
        Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
        Capabilities: [d0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel
        Capabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00
        Capabilities: [170] Latency Tolerance Reporting
        Capabilities: [178] L1 PM Substates
        Kernel driver in use: r8169
        Kernel modules: r8169

  2. NIC ID information
    # lspci -n
    01:00.0 0200: 10ec:8168 (rev 15)
    02:00.0 0200: 10ec:8161 (rev 15)

[Findings so far]
With the changes from commits 933b5fc092f67cd4f7bdde31da20b2f211a071b8 and 2f554490fcd189fd7a49d5dac95742a8e89b43d0 removed, the test passes.

The test steps were as follows:

  1. Download the latest released kernel source, 4.19.90-2202.4.0.tar.gz.
    https://toscode.gitee.com/openeuler/kernel/repository/blazearchive/4.19.90-2202.4.0.tar.gz?Expires=1645754469&Signature=vA3Dj9JXBEmQ88b%2B4AHCAuR30eFV%2FN2gwAkx6EutjKA%3D

  2. Build the kernel with the configuration that matches the installed system version.
    # cp /boot/config-4.19.90-2202.3.0.0138.oe1.x86_64 .config
    # make -j8
    # make -j8 modules_install
    # cp arch/x86/boot/bzImage /boot/vmlinuz-openEuler-4.19.90+
    # dracut -f /boot/initramfs-openEuler-4.19.90+.img --kver 4.19.90+

  3. Boot the system with vmlinuz-openEuler-4.19.90+ and initramfs-openEuler-4.19.90+.img and test.
    Test result: the fault reproduces reliably.

  4. Remove the patches:

    1) 933b5fc092f67cd4f7bdde31da20b2f211a071b8
        diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
        index e561ef1..6019e0a 100644
        --- a/drivers/net/phy/phy.c
        +++ b/drivers/net/phy/phy.c
        @@ -861,6 +861,8 @@ void phy_stop(struct phy_device *phydev)
         out_unlock:
             mutex_unlock(&phydev->lock);

        +    phy_state_machine(&phydev->state_queue.work);
        +
             /* Cannot call flush_scheduled_work() here as desired because
              * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()
              * will not reenable interrupts.

    2) 2f554490fcd189fd7a49d5dac95742a8e89b43d0
        diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
        index 6019e0a..51e40a9 100644
        --- a/drivers/net/phy/phy.c
        +++ b/drivers/net/phy/phy.c
        @@ -862,6 +862,7 @@ void phy_stop(struct phy_device *phydev)
             mutex_unlock(&phydev->lock);

             phy_state_machine(&phydev->state_queue.work);
        +    phy_stop_machine(phydev);

             /* Cannot call flush_scheduled_work() here as desired because
              * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()

        diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
        index a64a624..1117355 100644
        --- a/drivers/net/phy/phy_device.c
        +++ b/drivers/net/phy/phy_device.c
        @@ -825,8 +825,6 @@ void phy_disconnect(struct phy_device *phydev)
             if (phydev->irq > 0)
                 phy_stop_interrupts(phydev);

        -    phy_stop_machine(phydev);
        -
             phydev->adjust_link = NULL;

             phy_detach(phydev);

    3) Diff after removing both patches
        diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
        index 51e40a91d..e561ef1e8 100644
        --- a/drivers/net/phy/phy.c
        +++ b/drivers/net/phy/phy.c
        @@ -861,9 +861,6 @@ void phy_stop(struct phy_device *phydev)
         out_unlock:
             mutex_unlock(&phydev->lock);

        -    phy_state_machine(&phydev->state_queue.work);
        -    phy_stop_machine(phydev);
        -
             /* Cannot call flush_scheduled_work() here as desired because
              * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()
              * will not reenable interrupts.
        diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
        index 111735531..a64a62409 100644
        --- a/drivers/net/phy/phy_device.c
        +++ b/drivers/net/phy/phy_device.c
        @@ -825,6 +825,8 @@ void phy_disconnect(struct phy_device *phydev)
             if (phydev->irq > 0)
                 phy_stop_interrupts(phydev);

        +    phy_stop_machine(phydev);
        +
             phydev->adjust_link = NULL;

             phy_detach(phydev);

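  5. After rebuilding, the network works normally in testing.

  6. Additional information
    The network also works normally with the kernel from the previous system version and with kernel.org v4.19.230; neither of them contains the two commits above.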
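
Since the fault is reported to occur in only 20% to 50% of suspend/resume cycles, a "works normally" result is only meaningful after enough cycles. The sketch below gives a rough estimate of how many consecutive clean cycles are needed before a run of passes is unlikely to be luck. It assumes every cycle fails independently with the same probability, which the report does not establish, and the function name and the 5% threshold are illustrative choices.

```python
import math


def runs_needed(p_fail, alpha=0.05):
    """Smallest n such that (1 - p_fail)**n <= alpha.

    In other words: how many consecutive passing cycles are needed so that,
    if the bug were still present with failure probability p_fail per cycle,
    the chance of seeing no failure at all is at most alpha.
    """
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return math.ceil(math.log(alpha) / math.log(1.0 - p_fail))


print(runs_needed(0.2))  # integrated NIC, ~20% failure rate -> 14
print(runs_needed(0.5))  # discrete NIC, ~50% failure rate -> 5
```

Under these assumptions, roughly 14 clean cycles on the integrated NIC or 5 on the discrete NIC would be needed before concluding that a kernel is unaffected.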

Comments (2)

刘达林 created the bug

Hi liu_darling, welcome to the openEuler Community.
I'm the Bot here serving you. You can find the instructions on how to interact with me at Here.
If you have any questions, please contact the SIG: Kernel, and any of the maintainers: @YangYingliang , @pi3orama , @成坚 (CHENG Jian) , @Qiuuuuu , @zhengzengkai , @gogooo , @Xie XiuQi

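openeuler-ci-bot added the sig/Kernel label
刘达林 edited the description
刘达林 edited the description
刘达林 edited the title
刘达林 edited the title
  1. You can use ethtool to check the state of the network interface when the fault occurs. Both patches only touch the phy module, so if they have any effect it should be limited to PHY negotiation.
  2. Can you confirm which of the two patches specifically introduces the fault once it is merged?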
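
To make the ethtool check suggested above easy to compare between a good and a bad resume, the relevant fields can be pulled out of the `ethtool <iface>` output. This is a minimal sketch that assumes the usual `Key: value` layout of that output; the function names are illustrative, and the sample text in the usage note is made up, not captured from the reporter's machine.

```python
import re


def parse_ethtool(text):
    """Parse indented `Key: value` lines from `ethtool <iface>` output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Setting lines are indented; the "Settings for ...:" header is not.
        match = re.match(r"^\s+([A-Za-z][^:]*):\s*(.*)$", line)
        if match and match.group(2):
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def phy_summary(fields):
    """Return the fields most relevant to PHY negotiation, or None where missing."""
    keys = ("Speed", "Duplex", "Auto-negotiation", "Link detected")
    return {key: fields.get(key) for key in keys}
```

For example, feeding it an illustrative output such as `"Settings for eth0:\n\tSpeed: Unknown!\n\tAuto-negotiation: on\n\tLink detected: no\n"` yields a summary with `Link detected` set to `no`. Capturing the summary once before suspend and once after a failed resume would show whether the link itself is lost or only traffic is affected.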
sanglipeng set the assignee to zhangchangzhong
sanglipeng changed the task status from To Do to Fixing
zhangchangzhong changed the milestone from 20.03-SP3-Kernel-Defect to unset
zhangchangzhong set the associated branch to openEuler-1.0-LTS
