[OLK-5.10] bcache: kernel oops in cache_set_flush

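【Defect description】: Please add a detailed description of the defect symptoms

I. Defect information

【OS version affected】 (e.g. openEuler-22.03-LTS; see the output of "cat /etc/os-release")
openEuler-22.03-LTS-SP3
openEuler-22.03-LTS-SP4
【Kernel version】 (e.g. kernel-5.10.0-60.138.0.165; see the output of "uname -r")
5.10.0-231.0.0.133
5.10.0-202.0.0.115
【Affected software and version】 (e.g. kernel-5.10.0-60.138.0.165; see the output of "rpm -q <package name>")
kernel-5.10.0-231.0.0.133
kernel-5.10.0-202.0.0.115
【Environment information】
Hardware information

Hardware-related information such as architecture, CPU, and memory specifications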
aarch64

[root@storage-aqkp-002 127.0.0.1-2024-11-10-11:47:37]# dmidecode -t system
# dmidecode 3.4
Getting SMBIOS data from sysfs.
SMBIOS 3.3.0 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: Enginetech
        Product Name: EG920A-G20
        Version: To be filled by O.E.M.
        Serial Number: EG2024100810003
        UUID: 49e72c59-d446-8ebe-eb11-3ce3b2caa1c9
        Wake-up Type: Power Switch
        SKU Number: To be filled by O.E.M.
        Family: To be filled by O.E.M.

Handle 0x0005, DMI type 32, 11 bytes
System Boot Information
        Status: No errors detected

[root@storage-aqkp-002 127.0.0.1-2024-11-10-11:47:37]# free -g
               total        used        free      shared  buff/cache   available
Mem:             691           5         687           0           1         685
Swap:              0           0           0

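For virtual machine scenarios, additionally give the host OS version and type
Not applicable
Software information
Versions of other software related to the affected software (e.g. if a package build failure is caused by gcc, give the gcc version)
Not applicable
Network information
Not applicable
If a special network setup is involved, provide the network topology and the direction of data flow
Not applicable
【Steps to reproduce】: Describe the specific steps
1. Create the bcache device.
make-bcache -C /dev/nvme2n1p1 -B /dev/sda --writeback --force --wipe-bcache
/dev/sda is a 12T SATA disk.
/dev/nvme2n1p1 is the first partition of the NVMe disk. The partition size is 1024G.
parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1024GiB
2. Run the fio test on bcache0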

cat /home/script/run-fio-randrw.sh
bcache_name=$1
if [ -z "${bcache_name}" ]; then
    echo "bcache_name is empty"
    exit 1
fi

fio --filename="/dev/${bcache_name}" --ioengine=libaio --rw=randrw --bs=4k --size=100% --iodepth=128 --numjobs=4 --direct=1 --name=randrw --group_reporting --runtime=30 --ramp_time=5 --lockmem=1G | tee -a ./randrw-iops_k1.log

bash run-fio-randrw.sh bcache0

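3. Power off
poweroff
The bcache data was not wiped.
4. Replace the 12T SATA disk with a 16T SATA disk
5. Resize the nvme2n1 partition to 1536G
parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1536GiB
The kernel panics as soon as partitioning completes.
6. Reboot the system. It cannot boot normally and stays in a reboot loop.
7. Boot into rescue mode from the CD and wipe the superblock on nvme2n1p1. After another reboot, the system boots normally.
8. Repartition; the kernel panic is triggered again.
【Actual result】 Describe the resulting problem and its impact
Kernel panic; the system cannot boot normally.
【Expected result】 Describe the expected result and its impact
The system boots normally.
【Other relevant attachments】
For example system message logs/component logs, dump information, screenshots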

[    0.000000] Booting Linux on physical CPU 0x0000080000 [0x481fd010]
[    0.000000] Linux version 5.10.0-202.0.0.115.ile2312sp1.aarch64 (abuild@worker55) (gcc_old (GCC) 10.3.1, GNU ld (GNU Binutils) 2.37) #1 SMP Mon Jun 17 01:51:52 UTC 2024
[    0.000000] efi: EFI v2.70 by EDK II
[    0.000000] efi: ACPI 2.0=0x2f680018 SMBIOS 3.0=0x2f510000 MEMATTR=0x354eb018 MOKvar=0x2e0e0000 RNG=0x2f68f698 MEMRESERVE=0x2f651498
[    0.000000] ACPI: Early table checksum verification disabled
......
[  249.369289] bcache: register_bcache() error : device already registered
[  249.369415] bcache: register_bcache() error : device already registered
[  249.370308] bcache: register_bcache() error : device already registered
[  249.370517] bcache: register_bcache() error : device already registered
[  249.371315] bcache: register_bcache() error : device already registered
[  249.516326] VFS: Open an exclusive opened block device for write sdh. current [8673 sgdisk]. parent [8502 rook]
[  359.459929]  nvme2n1:
[  359.473124]  nvme2n1: p1
[  359.618056] bcache: prio_read() bad csum reading priorities
[  359.624878] bcache: bch_cache_set_error() error on f774c122-6c02-469b-b798-ca53c10efa76: IO error reading priorities, disabling caching
[  359.638311] bcache: register_cache() error nvme2n1p1: failed to run cache set
[  359.646709] bcache: register_bcache() error : failed to register device
[  359.658968] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000200
[  359.669077] Mem abort info:
[  359.672871]   ESR = 0x96000044
[  359.676929]   EC = 0x25: DABT (current EL), IL = 32 bits
[  359.683221]   SET = 0, FnV = 0
[  359.687253]   EA = 0, S1PTW = 0
[  359.691368] Data abort info:
[  359.695212]   ISV = 0, ISS = 0x00000044
[  359.700003]   CM = 0, WnR = 1
[  359.703909] user pgtable: 4k pages, 48-bit VAs, pgdp=00002040022e2000
[  359.711284] [0000000000000200] pgd=0000000000000000, p4d=0000000000000000
[  359.719262] Internal error: Oops: 0000000096000044 [#1] SMP
[  359.725760] Modules linked in: xt_set ipt_rpfilter xt_multiport iptable_raw ip_set_hash_ip ip_set_hash_net ip_set ipip tunnel4 ip_tunnel veth xt_statistic xt_nat xt_addrtype ip6table_nat ip6_tables iptable_mangle xt_physdev xt_conntrack xt_comment xt_mark iptable_filter nf_conntrack_netlink nfnetlink sch_ingress iptable_nat xt_MASQUERADE ip_tables rbd ceph libceph dns_resolver overlay openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c 8021q garp mrp bonding vfat fat dm_multipath rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi hns_roce_hw_v2 ib_uverbs ib_core bcache dm_mod crc64 ipmi_ssif ses enclosure aes_ce_blk aes_ce_cipher realtek acpi_ipmi hisi_sas_v3_hw hibmc_drm ghash_ce hclge sha1_ce hisi_sas_main nvme drm_vram_helper hns3 ipmi_si drm_ttm_helper nvme_core libsas hnae3 ipmi_devintf ttm host_edma_drv sg scsi_transport_sas i2c_designware_platform
[  359.725845]  nfit
[  359.730936] bcache: register_bcache() error : device already registered
[  359.815384]  ipmi_msghandler i2c_designware_core hisi_uncore_ddrc_pmu hisi_uncore_hha_pmu hisi_uncore_l3c_pmu libnvdimm hisi_uncore_pmu sch_fq_codel br_netfilter bridge stp llc fuse ext4 mbcache jbd2 sd_mod t10_pi ahci libahci sha2_ce sha256_arm64 sbsa_gwdt libata megaraid_sas(OE) aes_neon_bs aes_neon_blk crypto_simd cryptd
[  359.833119] bcache: register_bcache() error : device already registered
[  359.856792] CPU: 57 PID: 7773 Comm: kworker/57:2 Kdump: loaded Tainted: G           OE     5.10.0-202.0.0.115.ile2312sp1.aarch64 #1
[  359.856793] Hardware name: Enginetech EG920A-G20/BC82AMDDRA, BIOS 6.67 11/15/2023
[  359.856819] Workqueue: events cache_set_flush [bcache]
[  359.894922] pstate: 00400009 (nzcv daif +PAN -UAO -TCO BTYPE=--)
[  359.901919] pc : cache_set_flush+0x94/0x190 [bcache]
[  359.907876] lr : cache_set_flush+0x88/0x190 [bcache]
[  359.913815] sp : ffff800046373d50
[  359.918104] x29: ffff800046373d50 x28: 0000000000000000 
[  359.924380] x27: ffff800012213c48 x26: ffffbe503baba218 
[  359.930651] x25: ffff49cc48ca0808 x24: ffff49cc06674000 
[  359.936916] x23: ffff49cc48ca0808 x22: ffff49cc48ca0000 
[  359.943172] x21: ffff49cc48ca04a8 x20: 0000000000000000 
[  359.949419] x19: 0000000000000200 x18: 0000000000000000 
[  359.955662] x17: 0000000000000000 x16: ffffbe503a531760 
[  359.961896] x15: 0000000000000004 x14: ffff49cc00004990 
[  359.968123] x13: 0000000000000000 x12: ffff49cc3dd02a40 
[  359.974342] x11: ffff49cc3dd02910 x10: ffff2a0c0040b6c2 
[  359.980556] x9 : ffffbe503a591d88 x8 : ffff49cc3dd02938 
[  359.986770] x7 : ffff49cc07f03a18 x6 : 0000000000000000 
[  359.992977] x5 : ffff29cc59c16218 x4 : ffff49cc48ca0808 
[  359.999182] x3 : 0000000000000000 x2 : ffff49cc48ca0808 
[  360.004565] bcache: bch_journal_replay() journal replay done, 11 keys in 6 entries, seq 1096092
[  360.005380] x1 : ffff49cc48ca0808 x0 : 0000000000000001 
[  360.016207] bcache: register_cache() registered cache device nvme2n1p3
[  360.022922] Call trace:
[  360.022934]  cache_set_flush+0x94/0x190 [bcache]
[  360.022946]  process_one_work+0x1d8/0x4e0
[  360.045082] bcache: register_bcache() error : device already registered
[  360.045966]  worker_thread+0x154/0x420
[  360.045970]  kthread+0x108/0x150
[  360.046495] bcache: register_bcache() error : device already registered
[  360.066044] bcache: register_bcache() error : device already registered
[  360.066162] bcache: register_bcache() error : device already registered
[  360.070249]  ret_from_fork+0x10/0x18
[  360.070254] Code: 940043e2 72001c1f 54000700 f90006f3 (f9010297) 
[  360.090288] bcache: register_bcache() error : device already registered
[  360.091355] bcache: register_bcache() error : device already registered
[  360.097327] SMP: stopping secondary CPUs
[  360.119238] Starting crashdump kernel...
[  360.136755] Bye!
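
The "Mem abort info" lines in the log above can be cross-checked by decoding the ESR value by hand. The following is a minimal sketch using shell arithmetic; `decode_esr` is a hypothetical helper name used only for illustration, and the bit positions follow the arm64 ESR_EL1 layout for data aborts.

```shell
#!/usr/bin/env bash
# Decode a few fields of an arm64 ESR_EL1 value with shell arithmetic.
# decode_esr is a hypothetical helper, not a kernel or crash utility.
decode_esr() {
    local esr=$(( $1 ))
    local ec=$(( (esr >> 26) & 0x3f ))    # exception class, bits [31:26]
    local il=$(( (esr >> 25) & 0x1 ))     # instruction length bit, bit [25]
    local iss=$(( esr & 0x1ffffff ))      # instruction-specific syndrome, bits [24:0]
    local wnr=$(( (iss >> 6) & 0x1 ))     # 1 means the abort was caused by a write
    local dfsc=$(( iss & 0x3f ))          # data fault status code, bits [5:0]
    printf 'EC=0x%02x IL=%d ISS=0x%08x WnR=%d DFSC=0x%02x\n' \
        "$ec" "$il" "$iss" "$wnr" "$dfsc"
}

# ESR value taken from the oops above.
decode_esr 0x96000044
```

For the logged value this gives EC=0x25 (data abort at the current exception level), WnR=1 (a write), and DFSC=0x04 (level 0 translation fault), which agrees with the `str` instruction and the `do_translation_fault` frame shown in the crash analysis.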

【Defect details and reference links for analysis】

[root@storage-aqkp-002 127.0.0.1-2024-11-10-11:47:37]# crash /usr/lib/debug/lib/modules/5.10.0-202.0.0.115.ile2312sp1.aarch64/vmlinux  /var/crash/127.0.0.1-2024-11-10-11\:47\:37/vmcore

crash 8.0.2-1.ile2312sp1
Copyright (C) 2002-2022  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2022  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
Copyright (C) 2015, 2021  VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...

WARNING: kernel version inconsistency between vmlinux and dumpfile

      KERNEL: /usr/lib/debug/lib/modules/5.10.0-202.0.0.115.ile2312sp1.aarch64/vmlinux  [TAINTED]
    DUMPFILE: /var/crash/127.0.0.1-2024-11-10-11:47:37/vmcore  [PARTIAL DUMP]
        CPUS: 96
        DATE: Sun Nov 10 11:46:56 CST 2024
      UPTIME: 00:06:00
LOAD AVERAGE: 0.15, 0.28, 0.17
       TASKS: 1763
    NODENAME: storage-aqkp-002
     RELEASE: 5.10.0-202.0.0.115.ile2312sp1.aarch64
     VERSION: #1 SMP Mon Jun 17 01:51:52 UTC 2024
     MACHINE: aarch64  (unknown Mhz)
      MEMORY: 704 GB
       PANIC: "Unable to handle kernel NULL pointer dereference at virtual address 0000000000000200"
         PID: 7773
     COMMAND: "kworker/57:2"
        TASK: ffff49cc44d69340  [THREAD_INFO: ffff49cc44d69340]
         CPU: 57
       STATE: TASK_RUNNING (PANIC)

crash> mod -s bcache /usr/lib/debug/lib/modules/5.10.0-202.0.0.115.ile2312sp1.aarch64/kernel/drivers/md/bcache/bcache.ko-5.10.0-202.0.0.115.ile2312sp1.aarch64.debug
     MODULE       NAME                           BASE          SIZE  OBJECT FILE
ffffbe501221b040  bcache                   ffffbe50121e2000  319488  /usr/lib/debug/lib/modules/5.10.0-202.0.0.115.ile2312sp1.aarch64/kernel/drivers/md/bcache/bcache.ko-5.10.0-202.0.0.115.ile2312sp1.aarch64.debug 
crash> 
crash> bt
PID: 7773     TASK: ffff49cc44d69340  CPU: 57   COMMAND: "kworker/57:2"
 #0 [ffff800046373800] machine_kexec at ffffbe5039eb54a8
 #1 [ffff8000463739b0] __crash_kexec at ffffbe503a052824
 #2 [ffff8000463739e0] crash_kexec at ffffbe503a0529cc
 #3 [ffff800046373a60] die at ffffbe5039e9445c
 #4 [ffff800046373ac0] die_kernel_fault at ffffbe5039ec698c
 #5 [ffff800046373af0] __do_kernel_fault at ffffbe5039ec6a38
 #6 [ffff800046373b20] do_page_fault at ffffbe503ac76ba4
 #7 [ffff800046373b70] do_translation_fault at ffffbe503ac76ebc
 #8 [ffff800046373b90] do_mem_abort at ffffbe5039ec68ac
 #9 [ffff800046373bc0] el1_abort at ffffbe503ac669bc
#10 [ffff800046373bf0] el1_sync_handler at ffffbe503ac671d4
#11 [ffff800046373d30] el1_sync at ffffbe5039e82230
#12 [ffff800046373d50] cache_set_flush at ffffbe50121fa4c4 [bcache]
#13 [ffff800046373da0] process_one_work at ffffbe5039f5af68
#14 [ffff800046373e00] worker_thread at ffffbe5039f5b3c4
#15 [ffff800046373e50] kthread at ffffbe5039f634b8
crash> dis cache_set_flush+0x94
0xffffbe50121fa4c8 <cache_set_flush+148>:       str     x23, [x20, #512]
crash> dis -s cache_set_flush+0x94
FILE: ./include/linux/list.h
LINE: 71

  66    {
  67            if (!__list_add_valid(new, prev, next))
  68                    return;
  69    
  70            next->prev = new;
* 71            new->next = next;
  72            new->prev = prev;
  73            WRITE_ONCE(prev->next, new);
  74    }

crash>
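
The faulting instruction and the register dump can be tied together with a small calculation. This is only a sketch: `fault_addr` is a hypothetical helper, and reading x20 as `c->root` is an inference from the disassembly and from the workaround reported in this issue, not something checked against the bcache source.

```shell
#!/usr/bin/env bash
# Effective address of "str x23, [x20, #512]": base register plus immediate.
# fault_addr is a hypothetical helper used only for this illustration.
fault_addr() {
    local base=$(( $1 )) offset=$(( $2 ))
    printf '0x%016x\n' $(( base + offset ))
}

# x20 is 0000000000000000 in the register dump; the immediate is 512.
fault_addr 0x0000000000000000 512
```

The result, 0x0000000000000200, equals the address in the "Unable to handle kernel NULL pointer dereference" message and the value of x19, so the list node being inserted appears to sit at offset 0x200 from a NULL base pointer.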

Hi cheliequan, welcome to the openEuler Community.
I'm the Bot here serving you. You can find the instructions on how to interact with me at Here.
If you have any questions, please contact the SIG: Kernel, and any of the maintainers.

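Please provide detailed steps to reproduce the problem, including all command-line operations.

Hello colyli,
The same procedure was carried out on three servers, replacing the 12T disk with a 16T disk, and only one server hit the bug. The steps performed on site were:
1. Create the bcache device.
make-bcache -C /dev/nvme2n1p1 -B /dev/sda --writeback --force --wipe-bcache
/dev/sda is a 12T SATA disk.
/dev/nvme2n1p1 is the first partition of the NVMe disk. The partition size is 1024G.
The partition command was parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1024GiB
2. Run the fio test on bcache0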

cat /home/script/run-fio-randrw.sh
bcache_name=$1
if [ -z "${bcache_name}" ]; then
    echo "bcache_name is empty"
    exit 1
fi

fio --filename="/dev/${bcache_name}" --ioengine=libaio --rw=randrw --bs=4k --size=100% --iodepth=128 --numjobs=4 --direct=1 --name=randrw --group_reporting --runtime=30 --ramp_time=5 --lockmem=1G | tee -a ./randrw-iops_k1.log

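Run bash run-fio-randrw.sh bcache0 several times
3. Power off
poweroff
The bcache data was not wiped.
4. Replace the 12T SATA disk with a 16T SATA disk
After powering off, the 12T disk was pulled and replaced with the 16T disk.
5. Resize the nvme2n1 partition to 1536G
parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1536GiB
The kernel panics as soon as partitioning completes.
6. Reboot the system. It cannot boot normally and stays in a reboot loop.
7. Boot into rescue mode from the CD and wipe the superblock on nvme2n1p1. After another reboot, the system boots normally.
wipefs -af /dev/nvme2n1p1
8. Repartition; the kernel panic is triggered again.
parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1536GiB
The same procedure on the other two servers did not trigger the panic.
On the affected server, after adding a NULL check on the root member of struct cache_set, the system boots normally.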

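1. After the disk was replaced with the 16T one, what happened to the previous data?
2. When the backing device is replaced, the cache device has to be rebuilt. The operation performed here is not supported.

1. The previous data was not wiped and was not handled in any way. Because the backing device (the 12T SATA disk) was swapped for the 16T SATA disk after power-off, the bcache device had already stopped automatically, so wiping the cache device's metadata was overlooked. The procedure here was to partition first, then wipe the superblock, and then create the bcache device once partitioning was done.
That is:
parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1536GiB
wipefs -af /dev/nvme2n1p1
make-bcache -C /dev/nvme2n1p1 -B /dev/sda --writeback --force --wipe-bcache
2. Yes. The original plan was to repartition and then rebuild bcache. The system panicked during partitioning and stayed in a reboot loop. This may have been an operator error, but we expect the worst outcome of such a mistake to be a failure to create the bcache device, not an unusable system.
I understand your point. In this situation the superblock should be wiped first, then the disk repartitioned, and then bcache recreated. Our understanding was that the --wipe-bcache --force options forcibly wipe any leftover bcache information on the device, which is why we performed this non-standard procedure.
3. To summarize, is the following the correct procedure:
(1) Stop bcache
(2) Wipe the NVMe superblock
(3) Repartition
(4) Create the bcache device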
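
The order proposed in the summary above could be scripted roughly as follows. This is a sketch under stated assumptions, not a confirmed procedure: the commands and device names are the ones quoted earlier in this issue, `run` and `rebuild_bcache` are hypothetical helper names, step (1) is assumed to have been done already, and any handling of an existing partition entry is not shown because the issue does not describe it. `DRY_RUN` defaults to 1 so that the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch of steps (2) to (4) from the summary: wipe the old superblock,
# repartition, then create the bcache device again.
# Assumption: bcache has already been stopped (step 1) before this runs.
# DRY_RUN defaults to 1, so the destructive commands are only printed.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper; arguments: NVMe disk, cache partition,
# backing device, partition end position.
rebuild_bcache() {
    local disk=$1 part=$2 backing=$3 end=$4
    run wipefs -af "$part"                                             # step (2)
    run parted -s --align optimal "$disk" mkpart primary 2048s "$end"  # step (3)
    run make-bcache -C "$part" -B "$backing" --writeback --force --wipe-bcache  # step (4)
}

rebuild_bcache /dev/nvme2n1 /dev/nvme2n1p1 /dev/sda 1536GiB
```

Printing the commands first makes it easy to review the order before running anything against real disks; setting `DRY_RUN=0` would execute them and destroy data on the named devices.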
