fixed #I97R4T: [20.03-LTS-SP1~SP4] 4.19 kernel: loading and then unloading the vkms module crashes and reboots the system

[  134.881819] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
[  134.881836] PGD 10223b067 P4D 0
[  134.881846] Oops: 0000 [#1] SMP NOPTI
[  134.881856] CPU: 0 PID: 5205 Comm: rmmod Kdump: loaded Tainted: G           OE     4.19.90-2310.4.0.0223.u170.fos22.x86_64 #4
[  134.881880] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[  134.881906] RIP: 0010:kernfs_find_ns+0x11/0xb0
[  134.881917] Code: 0f 85 b0 fe ff ff e9 41 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 41 55 41 54 48 85 d2 55 53 0f 95 c1 <0f> b7 47 70 49 89 d4 49 89 f5 66 83 e0 20 0f 95 c2 38 d1 75 4f 48
[  134.881955] RSP: 0018:ff77220000cebd58 EFLAGS: 00010246
[  134.881967] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  134.881982] RDX: 0000000000000000 RSI: ffffffff836c5fc8 RDI: 0000000000000000
[  134.881998] RBP: ffffffff836c5fc8 R08: 00000000000006a1 R09: 000000000000014f
[  134.882013] R10: fffbc08004067e00 R11: ff374e010533c4b0 R12: 0000000000000000
[  134.882029] R13: ffffffff83baf0e0 R14: ff374e010578b820 R15: 0000000000000000
[  134.882044] FS:  00007fcfbbeb9740(0000) GS:ff374e017fc00000(0000) knlGS:0000000000000000
[  134.882062] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  134.882075] CR2: 0000000000000070 CR3: 0000000109964003 CR4: 0000000000761ef0
[  134.882094] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  134.882115] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  134.882134] PKRU: 55555554
[  134.882142] Call Trace:
[  134.882154]  kernfs_find_and_get_ns+0x2c/0x50
[  134.882168]  sysfs_unmerge_group+0x18/0x60
[  134.882181]  dpm_sysfs_remove+0x1d/0x60
[  134.882194]  device_del+0x9e/0x3c0
[  134.882206]  platform_device_del.part.12+0x1e/0x80
[  134.882220]  platform_device_unregister+0x13/0x20
[  134.882246]  vkms_release+0x15/0x30 [vkms]
[  134.882905]  __drm_atomic_helper_disable_all.constprop.29+0x140/0x160 [drm_kms_helper]
[  134.883579]  drm_atomic_helper_shutdown+0x50/0xa0 [drm_kms_helper]
[  134.884247]  vkms_release+0x1d/0x30 [vkms]
[  134.884883]  vkms_exit+0x29/0x857 [vkms]
[  134.885540]  __x64_sys_delete_module+0x13f/0x280
[  134.886179]  do_syscall_64+0x5f/0x240
[  134.886807]  entry_SYSCALL_64_after_hwframe+0x5c/0xc1

Cause of the problem: when the vkms module was removed, vkms_release was called twice. In the first call, dev->kobj->sd was released in platform_device_unregister. In the second call, dev->kobj->sd was accessed again, resulting in a NULL pointer dereference.

Referring to the implementation in the 5.10 kernel: __drm_atomic_helper_disable_all is the key function reached from vkms_release. It calls drm_atomic_state_init, which takes a reference on the device, so its behavior depends on the dev->ref count.
Therefore, we move the drm_atomic_helper_shutdown call out of vkms_release and place it before the drm_dev_put call in the module exit function. dev->ref was initialized to 1 when the module was initialized. When drm_atomic_helper_shutdown is called, dev->ref is incremented to 2, and the drm_atomic_state_put at the end of drm_atomic_helper_shutdown calls drm_dev_put, which decrements dev->ref back to 1. Since the count is not 0, vkms_release is not called. drm_dev_put is then called only once more, in the final stage of the module exit function. At that point dev->ref is 1, so it drops to 0 and vkms_release is called. The entire exit process therefore calls vkms_release exactly once.
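
The reference-count reasoning above can be checked with a small userspace model. This is only a sketch, not kernel code: struct model_dev, run_exit and the helper functions are hypothetical names, and the helpers merely imitate the get/put behavior described in this message. Passing 1 models the old flow (shutdown called from inside the release callback), passing 0 models the fixed flow (shutdown called before the final put).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model of the device state; not the real drm_device. */
struct model_dev {
    int ref;                 /* models dev->ref */
    int platform_registered; /* 1 until the platform device is unregistered */
    int shutdown_in_release; /* 1 = old flow, 0 = fixed flow */
    int release_calls;       /* how many times the release callback ran */
    int double_unregister;   /* set when unregister is attempted twice */
};

static void model_dev_put(struct model_dev *dev);

/* Models drm_atomic_helper_shutdown: take a reference, then drop it. */
static void model_shutdown(struct model_dev *dev)
{
    dev->ref++;         /* reference taken while building the atomic state */
    model_dev_put(dev); /* reference dropped when the state is put */
}

/* Models vkms_release: unregister the platform device, optionally shut down. */
static void model_release(struct model_dev *dev)
{
    dev->release_calls++;
    if (!dev->platform_registered) {
        /* Second unregister: this is where the real kernel oopses. */
        dev->double_unregister = 1;
        return;
    }
    dev->platform_registered = 0;
    if (dev->shutdown_in_release)
        model_shutdown(dev);
}

/* Models drm_dev_put: run the release callback when the count reaches 0. */
static void model_dev_put(struct model_dev *dev)
{
    assert(dev->ref > 0);
    if (--dev->ref == 0)
        model_release(dev);
}

/* Runs the module-exit sequence and returns the final model state. */
struct model_dev run_exit(int shutdown_in_release)
{
    struct model_dev dev = { 1, 1, shutdown_in_release, 0, 0 };

    if (!shutdown_in_release)
        model_shutdown(&dev); /* fixed flow: ref goes 1 -> 2 -> 1 */
    model_dev_put(&dev);      /* final put: ref goes 1 -> 0, release runs */
    return dev;
}
```

In this model, run_exit(1) enters the release callback twice and flags a double unregister, while run_exit(0) enters it exactly once, matching the explanation above.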