代码拉取完成,页面将自动刷新
本文作者丁辉
查看显卡型号
lspci | grep -i nvidia
# 或
lspci | grep -i vga
结果
[root@offends ~]# lspci | grep -i nvidia
00:08.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
根据自己的显卡型号去下载驱动
wget https://cn.download.nvidia.com/tesla/440.95.01/NVIDIA-Linux-x86_64-440.95.01.run
部署
bash NVIDIA-Linux-x86_64-*.run
测试效果
nvidia-smi
结果
[root@offends ~]#
Mon Oct 2 16:22:37 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00 Driver Version: 460.106.00 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 On | 00000000:00:08.0 Off | 0 |
| N/A 30C P8 9W / 70W | 0MiB / 15109MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
有部署文件情况下
bash NVIDIA-Linux-x86_64-*.run --uninstall
没有原部署文件的情况下
/usr/bin/nvidia-uninstall
经过多次卸载安重新装遇到报错
ERROR: An NVIDIA kernel module 'nvidia' appears to already be loaded in your kernel. This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen
if your kernel was configured without support for module unloading. Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver. If no GPU-based programs are running, you
know that your kernel supports module unloading, and you still receive this message, then an error may have occured that has corrupted an NVIDIA kernel module's usage count, for which the simplest remedy is to reboot
your computer.
解决方法直接 reboot
重启服务器解决成功率 99% 哈哈
此处可能存在不合适展示的内容,页面不予展示。您可通过相关编辑功能自查并修改。
如您确认内容无涉及 不当用语 / 纯广告导流 / 暴力 / 低俗色情 / 侵权 / 盗版 / 虚假 / 无价值内容或违法国家有关法律法规的内容,可点击提交进行申诉,我们将尽快为您处理。