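Oddly, nvidia-smi used to work, but after I came back from a holiday it no longer did.
I went through all of the steps below and eventually fixed it.
My Ubuntu version: Ubuntu 22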
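- nvcc -V: present. If it is missing, install it with "sudo apt-get install nvidia-cuda-toolkit"; for other problems, please look at other blogs.
    sudo apt-get install nvidia-cuda-toolkit
    Reading package lists... Done
    Building dependency tree... Done
    Reading state information... Done
    nvidia-cuda-toolkit is already the newest version (12.0.140~12.0.1-4build4).
    0 upgraded, 0 newly installed, 0 to remove and 196 not upgraded.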
- nvidia-smi: missing. zsh reports "command not found"; bash reports that the command cannot be found but can be installed from the packages listed below.
    nvidia-smi
    Command 'nvidia-smi' not found, but can be installed with:
    apt install nvidia-utils-470         # version 470.256.02-0ubuntu0.24.04.1, or
    apt install nvidia-utils-470-server  # version 470.256.02-0ubuntu0.24.04.1
    apt install nvidia-utils-535         # version 535.183.01-0ubuntu0.24.04.1
    apt install nvidia-utils-535-server  # version 535.216.01-0ubuntu0.24.04.1
    apt install nvidia-utils-550         # version 550.120-0ubuntu0.24.04.1
    apt install nvidia-utils-525         # version 525.147.05-0ubuntu1
    apt install nvidia-utils-525-server  # version 525.147.05-0ubuntu1
    apt install nvidia-utils-550-server  # version 550.127.05-0ubuntu0.24.04.1
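The two checks above can also be scripted. This is only a minimal sketch and not part of the original troubleshooting. The helper name check_tool is hypothetical. It reports whether a command is on PATH and does not prove that the driver itself works.

```shell
#!/usr/bin/env bash
# check_tool is a hypothetical helper: it reports whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# nvcc comes from the CUDA toolkit; nvidia-smi comes from an nvidia-utils-<version> package.
check_tool nvcc
check_tool nvidia-smi
```

On the machine described here, this would be expected to report nvcc as found and nvidia-smi as missing.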
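- Tried to install nvidia-driver: unable to locate the package.
- Added the software source:
    sudo add-apt-repository ppa:graphics-drivers/ppa
    sudo apt-get update
- Tried to install nvidia-driver again: still unable to locate the package.
- Installed one more or less at random: sudo apt-get install nvidia-utils-<a high version number picked arbitrarily>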
- Ran nvidia-smi again: at least there was some output now, but it reported a version mismatch.
    nvidia-smi
    Failed to initialize NVML: Driver/library version mismatch
    NVML library version: 550.144
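A "Driver/library version mismatch" generally means that the NVIDIA kernel module currently loaded is a different version from the user-space library that nvidia-smi uses. Installing an arbitrary nvidia-utils version can cause exactly that. The sketch below is one way to put the two versions side by side. It illustrates the idea and is not the procedure actually followed in this post. extract_version is a hypothetical helper, and the script assumes that /proc/driver/nvidia/version is readable only while a module is loaded.

```shell
#!/usr/bin/env bash
# extract_version (hypothetical helper): print the first dotted version
# number found on stdin, e.g. 550.144 or 535.183.01.
extract_version() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Version of the kernel module that is loaded right now (empty if none is loaded).
loaded=""
if [ -r /proc/driver/nvidia/version ]; then
    loaded=$(head -n 1 /proc/driver/nvidia/version | extract_version)
fi

# Version of the user-space NVML library, taken from the nvidia-smi error text.
# The line is only present when nvidia-smi reports a mismatch.
library=""
if command -v nvidia-smi >/dev/null 2>&1; then
    library=$(nvidia-smi 2>&1 | grep 'NVML library version' | extract_version)
fi

echo "loaded kernel module: ${loaded:-none}"
echo "NVML library:         ${library:-unknown}"
```

If the two printed versions differ, the usual remedy is to install a matching driver package and reboot so that the new kernel module is loaded.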
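- Advice online said the driver would upgrade itself automatically after a reboot, so I simply ran
    sudo reboot
and found that it still did not help.
- I asked several large language models. They suggested checking the available driver versions and then choosing the one marked "recommended" (for example nvidia-driver-550):
    ubuntu-drivers devices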
Aha, so nvidia-driver can only be found when the version number is appended. No wonder the install kept failing.
    (base) ➜ pure_experiments_env ubuntu-drivers devices
    udevadm hwdb is deprecated. Use systemd-hwdb instead.
    (the same warning is repeated many more times)
    == /sys/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0 ==
    modalias : pci:v00
    vendor   : NVIDIA Corporation
    model    : AD102 [GeForce RTX 4090]
    manual_install: True
    driver   : nvidia-driver-560-open - third-party non-free
    driver   : nvidia-driver-545-open - third-party non-free
    driver   : nvidia-driver-550 - third-party non-free
    driver   : nvidia-driver-550-open - third-party non-free
    driver   : nvidia-driver-535-server-open - distro non-free
    driver   : nvidia-driver-535-open - third-party non-free
    driver   : nvidia-driver-535-server - distro non-free
    driver   : nvidia-driver-570-open - third-party non-free
    driver   : nvidia-driver-565-open - third-party non-free
    driver   : nvidia-driver-560 - third-party non-free recommended
    driver   : nvidia-driver-535 - third-party non-free
    driver   : nvidia-driver-570 - third-party non-free
    driver   : nvidia-driver-565 - third-party non-free
    driver   : nvidia-driver-545 - third-party non-free
    driver   : xserver-xorg-video-nouveau - distro free builtin
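Every package in this listing carries a version suffix, which is why a bare nvidia-driver could not be located. As a sketch of how the recommended entry could be picked out automatically, the script below filters the listing for the driver line that ends in "recommended". recommended_driver is a hypothetical helper, and the parsing assumes the "driver : <package> - ..." column layout of this listing. The script only prints the install command and does not run it.

```shell
#!/usr/bin/env bash
# recommended_driver (hypothetical helper): read `ubuntu-drivers devices`
# output on stdin and print the package on the line tagged "recommended".
recommended_driver() {
    awk '$1 == "driver" && $NF == "recommended" { print $3; exit }'
}

if command -v ubuntu-drivers >/dev/null 2>&1; then
    pkg=$(ubuntu-drivers devices 2>/dev/null | recommended_driver)
    if [ -n "$pkg" ]; then
        # Print the command instead of running it, so nothing is installed by accident.
        echo "sudo apt-get install $pkg"
    else
        echo "no recommended driver reported"
    fi
fi
```

For the listing in this post, the helper would select nvidia-driver-560, which is the version installed in the next step.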
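- After installing 560, the nvidia-smi command still could not be found. The LLM replies again recommended a reboot, and this time I felt a reboot might really be needed, so I rebooted once more: sudo reboot
- And it really did work! Hooray!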
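All articles on this account are original. Reposting is welcome; please credit the source: https://shandianchengzi.blog.csdn.net/article/details/145640869. Baidu and the various scraper sites are not trustworthy, so please judge search results with care. Technical articles are generally time-sensitive, and I revise and update my posts from time to time, so please visit the source to read the latest version of this article.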