
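Installing the NVIDIA Driver, CUDA and cuDNN for Tesla P4 / P40 GPUs on Debian 12

This article was tested on Debian 12, installing the driver and CUDA for Tesla P4, P40 and similar GPUs.

It is a tutorial on installing the NVIDIA GPU driver in a virtual machine created on PVE 8.2 with the q35 machine type (i440fx cannot be used).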


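Check that the GPU is present

The lspci command can usually be used to identify the series/codename of an installed NVIDIA graphics processing unit (GPU). For example: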

$ lspci | grep NVIDIA
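For scripted setups, the same check can be wrapped in a small helper. This is only a minimal sketch: the function name list_nvidia_devices is hypothetical, and it matches case-insensitively on the assumption that lspci reports the vendor as "NVIDIA Corporation".

```shell
#!/bin/sh
# list_nvidia_devices (hypothetical helper): read `lspci` output on stdin,
# print the lines that mention NVIDIA, and return non-zero if there are none.
list_nvidia_devices() {
    grep -i 'nvidia'
}

# Only query real hardware when lspci is available on this machine.
if command -v lspci >/dev/null 2>&1; then
    if lspci | list_nvidia_devices; then
        echo "NVIDIA device found"
    else
        echo "no NVIDIA device visible to this system" >&2
    fi
fi
```

If nothing is printed inside the virtual machine, the GPU is not visible to the guest and the driver installation below will not help.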

Enable the non-free software sources for apt

vim.tiny /etc/apt/sources.list

Add the "contrib", "non-free" and "non-free-firmware" components to /etc/apt/sources.list, for example:

Debian Bookworm

deb http://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware
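Instead of editing the file by hand, the components can be added with sed. The sketch below is an assumption-laden convenience, not part of the original procedure: the function name enable_nonfree is hypothetical, it relies on GNU sed (as shipped with Debian) for -i and -E, and it rewrites every deb and deb-src line that lists "main", keeping a .bak backup next to the file.

```shell
#!/bin/sh
# enable_nonfree (hypothetical helper)
# $1: path to a sources.list-style file.
# On every "deb" / "deb-src" line, replace the component list that starts
# with "main" by "main contrib non-free non-free-firmware".
# Comment lines are left untouched; running it twice gives the same result.
enable_nonfree() {
    sed -i.bak -E \
        '/^deb(-src)? /s/ main( .*)?$/ main contrib non-free non-free-firmware/' \
        "$1"
}

# Usage (as root):
#   enable_nonfree /etc/apt/sources.list
#   apt update
```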

# For users in China, switch to the Tsinghua TUNA mirror:
vim.tiny /etc/apt/sources.list

deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm main contrib non-free non-free-firmware

deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm main contrib non-free non-free-firmware

deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-updates main contrib non-free non-free-firmware

deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-updates main contrib non-free non-free-firmware

deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-backports main contrib non-free non-free-firmware

deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-backports main contrib non-free non-free-firmware

deb https://mirrors.tuna.tsinghua.edu.cn/debian-security bookworm-security main contrib non-free non-free-firmware

deb-src https://mirrors.tuna.tsinghua.edu.cn/debian-security bookworm-security main contrib non-free non-free-firmware

Update apt

apt update -y

Install the GPU driver

apt install nvidia-driver firmware-misc-nonfree

Reboot the machine

Check whether the GPU driver is installed on Debian 12

nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI                                                                            |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P4                       On  | 00000000:03:00.0 Off |                  Off |
| N/A   38C    P8               6W /  75W |      0MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

The GPU driver is now installed.

Install CUDA
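For a check that is easier to use in scripts than the table, nvidia-smi can print selected fields as CSV. The following is a rough sketch under stated assumptions: print_gpu_report is a hypothetical helper name, and the code assumes the CSV fields are separated by a comma followed by one space.

```shell
#!/bin/sh
# print_gpu_report (hypothetical helper): read CSV rows of the form
# "index, name, driver_version, memory.total" on stdin and print one
# readable line per GPU.
print_gpu_report() {
    while IFS=, read -r idx name driver mem; do
        # Every field after the first starts with a space; strip it.
        echo "GPU ${idx}: ${name# } (driver ${driver# }, ${mem# })"
    done
}

# Only call nvidia-smi when the driver tools are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,driver_version,memory.total \
               --format=csv,noheader | print_gpu_report
fi
```

An empty result, or an error from nvidia-smi itself, suggests the kernel module did not load after the reboot.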

apt install nvidia-cuda-dev nvidia-cuda-toolkit

Check whether NVIDIA CUDA is installed on Debian 12

nvcc --version
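When only the toolkit release is of interest, it can be extracted from the nvcc output. This is a minimal sketch: cuda_release is a hypothetical helper, and it assumes the output contains a line of the form "Cuda compilation tools, release X.Y, V...".

```shell
#!/bin/sh
# cuda_release (hypothetical helper): read `nvcc --version` output on stdin
# and print only the release number that follows the word "release".
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA toolkit release: $(nvcc --version | cuda_release)"
else
    echo "nvcc not found; is nvidia-cuda-toolkit installed?" >&2
fi
```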

Install NVIDIA cuDNN

apt install nvidia-cudnn

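When the window appears, press the key to agree.

The NVIDIA cuDNN library has to be downloaded from the official NVIDIA website, which takes a while.

Disable ECC

Run nvidia-smi | grep Tesla and read the GPU index at the start of the row: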

$ nvidia-smi | grep Tesla
|   0  Tesla P40                      On  | 00000000:03:00.0 Off |                  Off |
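With several GPUs it can be handy to print just the indices. The sketch below is one possible way to do that: tesla_gpu_indices is a hypothetical helper, and it assumes the index is the second whitespace-separated field of each Tesla row, as in the output above.

```shell
#!/bin/sh
# tesla_gpu_indices (hypothetical helper): read the nvidia-smi table on
# stdin and print the index (second field, after the leading "|") of every
# row that mentions a Tesla GPU.
tesla_gpu_indices() {
    awk '/Tesla/ { print $2 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi | tesla_gpu_indices
fi
```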

nvidia-smi -i n -e 0/1 disables (0) or enables (1) ECC, where n is the GPU index.

To disable ECC on GPU 0, run:

sudo nvidia-smi -i 0 -e 0

The setting takes effect after a reboot.
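To see whether an ECC change is still waiting for a reboot, the current and pending modes can be compared. This is a sketch rather than a required step: ecc_needs_reboot is a hypothetical helper, and it assumes the ecc.mode.current and ecc.mode.pending query fields are supported by the installed driver.

```shell
#!/bin/sh
# ecc_needs_reboot (hypothetical helper): read CSV rows of the form
# "index, ecc.mode.current, ecc.mode.pending" on stdin and print the GPUs
# whose pending ECC mode differs from the current one.
ecc_needs_reboot() {
    awk -F', ' '$2 != $3 { printf "GPU %s: ECC %s -> %s after reboot\n", $1, $2, $3 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,ecc.mode.current,ecc.mode.pending \
               --format=csv,noheader | ecc_needs_reboot
fi
```

No output means the current and pending modes already match on every GPU.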