Installing the NVIDIA Driver, CUDA, and cuDNN for Tesla P4 / P40 GPUs on Debian 12

Default category · 2024-08-05

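This guide was tested on Debian 12 and covers installing the driver and CUDA for Tesla P4, P40, and similar GPUs.

The NVIDIA GPU driver is installed inside a virtual machine created on PVE 8.2 with the q35 machine type (i440fx cannot be used).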


Check that the GPU is present

The lspci command can usually identify the series or codename of the installed NVIDIA graphics processing unit (GPU). For example:

$ lspci | grep NVIDIA
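
As a slightly more robust variant of the check above, the sketch below matches on NVIDIA's PCI vendor ID (10de) rather than on the name string. It assumes the input comes from `lspci -nn`, which prints numeric IDs in square brackets; the function name `nvidia_devices` is made up for illustration.

```shell
# List NVIDIA devices from `lspci -nn` output supplied on stdin.
# 10de is NVIDIA's PCI vendor ID, so this does not depend on the exact
# device name text. The function name is hypothetical.
nvidia_devices() {
    grep -i '\[10de:[0-9a-f]\{4\}\]'
}

# Typical use on the target machine (not executed here):
#   lspci -nn | nvidia_devices
```

If nothing is printed, the card is probably not visible to the virtual machine, and the passthrough settings are worth rechecking before installing anything.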

Enable the non-free components for apt

vim.tiny /etc/apt/sources.list

# Add the "contrib", "non-free" and "non-free-firmware" components to /etc/apt/sources.list, for example:

# Debian Bookworm
deb http://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware
# For users in China, switch to the Tsinghua TUNA mirror:
vim.tiny /etc/apt/sources.list

deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm main contrib non-free non-free-firmware
# deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm main contrib non-free non-free-firmware

deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-updates main contrib non-free non-free-firmware
# deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-updates main contrib non-free non-free-firmware

deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-backports main contrib non-free non-free-firmware
# deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-backports main contrib non-free non-free-firmware

deb https://mirrors.tuna.tsinghua.edu.cn/debian-security bookworm-security main contrib non-free non-free-firmware
# deb-src https://mirrors.tuna.tsinghua.edu.cn/debian-security bookworm-security main contrib non-free non-free-firmware
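
After editing the file, it can help to confirm that no active entry was missed. The following is a minimal sketch, assuming the classic one-line `deb ...` format shown above; the function name `missing_components` is hypothetical.

```shell
# Print every active "deb" line that lacks one of the three components
# required here. Reads sources.list-style text on stdin; commented-out
# lines (including "# deb-src ...") are ignored.
missing_components() {
    awk '
        $1 == "deb" {
            line = " " $0 " "
            if (line !~ / contrib / || line !~ / non-free / || line !~ / non-free-firmware /)
                print $0
        }'
}

# Typical use (not executed here):
#   missing_components < /etc/apt/sources.list
```

No output means every active `deb` line already lists contrib, non-free, and non-free-firmware.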

Update the apt package index

apt update -y

Install the GPU driver

apt install nvidia-driver firmware-misc-nonfree

Reboot the machine.

Check that the GPU driver is installed on Debian 12

nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI                                                                            |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P4                       On  | 00000000:03:00.0 Off |                  Off |
| N/A   38C    P8               6W /  75W |      0MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

The GPU driver is now installed.

Install CUDA
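
For scripted checks, the table above is awkward to parse. nvidia-smi can also print selected fields as CSV, and the sketch below turns that into one summary line per GPU. The function name `summarise_gpus` and the choice of fields are assumptions made for illustration.

```shell
# Summarise CSV produced by:
#   nvidia-smi --query-gpu=index,name,driver_version,memory.total --format=csv,noheader
# Input is read from stdin, one GPU per line, fields separated by ", ".
# The function name is hypothetical.
summarise_gpus() {
    awk -F', ' 'NF == 4 { printf "GPU %s: %s (driver %s, %s)\n", $1, $2, $3, $4 }'
}

# Typical use on the target machine (not executed here):
#   nvidia-smi --query-gpu=index,name,driver_version,memory.total \
#              --format=csv,noheader | summarise_gpus
```

An empty result after a reboot usually means the driver did not load, in which case the plain `nvidia-smi` error message is the better starting point.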

apt install nvidia-cuda-dev nvidia-cuda-toolkit

Check that NVIDIA CUDA is installed on Debian 12

nvcc --version
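
When the CUDA version is needed in a script, the release number can be pulled out of the `nvcc --version` text. This is a rough sketch that assumes the usual "release X.Y" wording in that output; the function name `cuda_release` is hypothetical.

```shell
# Extract the CUDA release number (for example "11.8") from
# `nvcc --version` output supplied on stdin. Prints nothing if no
# "release X.Y" text is found. The function name is hypothetical.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Typical use (not executed here):
#   nvcc --version | cuda_release
```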

Install NVIDIA cuDNN

apt install nvidia-cudnn

When the dialog window appears, agree to it to continue.

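The NVIDIA cuDNN library is downloaded from NVIDIA's official website during installation, so this step takes a while.

Disable ECC

Run nvidia-smi | grep Tesla to see the GPU index at the start of the row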

$ nvidia-smi | grep Tesla
|   0  Tesla P40                      On  | 00000000:03:00.0 Off |                  Off |
-----------------------------------------------------------------------------------------
nvidia-smi -i n -e 0/1 disables (0) or enables (1) ECC, where n is the GPU index.

To disable ECC on GPU 0, run sudo nvidia-smi -i 0 -e 0. The setting takes effect after a reboot.
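
Because the change only applies after a reboot, it can be useful to see which GPUs still have a pending ECC change. The sketch below compares the current and pending ECC modes reported by nvidia-smi's CSV query; the function name `ecc_pending_changes` is made up for illustration.

```shell
# Report GPUs whose pending ECC mode differs from the current one.
# Expects CSV on stdin as produced by:
#   nvidia-smi --query-gpu=index,name,ecc.mode.current,ecc.mode.pending --format=csv,noheader
# The function name is hypothetical.
ecc_pending_changes() {
    awk -F', ' 'NF == 4 && $3 != $4 {
        printf "GPU %s (%s): ECC %s -> %s after reboot\n", $1, $2, $3, $4
    }'
}

# Typical use on the target machine (not executed here):
#   nvidia-smi --query-gpu=index,name,ecc.mode.current,ecc.mode.pending \
#              --format=csv,noheader | ecc_pending_changes
```

No output means the current and pending modes already match, so no reboot is outstanding for ECC.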
