Sharing an NVIDIA GPU Between the PVE Host and an LXC Container, e.g. for Jellyfin Hardware Decoding

2022-09-14   


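The CPU I currently use is fairly weak, so Jellyfin often cannot transcode and stream certain video formats smoothly. I therefore wanted the GPU to take over hardware decoding.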

Reference: Jellyfin Documentation, Hardware Acceleration

Hardware used in this post:

  • NVIDIA Quadro P400

The host operating system is PVE 7.2-7.

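First, install the NVIDIA driver on the PVE host. Note that the NVIDIA driver cannot be installed while the nouveau driver is loaded, so unload nouveau first. Sometimes nvidia-smi returns nothing or reports an error after the driver is installed; this may be because some required packages were missing during installation, so they are listed here as well.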

apt install pve-headers-$(uname -r)

apt update && apt install dkms git build-essential cargo jq uuid-runtime -y
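
Before running the installer, it can help to confirm whether nouveau is actually loaded. The following is a minimal sketch, not the author's exact procedure: the helper names `nouveau_loaded` and `write_nouveau_blacklist` are hypothetical, and the file name `/etc/modprobe.d/blacklist-nouveau.conf` is only a common convention assumed here. The commands with side effects are left commented out.

```shell
#!/bin/sh
# Sketch: detect nouveau and blacklist it. Intended to be run as root on the PVE host.

# Succeeds if a module named "nouveau" appears in lsmod-style input on stdin.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Writes a blacklist file (path given as $1) that keeps nouveau from loading at boot.
write_nouveau_blacklist() {
    printf 'blacklist nouveau\noptions nouveau modeset=0\n' > "$1"
}

# Possible use on the host (commented out so the sketch has no side effects;
# the blacklist file name is an assumption):
# if lsmod | nouveau_loaded; then
#     write_nouveau_blacklist /etc/modprobe.d/blacklist-nouveau.conf
#     update-initramfs -u    # rebuild the initramfs so the blacklist applies early
#     reboot
# fi
```

After the reboot, `lsmod | nouveau_loaded` should fail, and the NVIDIA installer can then be started.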

wget https://us.download.nvidia.com/XFree86/Linux-x86_64/515.65.01/NVIDIA-Linux-x86_64-515.65.01.run

chmod +x NVIDIA-Linux-x86_64-515.65.01.run 

./NVIDIA-Linux-x86_64-515.65.01.run

After following the installer's interactive prompts, nvidia-smi should be able to read the GPU status (I don't remember whether a reboot is needed).

root@pve:~# nvidia-smi
Wed Sep 14 12:38:36 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P400         On   | 00000000:03:00.0 Off |                  N/A |
|  0%   45C    P8    N/A /  N/A |      1MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

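Next, create the LXC container. Jellyfin provides a customized ffmpeg build packaged for Debian, so Debian is used as the LXC image here. I have not verified whether a privileged container is required: the GPU can also be shared with an unprivileged container, but I have not tested whether decoding works in one.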

The Jellyfin installation itself is not covered here; only the GPU driver and jellyfin-ffmpeg are.

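First download jellyfin-ffmpeg and install it with dpkg. On a fresh system some dependencies may be missing; in that case, use apt to fix the dependencies and then run the installation again.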

wget https://repo.jellyfin.org/releases/server/debian/versions/jellyfin-ffmpeg/5.1.1-1/jellyfin-ffmpeg5_5.1.1-1-bullseye_amd64.deb

dpkg --install jellyfin-ffmpeg5_5.1.1-1-bullseye_amd64.deb

apt --fix-broken install

dpkg --install jellyfin-ffmpeg5_5.1.1-1-bullseye_amd64.deb

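Installing the GPU driver in the container works much like before, except that the kernel headers are not needed and the installer is run with the --no-kernel-module option.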

apt update && apt install dkms git build-essential cargo jq uuid-runtime -y

wget https://us.download.nvidia.com/XFree86/Linux-x86_64/515.65.01/NVIDIA-Linux-x86_64-515.65.01.run

chmod +x NVIDIA-Linux-x86_64-515.65.01.run 

./NVIDIA-Linux-x86_64-515.65.01.run --no-kernel-module
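
Because the container relies on the host's kernel module, the user-space driver installed with --no-kernel-module generally needs to be the same version as the host driver. A minimal sketch for comparing the two follows; the helper name `driver_version` is hypothetical, and it simply picks the first dotted version number out of whatever text it is given.

```shell
#!/bin/sh
# Sketch: extract a driver version string so host and container can be compared.

# Prints the first dotted version number (e.g. 515.65.01) found in text on stdin.
driver_version() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# On the host, the loaded kernel module reports its version in this file:
#   driver_version < /proc/driver/nvidia/version
# The installer file name carries the user-space version:
#   echo NVIDIA-Linux-x86_64-515.65.01.run | driver_version
```

If the two values differ, nvidia-smi inside the container is likely to report a driver/library version mismatch.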

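After installation, shut down the LXC container, open the host shell, and edit the container's configuration file.
First, note the numbers listed by these two commands: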

root@pve:~# ls -lah /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 Sep 11 00:55 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Sep 11 00:55 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Sep 11 00:55 /dev/nvidia-modeset
crw-rw-rw- 1 root root 511,   0 Sep 11 00:55 /dev/nvidia-uvm
crw-rw-rw- 1 root root 511,   1 Sep 11 00:55 /dev/nvidia-uvm-tools

/dev/nvidia-caps:
total 0
drwxr-xr-x  2 root root     80 Sep 11 00:56 .
drwxr-xr-x 22 root root   5.0K Sep 13 04:04 ..
cr--------  1 root root 240, 1 Sep 11 00:56 nvidia-cap1
cr--r--r--  1 root root 240, 2 Sep 11 00:56 nvidia-cap2
root@pve:~# ls -lah /dev/dri/*
crw-rw---- 1 root video  226,   0 Sep 11 00:55 /dev/dri/card0
crw-rw---- 1 root video  226,   1 Sep 11 00:55 /dev/dri/card1
crw-rw---- 1 root render 226, 128 Sep 11 00:55 /dev/dri/renderD128

/dev/dri/by-path:
total 0
drwxr-xr-x 2 root root 100 Sep 11 00:55 .
drwxr-xr-x 3 root root 120 Sep 11 00:55 ..
lrwxrwxrwx 1 root root   8 Sep 11 00:55 pci-0000:03:00.0-card -> ../card1
lrwxrwxrwx 1 root root  13 Sep 11 00:55 pci-0000:03:00.0-render -> ../renderD128
lrwxrwxrwx 1 root root   8 Sep 11 00:55 pci-0000:0b:00.0-card -> ../card0

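These are numbers such as 195, 511, and 226 (the device major numbers).
If you follow the configuration in the official Jellyfin guide, only 195 and 226 need to be allowed.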
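
Rather than reading the major numbers off the listing by eye, they can be extracted from the `ls -l` output. This is a small sketch; the helper name `device_majors` is hypothetical. It assumes the usual `ls -l` layout for character devices, where the fifth field is the major number followed by a comma.

```shell
#!/bin/sh
# Sketch: list the unique major numbers of character devices.

# Reads `ls -l` output on stdin and prints each distinct major number once.
device_majors() {
    # Character-device lines start with "c"; field 5 looks like "195,".
    awk '/^c/ { sub(",", "", $5); print $5 }' | sort -un
}

# Possible use on the host:
#   ls -l /dev/nvidia* /dev/dri/* | device_majors
```

The values printed are the ones that go into the lxc.cgroup2.devices.allow lines of the container configuration.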

Find your container's ID and edit its configuration file.
The numbers in the first three lines must be changed to match your own.

nano /etc/pve/nodes/pve/lxc/10x.conf

lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 511:* rwm
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=f>
lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file
lxc.mount.entry: /dev/dri/card1 dev/dri/card1 none bind,optional,create=file
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
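
The entries above follow a regular pattern, so they can also be generated instead of typed by hand. The sketch below only prints lines in the same form as the configuration shown above; the helper names `allow_rules` and `mount_entries` are hypothetical, and the output still has to be reviewed and pasted into the container's .conf file manually.

```shell
#!/bin/sh
# Sketch: print LXC config lines for a set of major numbers and device nodes.

# Prints one cgroup2 allow rule per major number given as an argument.
allow_rules() {
    for major in "$@"; do
        printf 'lxc.cgroup2.devices.allow: c %s:* rwm\n' "$major"
    done
}

# Prints one bind-mount entry per device path given as an argument.
mount_entries() {
    for dev in "$@"; do
        # The mount target is relative to the container root, so the leading slash is dropped.
        printf 'lxc.mount.entry: %s %s none bind,optional,create=file\n' "$dev" "${dev#/}"
    done
}

# Possible use on the host:
#   allow_rules 195 511 226
#   mount_entries /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm /dev/dri/renderD128
```

Keeping the `optional` flag means the container can still start if one of the device nodes is missing on the host.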

Then start the container; running nvidia-smi in the LXC shell should now show the GPU status.
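
If nvidia-smi fails inside the container, a first step is to check whether the device nodes were actually bind-mounted. A minimal sketch follows; the helper name `missing_devices` is hypothetical, and the device list in the usage comment is taken from the host listing above.

```shell
#!/bin/sh
# Sketch: report which expected device nodes are absent inside the container.

# Prints every argument that is not an existing character device.
missing_devices() {
    for dev in "$@"; do
        [ -c "$dev" ] || printf '%s\n' "$dev"
    done
}

# Possible use inside the container:
#   missing_devices /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm /dev/dri/renderD128
```

Empty output means all listed nodes are present; any path printed points to a mount entry or allow rule worth rechecking in the container configuration.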

Q.E.D.