How to Run PyTorch on an AMD GPU under WSL


Preparation

Reference: https://github.com/ROCm/ROCm/issues/3563#issuecomment-2423673336

AMD ROCm main driver (Ubuntu 24.04): https://repo.radeon.com/amdgpu-install/6.3.4/ubuntu/noble/

AMD WSL enhanced driver: https://repo.radeon.com/amdgpu/6.3.4/ubuntu/pool/main/h/

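!!! Important !!!

WSL version: 2

Ubuntu version: 24.04 (codename noble)

ROCm driver version: 6.3.4

Miniconda version: 3

Every download and package mentioned in this post matches the versions listed above. Do not mix in other versions, or the setup is much less likely to succeed.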

Requirements

  • An AMD graphics card, 6800XT or better at minimum; mine is a 7900GRE (overclocked edition)

  • Operating system: Windows 10 22H2 at minimum

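Download and install

First, find Ubuntu 24.04 in the Microsoft Store and install it, then switch to the root account.

Run the following command to verify the release: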

root@ZBH:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.04.1 LTS
Release:        24.04
Codename:       noble
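
The same check can be scripted. The sketch below reads /etc/os-release and compares it with the release this guide targets. The function names parse_os_release and matches_guide are made up for illustration, and it assumes the standard VERSION_ID and VERSION_CODENAME fields that Ubuntu writes to that file.

```python
from pathlib import Path

# Versions pinned by this guide.
EXPECTED_VERSION_ID = "24.04"
EXPECTED_CODENAME = "noble"


def parse_os_release(text: str) -> dict:
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be quoted, e.g. VERSION_ID="24.04".
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def matches_guide(fields: dict) -> bool:
    """True when the release and codename are the ones this guide targets."""
    return (
        fields.get("VERSION_ID") == EXPECTED_VERSION_ID
        and fields.get("VERSION_CODENAME") == EXPECTED_CODENAME
    )


if __name__ == "__main__":
    path = Path("/etc/os-release")
    if path.exists():
        info = parse_os_release(path.read_text())
        status = "OK" if matches_guide(info) else "version mismatch"
        print(info.get("PRETTY_NAME"), "->", status)
    else:
        print("/etc/os-release not found; is this a Linux/WSL shell?")
```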

Download and install the two driver packages (note the ./ prefix, which tells apt to install a local file rather than look up a package name):

root@ZBH:~# wget https://repo.radeon.com/amdgpu/6.3.4/ubuntu/pool/main/h/hsa-runtime-rocr4wsl-amdgpu/hsa-runtime-rocr4wsl-amdgpu_24.30-2127960.24.04_amd64.deb
root@ZBH:~# wget https://repo.radeon.com/amdgpu-install/6.3.4/ubuntu/noble/amdgpu-install_6.3.60304-1_all.deb
root@ZBH:~# sudo apt install ./hsa-runtime-rocr4wsl-amdgpu_24.30-2127960.24.04_amd64.deb
root@ZBH:~# sudo apt install ./amdgpu-install_6.3.60304-1_all.deb

Then run the amdgpu installer to install ROCm:

root@ZBH:~# amdgpu-install -y --usecase=wsl,rocm,amf --opencl=rocr --vulkan=amdvlk,pro --no-dkms --accept-eula
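
Once the installer finishes, a quick look at the filesystem shows whether the pieces landed where the later steps expect them. This is only a sketch under a few assumptions: ROCm lives under /opt/rocm (the path used later in this post), WSL2 exposes the GPU as /dev/dxg, and the rocminfo tool may or may not be on PATH. The function name rocm_install_report is hypothetical.

```python
import shutil
from pathlib import Path


def rocm_install_report(rocm_root: str = "/opt/rocm", dxg_device: str = "/dev/dxg") -> dict:
    """Collect a few facts about the ROCm install without touching the GPU."""
    root = Path(rocm_root)
    lib_dir = root / "lib"
    if lib_dir.is_dir():
        hsa_libs = sorted(p.name for p in lib_dir.glob("libhsa-runtime64.so*"))
    else:
        hsa_libs = []
    return {
        "rocm_root_exists": root.is_dir(),
        # These are the files the later library-replacement step copies from.
        "hsa_runtime_libs": hsa_libs,
        # Assumption: rocminfo is only reported if it is reachable via PATH.
        "rocminfo_on_path": shutil.which("rocminfo") is not None,
        # Assumption: /dev/dxg is the GPU device node inside WSL2.
        "dxg_device_present": Path(dxg_device).exists(),
    }


if __name__ == "__main__":
    for key, value in rocm_install_report().items():
        print(f"{key}: {value}")
```

An empty hsa_runtime_libs list or a missing /dev/dxg is a hint that the install or the Windows-side driver needs another look before moving on.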

Install Miniconda3

root@ZBH:~# wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
root@ZBH:~# chmod a+x Miniconda3-latest-Linux-x86_64.sh
root@ZBH:~# ./Miniconda3-latest-Linux-x86_64.sh
root@ZBH:~# miniconda3/bin/conda init
root@ZBH:~# source ~/.bashrc
(base) root@ZBH:~#

Build the PyTorch environment

Create a conda environment, then install PyTorch into it with pip:

(base) root@ZBH:~# conda create -n pytorch python==3.12
(base) root@ZBH:~# conda activate pytorch
(pytorch) root@ZBH:~#
(pytorch) root@ZBH:~# pip3 install torch==2.3.0 torchvision==0.18.0 pytorch_triton_rocm==2.3.0 onnxruntime_rocm==1.18.0 tensorflow_rocm==2.16.2 -f https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/

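Replace the dynamic library

The torch wheel bundles its own libhsa-runtime64.so under torch/lib. Delete it and copy in the runtime that was installed under /opt/rocm/lib: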

(pytorch) root@ZBH:~# pip show torch | grep Location | awk -F ": " '{print $2}'
/root/miniconda3/envs/pytorch/lib/python3.12/site-packages
(pytorch) root@ZBH:~# cd /root/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/lib/
(pytorch) root@ZBH:~/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/lib# rm libhsa-runtime64.so*
(pytorch) root@ZBH:~/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/lib# cp /opt/rocm/lib/libhsa-runtime64.so.1.14.0 libhsa-runtime64.so
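
The same replacement can be done from Python, which makes it easier to repeat after reinstalling torch. The sketch below is a variation, not the exact commands above: it renames the bundled files to *.bak instead of deleting them, so the change can be undone. The function name replace_hsa_runtime is hypothetical, and the paths in the usage section are the ones from this post.

```python
import shutil
from pathlib import Path


def replace_hsa_runtime(torch_lib_dir: str, rocm_hsa_lib: str) -> Path:
    """Swap the HSA runtime bundled with torch for the one installed by ROCm.

    Bundled copies are renamed to *.bak rather than deleted.
    """
    lib_dir = Path(torch_lib_dir)
    source = Path(rocm_hsa_lib)
    if not source.is_file():
        raise FileNotFoundError(f"ROCm runtime not found: {source}")

    # Materialize the listing first, because we rename files while looping.
    for bundled in list(lib_dir.glob("libhsa-runtime64.so*")):
        if bundled.name.endswith(".bak"):
            continue  # left over from an earlier run
        bundled.rename(bundled.with_name(bundled.name + ".bak"))

    target = lib_dir / "libhsa-runtime64.so"
    # copy2 follows symlinks, so the real library content is copied.
    shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    # Paths taken from this post; adjust them to your own environment.
    torch_lib = "/root/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/lib"
    rocm_lib = "/opt/rocm/lib/libhsa-runtime64.so.1.14.0"
    if Path(torch_lib).is_dir() and Path(rocm_lib).is_file():
        print("installed", replace_hsa_runtime(torch_lib, rocm_lib))
    else:
        print("torch or ROCm library path not found; nothing changed")
```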

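Verification

Start the Python interactive interpreter, import torch, and print whether CUDA is available. If it prints True, congratulations: PyTorch now drives the AMD GPU through the same CUDA interface it uses for NVIDIA cards, so models written for the cuda device should run out of the box without changing a line of code. Performance may not match an NVIDIA card, but it is at least several times faster than the CPU.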

(pytorch) root@ZBH:~# python
Python 3.12.0 | packaged by Anaconda, Inc. | (main, Oct  2 2023, 17:29:18) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.cuda.is_available())
True
>>>
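
A True from is_available() only shows that a device was found. A slightly stronger check, sketched below, also runs a small matrix multiply on the GPU and compares it with the CPU result. The function name gpu_smoke_test, the matrix size, and the tolerance are illustrative choices, not part of the original setup.

```python
def gpu_smoke_test(size: int = 256) -> dict:
    """Run a small matrix multiply on the GPU and report what PyTorch sees."""
    try:
        import torch
    except (ImportError, OSError) as exc:
        # Covers a missing torch as well as shared-library load failures.
        return {"torch_installed": False, "error": str(exc)}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # torch.version.hip is a version string on ROCm builds, None otherwise.
        "hip_version": getattr(torch.version, "hip", None),
        "gpu_available": torch.cuda.is_available(),
    }
    if not report["gpu_available"]:
        return report

    report["device_name"] = torch.cuda.get_device_name(0)
    a = torch.randn(size, size, device="cuda")
    b = torch.randn(size, size, device="cuda")
    gpu_result = a @ b
    torch.cuda.synchronize()  # wait for the GPU kernel to finish
    # Compare against the same product computed on the CPU.
    cpu_result = a.cpu() @ b.cpu()
    report["matmul_matches_cpu"] = torch.allclose(
        gpu_result.cpu(), cpu_result, rtol=1e-3, atol=1e-2
    )
    return report


if __name__ == "__main__":
    for key, value in gpu_smoke_test().items():
        print(f"{key}: {value}")
```

On a working setup the report should show a non-empty hip_version, the name of the AMD card, and matmul_matches_cpu set to True.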

Troubleshooting

# A small Anaconda issue
Importing torch may fail with a GLIBCXX version error like this:

python
Python 3.10.0 (default, Mar 3 2022, 09:58:08) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/__init__.py", line 237, in <module>
    from torch._C import * # noqa: F403
ImportError: /root/miniconda3/envs/pytorch/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /home/tao/anaconda3/envs/pytorch/lib/python3.10/site-packages/torch/lib/libhsa-runtime64.so)
>>> exit()

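----- Solution -----

The libstdc++ shared library that Miniconda or Anaconda ships is older than the one in the WSL distribution, so it lacks the symbol version that the replaced libhsa-runtime64.so needs. This is easy to fix.

Use the locate command to find the libstdc++ library of your Ubuntu system, then overwrite the copy in the Miniconda pytorch environment with it. The original files are kept as .bak backups.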

(pytorch) root@ZBH:~# apt install plocate
(pytorch) root@ZBH:~# updatedb --prunepaths='/mnt'
(pytorch) root@ZBH:~# locate libstdc++.so.6
(pytorch) root@ZBH:~# mv /root/miniconda3/envs/pytorch/lib/libstdc++.so.6 /root/miniconda3/envs/pytorch/lib/libstdc++.so.6.bak
(pytorch) root@ZBH:~# mv /root/miniconda3/envs/pytorch/lib/libstdc++.so.6.0.29 /root/miniconda3/envs/pytorch/lib/libstdc++.so.6.0.29.bak
(pytorch) root@ZBH:~# cp /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /root/miniconda3/envs/pytorch/lib/libstdc++.so.6
(pytorch) root@ZBH:~# cp /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33 /root/miniconda3/envs/pytorch/lib/libstdc++.so.6.0.29
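
To confirm that a given libstdc++ file actually provides the version named in the error, its GLIBCXX tags can be listed by scanning the binary for them, much like running strings and grepping for GLIBCXX. This is a rough sketch: the names glibcxx_versions and provides are hypothetical, and it relies on plain byte matching rather than parsing the ELF symbol-version tables.

```python
import re
from pathlib import Path

# Matches tags such as GLIBCXX_3.4 or GLIBCXX_3.4.32 inside the binary.
_GLIBCXX_TAG = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(lib_path: str) -> list:
    """Return the sorted GLIBCXX version tags found in a libstdc++ file."""
    data = Path(lib_path).read_bytes()
    found = {
        tuple(int(part) for part in match.group(1).split(b"."))
        for match in _GLIBCXX_TAG.finditer(data)
    }
    return sorted(found)


def provides(lib_path: str, required: str) -> bool:
    """True if the library contains the given tag, e.g. required='3.4.32'."""
    wanted = tuple(int(part) for part in required.split("."))
    return wanted in glibcxx_versions(lib_path)


if __name__ == "__main__":
    # Paths taken from this post; adjust them to your own environment.
    candidates = [
        "/root/miniconda3/envs/pytorch/lib/libstdc++.so.6",
        "/usr/lib/x86_64-linux-gnu/libstdc++.so.6",
    ]
    for candidate in candidates:
        if Path(candidate).exists():
            ok = provides(candidate, "3.4.32")
            print(f"{candidate}: GLIBCXX_3.4.32 {'present' if ok else 'missing'}")
        else:
            print(f"{candidate}: not found")
```

After the copy commands above, both paths should report the tag as present.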

Run an example

# Clone the code onto a physical disk
(pytorch) root@ZBH:~# cd /mnt/g/
(pytorch) root@ZBH:/mnt/g# git clone https://github.com/pytorch/examples.git
Cloning into 'examples'...

LICENSED UNDER CC BY-NC-SA 4.0