Preparation
Reference: https://github.com/ROCm/ROCm/issues/3563#issuecomment-2423673336
AMD ROCm installer (Ubuntu 24.04): https://repo.radeon.com/amdgpu-install/6.3.4/ubuntu/noble/
AMD WSL runtime package: https://repo.radeon.com/amdgpu/6.3.4/ubuntu/pool/main/h/
!!! Important !!!
WSL version: 2
Ubuntu version: 24.04, codename noble
ROCm driver version: 6.3.4
Miniconda version: 3
Every package you download and install must match the versions listed above. Do not mix versions, or your chances of success drop.
Requirements
One AMD GPU, at minimum a 6800 XT or better; mine is a 7900 GRE (overclocked edition).
Windows 10 22H2 or newer.
Downloads
First, install Ubuntu 24.04 from the Microsoft Store and switch to the root account.
Run a quick check to confirm the release:
root@ZBH:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.1 LTS
Release: 24.04
Codename: noble
Download and install the two driver packages:
root@ZBH:~# wget https://repo.radeon.com/amdgpu/6.3.4/ubuntu/pool/main/h/hsa-runtime-rocr4wsl-amdgpu/hsa-runtime-rocr4wsl-amdgpu_24.30-2127960.24.04_amd64.deb
root@ZBH:~# wget https://repo.radeon.com/amdgpu-install/6.3.4/ubuntu/noble/amdgpu-install_6.3.60304-1_all.deb
root@ZBH:~# sudo apt install ./hsa-runtime-rocr4wsl-amdgpu_24.30-2127960.24.04_amd64.deb
root@ZBH:~# sudo apt install ./amdgpu-install_6.3.60304-1_all.deb
Then run the amdgpu-install installer to set up the ROCm stack:
root@ZBH:~# amdgpu-install -y --usecase=wsl,rocm,amf --opencl=rocr --vulkan=amdvlk,pro --no-dkms --accept-eula
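Before continuing you can optionally sanity-check that the ROCm user-space stack landed under /opt/rocm. This check is my own addition, not part of the original steps; rocminfo ships with the ROCm packages and should list the card as an agent (the exact gfx name depends on your GPU):
root@ZBH:~# ls /opt/rocm/lib/libhsa-runtime64.so*
root@ZBH:~# rocminfo | grep -iE "gfx|Marketing Name"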
Install Miniconda3
root@ZBH:~# wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
root@ZBH:~# chmod a+x Miniconda3-latest-Linux-x86_64.sh
root@ZBH:~# ./Miniconda3-latest-Linux-x86_64.sh
root@ZBH:~# miniconda3/bin/conda init
root@ZBH:~# source ~/.bashrc
(base) root@ZBH:~#
Build the PyTorch environment
Install PyTorch with pip:
(base) root@ZBH:~# conda create -n pytorch python==3.12
(base) root@ZBH:~# conda activate pytorch
(pytorch) root@ZBH:~#
(pytorch) root@ZBH:~# pip3 install torch==2.3.0 torchvision==0.18.0 pytorch_triton_rocm==2.3.0 onnxruntime_rocm==1.18.0 tensorflow_rocm==2.16.2 -f https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/
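Purely as a sanity check (not in the original write-up), you can confirm that the ROCm builds of these wheels were picked up from the AMD index rather than plain PyPI; the exact version suffixes on your machine may differ, so treat the output as illustrative:
(pytorch) root@ZBH:~# pip list | grep -iE "torch|rocm|onnxruntime|tensorflow"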
Replace the shared library
(pytorch) root@ZBH:~# pip show torch | grep Location | awk -F ": " '{print $2}'
/root/miniconda3/envs/pytorch/lib/python3.12/site-packages
(pytorch) root@ZBH:~# cd /root/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/lib/
(pytorch) root@ZBH:~/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/lib# rm libhsa-runtime64.so*
(pytorch) root@ZBH:~/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/lib# cp /opt/rocm/lib/libhsa-runtime64.so.1.14.0 libhsa-runtime64.so
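Optionally, list the directory afterwards to confirm that the only libhsa-runtime64.so left is the copy taken from /opt/rocm:
(pytorch) root@ZBH:~/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/lib# ls -l libhsa-runtime64.so*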
Verify
Open a Python interactive shell, import the torch library, and print the CUDA availability flag. If it shows True, congratulations: PyTorch now treats your AMD GPU as if it were an NVIDIA one, and models generally run out of the box without changing a single line of code. Performance may not match an NVIDIA card, but it is still several times faster than the CPU.
(pytorch) root@ZBH:~# python
Python 3.12.0 | packaged by Anaconda, Inc. | (main, Oct 2 2023, 17:29:18) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.cuda.is_available())
True
>>>
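If you want to go beyond the availability flag, a small compute test along the following lines (my addition, not part of the original guide) confirms that kernels really execute on the GPU and shows which device PyTorch sees:
(pytorch) root@ZBH:~# python -c "import torch; print(torch.cuda.get_device_name(0)); x = torch.rand(1024, 1024, device='cuda'); print((x @ x).sum().item())"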
Troubleshooting
# A small Anaconda quirk
If a test import of the torch package fails with a GLIBCXX error like the one below:
python
Python 3.10.0 (default, Mar 3 2022, 09:58:08) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
File "", line 1, in
File "/root/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/init.py", line 237, in
from torch._C import * # noqa: F403
ImportError: /root/miniconda3/envs/pytorch/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /home/tao/anaconda3/envs/pytorch/lib/python3.10/site-packages/torch/lib/libhsa-runtime64.so)
exit()
----- Fix -----
The libstdc++ shipped inside the Miniconda/Anaconda environment is older than the one the WSL Ubuntu system (and the ROCm runtime) expects, so the versioned GLIBCXX symbol is missing. This is easy to fix:
Use the locate command to find the system libstdc++ library under Ubuntu, then replace the corresponding library inside the Miniconda pytorch environment with it.
(pytorch) root@ZBH:~# apt install plocate
(pytorch) root@ZBH:~# updatedb --prunepaths='/mnt'
(pytorch) root@ZBH:~# locate libstdc++.so.6
(pytorch) root@ZBH:~# mv /root/miniconda3/envs/pytorch/lib/libstdc++.so.6 /root/miniconda3/envs/pytorch/lib/libstdc++.so.6.bak
(pytorch) root@ZBH:~# mv /root/miniconda3/envs/pytorch/lib/libstdc++.so.6.0.29 /root/miniconda3/envs/pytorch/lib/libstdc++.so.6.0.29.bak
(pytorch) root@ZBH:~# cp /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /root/miniconda3/envs/pytorch/lib/libstdc++.so.6
(pytorch) root@ZBH:~# cp /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33 /root/miniconda3/envs/pytorch/lib/libstdc++.so.6.0.29
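After the swap, importing torch in the same environment should no longer trip over the missing GLIBCXX symbol; re-running the import (my own extra step) is the quickest way to confirm:
(pytorch) root@ZBH:~# python -c "import torch; print(torch.__version__)"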
Run an example
# Clone the code onto a physical drive (a Windows drive mounted under /mnt)
(pytorch) root@ZBH:~# cd /mnt/g/
(pytorch) root@ZBH:/mnt/g# git clone https://github.com/pytorch/examples.git
Cloning into 'examples'...
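From here any of the stock examples should run unchanged. Assuming the repository still keeps the MNIST sample under examples/mnist with its own requirements.txt (that layout is an assumption on my part), something like this will train on the GPU:
(pytorch) root@ZBH:/mnt/g# cd examples/mnist
(pytorch) root@ZBH:/mnt/g/examples/mnist# pip install -r requirements.txt
(pytorch) root@ZBH:/mnt/g/examples/mnist# python main.py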