Hi all,
with the increasing popularity of neural-network-based tools, GPU acceleration becomes more and more important: otherwise the processing time for a single operation on a high-resolution image can easily exceed an hour. Currently the most prominent examples of such tools are Russell Croman's excellent processes for PixInsight (BlurXTerminator, StarXTerminator, etc.). Under the hood, tools of that kind often use the TensorFlow library, which provides various mechanisms for machine-learning-based calculations. It is well known that TensorFlow can use CUDA-enabled GPUs to accelerate such operations, often by one or more orders of magnitude. GPUs manufactured by Nvidia are therefore often favoured by those investing in hardware for deep-sky image processing.
However, what often seems to be overlooked is that TensorFlow also provides a backend for calculations based on ROCm, AMD's equivalent of CUDA. The only problem is that no prebuilt version of the library with ROCm support can simply be downloaded from the internet.
Nevertheless, it is possible to compile the library yourself with ROCm support enabled. That is exactly what I did, and I would like to share my experience with the procedure and the results.
Please do not take this as a comprehensive guide! I only want to demonstrate that Nvidia GPUs are not the only devices capable of this kind of acceleration.
Prerequisites: I am running Ubuntu 22.04 on my desktop machine, equipped with a Ryzen 9 5950X, 128 GB of DDR4 memory, and an AMD Radeon RX 6950 XT graphics card.
Outline of the procedure:
1. Install ROCm including development headers (see the amdgpu-install command)
2. Install bazel (for the build process)
3. Clone the official TensorFlow repository and check out version 2.14.1 (that's the one I got working)
4. Configure the build, disabling CUDA support, but enabling ROCm support
5. Start a monolithic build (with my processor this took more than an hour)
6. Copy the TensorFlow library, as well as the TensorFlow framework library, into the PixInsight directory, making sure to set the symbolic links accordingly
7. Make sure that the installed ROCm version is compatible with the running amdgpu driver (I initially ran a newer Linux kernel, which caused the graphics driver and ROCm to interfere with each other)
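The steps above can be sketched roughly as follows. This is only a sketch of what I did, not a copy-paste recipe: the amdgpu-install usecases, the Bazelisk install method, the Bazel target, and the PixInsight install path under /opt are assumptions based on my setup and may differ on yours.

```shell
# 1. ROCm with development headers (assumes AMD's amdgpu-install helper
#    is already available from the AMD repositories)
sudo amdgpu-install --usecase=rocm,rocmdev

# 2. Bazelisk fetches the exact Bazel version TensorFlow expects
sudo npm install -g @bazel/bazelisk

# 3. TensorFlow sources at the version that worked for me
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout v2.14.1

# 4. Interactive configuration: answer "N" when asked about CUDA
#    and "y" when asked about ROCm
./configure

# 5. Monolithic build of the shared library (over an hour on my 5950X)
bazel build --config=opt //tensorflow:libtensorflow.so

# 6. Copy both libraries into the PixInsight installation and recreate
#    the version symlinks (the target path is an example)
sudo cp bazel-bin/tensorflow/libtensorflow.so.2.14.1 \
        bazel-bin/tensorflow/libtensorflow_framework.so.2.14.1 \
        /opt/PixInsight/bin/lib/
cd /opt/PixInsight/bin/lib
sudo ln -sf libtensorflow.so.2.14.1 libtensorflow.so.2
sudo ln -sf libtensorflow_framework.so.2.14.1 libtensorflow_framework.so.2
```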
I recommend setting the environment variable TF_FORCE_GPU_ALLOW_GROWTH to true; otherwise TensorFlow allocates the entire VRAM at startup, which causes a lot of stuttering in the GUI.
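For example, exporting the variable in the shell before launching PixInsight makes the setting visible to the bundled TensorFlow (the launch path in the comment is just an example from my install):

```shell
# Let TensorFlow grow its VRAM allocation on demand instead of
# reserving all of it at startup
export TF_FORCE_GPU_ALLOW_GROWTH=true

# Launch PixInsight from the same shell so it inherits the variable,
# e.g.: /opt/PixInsight/bin/PixInsight
```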
If everything goes according to plan, PixInsight launches without errors. Running StarXTerminator or StarNET++ will now use the GPU for the calculation, massively speeding up the process:
2023-12-10 21:10:02.503992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 15810 MB memory: -> device: 0, name: AMD Radeon RX 6950 XT, pci bus id: 0000:11:00.0
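Independently of PixInsight, a quick way to check whether a ROCm-enabled TensorFlow build detects the GPU is to ask it for its device list from Python. This assumes the same build is importable from Python (e.g. via a wheel built from the same source tree), which is not strictly required for the PixInsight setup itself:

```shell
# Should list the Radeon card as a physical GPU device if the
# ROCm backend is working
python3 -c 'import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'
```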
A screenshot of my desktop when running StarXTerminator using my AMD GPU:

A comparison of the CPU- and GPU-based execution times for the same image taken with a ZWO ASI6200MM Pro (with Generate Star Image, Unscreen Stars, and Large Overlap enabled):
- CPU (16-core AMD Ryzen 9 5950X): 1h19m58s
- GPU (AMD RX 6950 XT): 3m11s
I think this can be called a significant speed-up.
Clear skies,
Philipp