Home Making an AMDGPU debugger part II - The Devk
Post
Cancel

Making an AMDGPU debugger part II - The Devk

Intro

After Part I, the plan was formed, but there is an issue - stopping a wave on the graphics ring will hang the GPU. This in turn will also hang the desktop on Linux currently, so it is not feasible to have the debugger running on the same computer (unless perhaps a second GPU is used or the debugged shader is a compute shader running on a compute ring). This might seem like a wrench in the works, but fortunately, I have recently acquired the thing that I think is currently the ultimate development kit: the Steam Deck.

Why is the Deck such a nice option?

  • Recent AMD GPU (Van Gogh / gfx10.3 / RDNA2)
  • Comes preinstalled with Linux, easily remotable
  • Blazing fast reboot
  • Compact enough that it fits below your monitor!

After just a bit of setup, we can get a good development experience going, with fast reboots, which is important if we want to develop through many GPU hangs (and boy, did I have a few!).

Unimportant meme about rebooting the Deck This post is really dry, so I added some memes.

In this post I want to share my tips on turning the Deck into the Devk, which is hopefully useful to some, but ultimately skippable if you are only interested in the GPU bits.

Remote setup

For remote development, we can use ssh. For me just using a terminal is a bit too puritanical, so I would recommend setting up VSCode for this purpose (or a similar remote dev environment). It is also recommend to set up ssh keys to avoid having to do password authentication each time.

You will also need to set up your Deck for development1:

  1. create a password for the deck user (passwd)
  2. disable the read-only filesystem (sudo steamos-readonly disable)
  3. initialise pacman keyring (sudo pacman-key --init)
  4. populate the pacman keyring with the default Arch Linux keys (sudo pacman-key --populate archlinux)
  5. optional, but recommended step change the boot DE to be plasma (desktop mode), instead of gamescope2:
1
2
3
[Autologin]
-Session=gamescope-wayland.desktop
+Session=plasma.desktop

Note that upgrading your OS will undo some of these steps, including removing packages.

Mesa development environment

For getting up to speed with mesa building, follow the instructions here - thanks to @pixelcluster for setting me up with the instructions for this bit! Unfortunately, it seems like the Deck ships with some development packages installed, but with the headers not in place - I recommend reinstalling all the packages that are already installed.

Remember to set up mesa-run.sh from the instructions and then you can just do mesa-run.sh target_pgm. I also set up a handy mesa-run-gdb.sh, which just passes the arguments to gdb instead:

1
2
3
4
5
6
MESA=$HOME/mesa_dbg \
LD_LIBRARY_PATH=$MESA/lib64:$MESA/lib:$LD_LIBRARY_PATH \
LIBGL_DRIVERS_PATH=$MESA/lib64/dri:$MESA/lib/dri \
VK_ICD_FILENAMES=$MESA/share/vulkan/icd.d/radeon_icd.x86_64.json \
D3D_MODULE_PATH=$MESA/lib64/d3d/d3dadapter9.so.1:$MESA/lib/d3d/d3dadapter9.so.1 \
gdb "$@"

Increasing the amdgpu lockup_timeout

When we submit some work to the GPU and the KMD waits for it, there is a timeout after which the GPU is considered hung. When this happens the KMD will attempt to reset the device. During my experimentation, the KMD usually did not successfully reset, so once the timeout is hit, it is better to reboot.

But if we want to controllably hang the GPU, we need to raise this timeout to avoid the device being reset below our feet. To do this, we need to add a new timeout value to kernel cmdline. The key-value pair of amdgpu.lockup_timeout=<timeout> needs to be added to /boot/efi/EFI/steamos/grub.cfg - you can do this by editing /etc/default/grub and running grub-mkconfig. The default timeout for graphics is 10000, which is in milliseconds.

1
GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet splash plymouth.ignore-serial-consoles tsc=directsync module_blacklist=tpm log_buf_len=4M amd_iommu=off amdgpu.gttsize=8128 spi_amd.speed_dev=1 audit=0 fbcon=vc:4-6 fbcon=rotate:1 amdgpu.lockup_timeout=100000"

A Screen Darkly

A problem we face now is that we want to run programs on Deck locally, but we don’t want to actually touch the Deck itself. An easy way of accomplishing this is using the utility screen, which allows running a local terminal session, that can be interacted with remotely. Create a script on desktop with:

1
xterm -hold -e "screen -xR ff"

And then tapping on this shortcut will open a terminal with a screen session running named ff. You can also make this start automatically of course - or just shut the Deck down cleanly once for it to start it back up automatically.

Once you have the screen session ongoing, you can just write to that session by using the screen commmand stuff. For example for running our target app:

1
screen -r ff -x -X stuff "mesa-run.sh ./target_app\n"

Note the \n - this is because we really are just remotely typing into the local shell. And then we can use the same idea to kill the app by “pressing” Ctrl-C:

1
screen -r ff -x -X stuff "^C"

Compiling AMD ISA

For setting up the proof-of-concept debugger, we will use a trap handler written in AMD ISA. This has some drawbacks compared to aco compiling it for us, but it is faster to prototype. To compile AMD ISA, we set up a small convenience script that will use clang+llvm to compile, then pull out the .text section of the ELF file into a binary file called asmc.hex3.

1
2
3
clang -c -x assembler -target amdgcn-amd-amdhsa -mcpu=gfx1030 -o /tmp/asm.o $1
objdump -h asm.o | grep .text | awk '{print "dd if='asm.o' of='asmc.hex' bs=1 count=$[0x" $3 "] skip=$[0x" $6 "] status=none"}' | bash
rm /tmp/asm.o

Unimportant meme about writing ISA The only caveat to this section.

To check if everything went well, it is possible to dump the resulting binary file and disassemble it using llvm-mc:

1
hexdump -v -e '/1 "0x%02X "' $1 | llvm-mc -arch=amdgcn -mcpu=gfx1030 -disassemble

Conclusion

With Devk set up, in the next post we get into the nitty-gritty, and figure out how to make trap handlers work!

Footnotes

This post is licensed under CC BY 4.0 by the author.