To debug my bare-metal code, I found that I could use qemu with the following command, but only after a simple modification to qemu:
qemu-system-aarch64 -m 1024 -cpu cortex-a53 -machine raspi3 -smp 4 -bios kernel8.img
Without the modification, qemu loads my code (the "bios") at 0x80000. There's no option to change this, but editing the definition of FIRMWARE_ADDR_3 in qemu-3.0.0/hw/arm/raspi.c and recompiling works.
Here is a very hacky way to follow ARM code running in qemu, by running gdb on qemu itself (built with debugging enabled, naturally):
Set a breakpoint:
break get_phys_addr_lpae
There, set a watchpoint on the (location of the) pc:
watch -l env->regs[15]
or, for AArch64 code (where qemu keeps the pc in env->pc rather than env->regs[15]):
watch -l env->pc
When a watchpoint triggers, use the "up" command to move to a frame where "env" is in scope, then:
print /x env->regs
I'm sure there are better ways, but it works. (If you want to follow each instruction, start qemu with the -singlestep option; otherwise the watchpoint will only trigger every few instructions.)
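
For convenience, the gdb steps above could be collected into a command file. This is only a rough sketch, assuming AArch64 guest code (hence env->pc) and a made-up file name, follow-pc.gdb. The "run" line is my addition so that gdb starts qemu itself.

```shell
# Write the gdb commands from the steps above to a command file.
# "follow-pc.gdb" is a hypothetical name, not something qemu or gdb expects.
cat > follow-pc.gdb <<'EOF'
# stop at the first call to get_phys_addr_lpae
break get_phys_addr_lpae
run
# env is accessible at this breakpoint, per the steps above
watch -l env->pc
EOF

# Then start the debug build of qemu under gdb, for example:
#   gdb -x follow-pc.gdb --args qemu-system-aarch64 -m 1024 -cpu cortex-a53 \
#       -machine raspi3 -smp 4 -bios kernel8.img -singlestep
```

Once the watchpoint is set, "continue" runs to the next change of the pc, and "up" followed by "print /x env->regs" works as described above.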