I am really curious how the link helped you fix the delay. I have the USB card as the default (achieved in another way, by disabling the broadcom module in /boot/config.txt), yet I still have a 5-second delay in AdvMame.
It is close but not perfect. There is still one pixel of fuzz around each block.
Another problem: the aspect ratio is wrong. Loop Master, for example, is 360×224 pixels, and it gets stretched to super-widescreen when it should render as 4:3.
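
To put numbers on the aspect problem, here is a rough sketch of the arithmetic in Python. It is not anything AdvMame does internally; the function names are made up for illustration, and the 1920×1080 screen size is only an assumption. It shows how non-square the game's pixels have to be for 360×224 to fill a 4:3 picture, and how large that picture could be on such a screen.

```python
from fractions import Fraction

# Hypothetical helpers for illustration only; not part of AdvMame.

def pixel_aspect(src_w, src_h, display_aspect=Fraction(4, 3)):
    """Width/height ratio each source pixel needs so the frame fills display_aspect."""
    return display_aspect / Fraction(src_w, src_h)

def fit_rect(screen_w, screen_h, display_aspect=Fraction(4, 3)):
    """Largest rectangle with display_aspect that fits inside the screen."""
    width = min(screen_w, int(screen_h * display_aspect))
    height = min(screen_h, int(width / display_aspect))
    return width, height

# 360x224 is the game resolution from the post; 1920x1080 is an assumed screen.
print(pixel_aspect(360, 224))   # 112/135, so pixels are narrower than they are tall
print(fit_rect(1920, 1080))     # (1440, 1080)
```

So on the assumed 16:9 screen, a correct 4:3 image would be 1440 pixels wide with black bars on either side, rather than filling all 1920.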