Forum Replies Created

Viewing 11 posts - 1 through 11 (of 11 total)
  • brunnis
    Participant
    Post count: 11

    I agree that there are further tests to be done, but I’m not sure I can spend much more time on this. It was rather disheartening to learn that not even a powerful PC running RetroArch under KMS makes any meaningful difference to input lag.

    I will probably perform some additional testing on Windows 10, since that’s slightly less cumbersome for me to do. Any interesting findings will then be carried over to my RetroPie setup for testing (if they apply).

    Since general interest here seems low, please see the Libretro forums thread for further discussion on the matter: http://libretro.com/forums/showthread.php?t=5428

    brunnis
    Participant
    Post count: 11

    Okay guys, I’ve done a lot of testing since I last wrote. I’ve focused on a slightly more in-depth comparison of RetroPie on the Raspberry Pi 3 against RetroArch on the PC, to see how the Pi stacks up against the best-case scenario.

    First off, the hardware specs:

    [b]Windows PC:[/b]

    Core i7-6700K (Skylake)
    Radeon R9 390 8GB
    Windows 10 64-bit

    [b]Linux PC:[/b]

    Dell Latitude E5450
    Core i5-5300U (Broadwell)
    Integrated HD Graphics 5500
    Ubuntu 15.10 64-bit

    [b]Monitor (used for all tests):[/b] HP Z24i LCD monitor with 1920×1200 resolution. This monitor supposedly has almost no input lag (~1 ms), but I’ve only seen one test and I haven’t been able to verify this myself.

    [b]Gamepad (used for all tests):[/b] CIRKA USB SNES replica

    I tested input lag in NES and SNES emulators and used the following two games:
    [ul]
    [li]Mega Man 2[/li]
    [li]Super Mario World 2: Yoshi’s Island[/li]
    [/ul]
    [b]Test methodology:[/b]

    I filmed the monitor and gamepad with a Canon EOS 70D in 1280×720 mode at 60 FPS. I filmed while jumping repeatedly (approximately 30 times for each test) and then analyzed the film clips frame by frame to get the average input lag.
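
    A minimal sketch of how the per-test figure could be derived from the frame-by-frame analysis, assuming each jump yields one count of camera frames between the button press and the first visible movement. The function name and the sample values are hypothetical and only illustrate the arithmetic; they are not measured data.

```python
from collections import Counter
from statistics import mean


def summarize_lag(frame_counts):
    """Return (mean, most common value) for per-jump lag counts in frames."""
    most_common_value, _ = Counter(frame_counts).most_common(1)[0]
    return mean(frame_counts), most_common_value


# Hypothetical counts for eight jumps, NOT taken from the recordings.
example_counts = [6, 6, 7, 6, 5, 6, 7, 6]
avg_frames, typical_frames = summarize_lag(example_counts)
print(f"mean {avg_frames:.2f} frames, most common {typical_frames} frames")
```

    Reporting the most common value alongside the mean matches results written as "7 (often 8) frames".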

    [b]Results:[/b]

    [b]Raspberry Pi 3 + RetroPie 3.6[/b]

    FCEUmm: 7 frames
    Nestopia: 6 frames
    snes9x-next: 8 frames

    [b]Comments:[/b] Nestopia was pretty consistently 1 frame quicker than FCEUmm.

    [b]Windows 10 + RetroArch 1.3.2[/b]

    video_hard_sync off:

    Nestopia: 7 (often 8) frames
    snes9x-next: 9 (often 10) frames

    video_hard_sync on:

    Nestopia: 4 frames
    snes9x-next: 6 (often 7) frames
    bsnes-mercury-balanced: 6 (often 5) frames

    [b]Comments:[/b] The Xbox Game DVR feature was disabled in the Windows Xbox app. Having this feature enabled has been reported to add input lag, but I didn’t really investigate it.

    [b]Ubuntu 15.10 + RetroArch 1.3.3 in KMS mode[/b]

    Nestopia: 5 frames
    bsnes-mercury-balanced: 7 frames

    [b]Comments:[/b] video_hard_sync was left off. Enabling it had no effect on performance.

    [b]Conclusions:[/b] There are a few conclusions we can draw from these tests. First of all, Nestopia was pretty consistently 1 frame quicker than FCEUmm. SNES emulation seems to be approximately 2 frames slower than NES emulation with Nestopia (at least for the tested games). However, testing on Windows suggests that bsnes-mercury-balanced is quicker than snes9x-next. Where snes9x-next is around 2-3 frames slower than Nestopia, bsnes-mercury-balanced is around 1-2 frames slower. Unfortunately bsnes-mercury is not available on RetroPie (and probably wouldn’t run very well due to its higher requirements).

    Regarding platform differences: RetroPie is 1-2 frames faster than Windows 10 without video_hard_sync enabled. Once video_hard_sync is enabled, Windows 10 shaves off at least 3 frames of input lag for both NES and SNES emulation, providing the quickest response of all three tested platforms. The biggest surprise is perhaps that running RetroArch in Linux under KMS still doesn’t beat Windows in terms of input lag.

    So, to summarize, the final standings are:

    1. RetroArch under Windows 10: 4-6 frames (67-100 ms) of input lag
    2. RetroArch under Linux KMS: 5-7 frames (83-117 ms) of input lag
    3. RetroPie: 6-8 frames (100-133 ms) of input lag
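
    For reference, the frame-to-millisecond conversion used in the standings can be sketched as follows, assuming an exact 60 Hz refresh rate (the real consoles run at very slightly different rates). The helper name is made up for illustration.

```python
def frames_to_ms(frames, refresh_hz=60.0):
    """Convert a number of video frames to milliseconds at the given refresh rate."""
    return frames * 1000.0 / refresh_hz


# Frame ranges taken from the standings above.
standings = {
    "RetroArch under Windows 10": (4, 6),
    "RetroArch under Linux KMS": (5, 7),
    "RetroPie": (6, 8),
}

for platform, (low, high) in standings.items():
    print(f"{platform}: {frames_to_ms(low):.0f}-{frames_to_ms(high):.0f} ms")
```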

    Although Windows is the quickest, I still wouldn’t call it lightning fast, especially not in the SNES case. Add a couple of frames for the average TV and you’re already at 133 ms of input lag. That may be perfectly fine for modern shooters, but it’s not ideal for super-fast platformers. My Samsung plasma TV appears to have around 4 frames of input lag, which combined with RetroPie results in a total of 200 ms for SNES emulation. That is definitely noticeable and makes, for example, Super Mario World a lot harder than it used to be.

    One interesting development that may change things slightly for the Raspberry Pi is the ongoing development of a fully open source OpenGL graphics stack. Maybe someone more knowledgeable than me can chime in on that and shed some light on any possible ramifications? Given the fact that a PC running Linux under KMS can’t shave off more than 1 frame of input lag compared to the current version of RetroPie, I’m not overly optimistic.

    [b]Disclaimer:[/b]

    – There’s of course some uncertainty in the measurements due to recording at just 60 FPS. However, with around 30 attempts for each test, a pretty clear trend could be seen.
    – It would have been interesting to test with more NES and SNES games, but I didn’t have time for that.
    – I can’t guarantee that the HP Z24i doesn’t add some measurable latency.
    – Different display hardware (AMD, Intel, Nvidia) could very well affect the outcome.
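
    To put a rough number on the first point, here is a back-of-the-envelope sketch. It assumes the only error source is the camera’s frame quantization, that each reading is off by an amount uniformly distributed across one camera frame, and that the attempts are independent. Real run-to-run variation is not modeled, so the result is best read as a lower bound on the uncertainty.

```python
import math


def quantization_standard_error_ms(n_attempts, camera_fps=60.0):
    """Standard error (ms) of the mean of n readings under the uniform-error assumption."""
    frame_ms = 1000.0 / camera_fps
    # Standard deviation of a uniform distribution with width frame_ms.
    single_reading_std_ms = frame_ms / math.sqrt(12.0)
    return single_reading_std_ms / math.sqrt(n_attempts)


error_ms = quantization_standard_error_ms(30)
print(f"standard error from quantization alone: {error_ms:.2f} ms")
```

    Under these assumptions, averaging around 30 attempts pushes the quantization error well below one frame, which is consistent with a clear trend showing up despite the 60 FPS recording.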

    brunnis
    Participant
    Post count: 11

    [quote=119865]video_hard_sync is not used on the rpi[/quote]
    Kind of thought so.

    By the way, did one additional test:

    [b]gl + vsync = false:[/b] 9 frames

    Since my plasma seems to have 4 frames of input lag (although that’s not confirmed), that would leave 5 frames of latency (approximately 83 ms) for RetroPie itself. However, playing without vsync is curiously stuttery on this system.

    [quote=119874]there’s some tips here: https://www.reddit.com/r/emulation/comments/41okgr/can_someone_help_me_reduce_input_lag_for_retroarch/ (user libretro should know what they’re talking about!)[/quote]
    Yep, I found that as well! Some interesting stuff there.

    [quote=119874]it’s also worth experimenting with the input drivers. i think it defaults to udev, but linuxraw might be faster? try the others also.

    there is also a new input polling option in settings > input called “poll type behavior” that may help when set to “late” vs “early” or “normal”.

    it’s a lot of experimentation but i’d love a sort of comprehensive list of what does/doesn’t help :) would make a good wiki page…[/quote]
    Yep, I’ll have a look at those options as well. I’ll hopefully have time for it this weekend.

    brunnis
    Participant
    Post count: 11

    Okay, the results are in! Tested on my Samsung plasma, since the much better pixel response time makes it easier to spot when a change is happening on the screen. Tested with Yoshi’s Island again and filmed in 60 FPS with a Canon EOS 70D.

    [b]gl driver:[/b] 12 frames

    [b]dispmanx:[/b] 11 frames

    [b]dispmanx + video_hard_sync = true:[/b] 11 frames

    I’d say the dispmanx driver was rather consistently 1 frame faster, but it’s still close enough that I wouldn’t bet my life on there being any real difference.

    Disappointing results, of course. I wonder what I should test next…

    brunnis
    Participant
    Post count: 11

    Funny you should mention it… That’s just what I was planning on doing tonight when I get home from work. :) I will also be using a Canon EOS 70D which is capable of 60 FPS recording, so results should be a bit more accurate.

    I’ve read about dispmanx previously, but for some reason I forgot to test it yesterday.

    EDIT: Might have to postpone testing another day or so due to family stuff coming in between, but I will do the test as soon as possible.

    brunnis
    Participant
    Post count: 11

    Just thought I’d add some observations here as well. I notice input lag in RetroPie too. It’s not a big issue when I use it on a computer monitor, but it is an issue for some types of games when I use it on my Samsung plasma.

    I used a very crude method of getting an idea of the input lag. I only had a 30 fps camera available and I used that to film while I performed quick taps on the controller and mouse I used for testing. I’ve taken the liberty of multiplying the number of frames I counted in my recorded material by two, since that better corresponds to the actual number of frames rendered by the source (since it runs at 60 FPS).
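
    The doubling described above is just a rate conversion. A small sketch, with hypothetical function names and the 30 FPS camera and 60 FPS source as defaults:

```python
def camera_to_source_frames(camera_frames, camera_fps=30.0, source_fps=60.0):
    """Convert a count of recorded camera frames to frames rendered by the source."""
    return camera_frames * source_fps / camera_fps


def camera_resolution_ms(camera_fps=30.0):
    """Duration of one camera frame, i.e. the finest step the recording can resolve."""
    return 1000.0 / camera_fps


# Illustrative input: 4 counted camera frames correspond to 8 source frames.
print(camera_to_source_frames(4))
print(f"{camera_resolution_ms():.1f} ms per camera frame")
```

    Since one camera frame spans two source frames at these rates, the figures from this crude method can only move in steps of 2 frames (about 33 ms).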

    The PCs were tested at the Windows desktop, recording the response from the built-in game controller configuration program and from double-clicking to select text in Notepad. [Granted, that’s a bit apples and oranges, so I’ve added some actual RetroArch measurements in an edit further down.]

    [b]Hardware & software[/b]

    – Raspberry Pi 3 with RetroPie 3.6 and Super Mario World 2: Yoshi’s Island
    – Core i7-6700K with Windows 10
    – Core i5-5300U with Windows 7
    – Samsung plasma TV
    – HP Z24i desktop monitor (24″, 1920×1200, IPS)

    [b]Results[/b]

    RetroPie + Samsung plasma TV: [b]12 frames (200 ms)[/b]

    RetroPie + HP Z24i: [b]8 frames (133 ms)[/b]

    Windows 10 + HP Z24i: [b]4 frames (67 ms)[/b]

    Windows 7 + HP Z24i: [b]2 frames (33 ms)[/b]

    I know of one test that has measured the HP monitor’s input lag and found it to be almost non-existent (less than 1 ms). That would put the plasma at around 60-70 ms, which sounds plausible. I tested many times and these results were quite consistent.
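
    The plasma estimate follows from the difference between the two RetroPie results. A sketch of that subtraction, assuming a 60 Hz refresh rate and taking 1 ms as an upper bound for the HP monitor’s own lag (the function and parameter names are made up for illustration):

```python
def estimate_display_lag_ms(total_frames, reference_total_frames,
                            reference_display_lag_ms=1.0, refresh_hz=60.0):
    """Estimate a display's lag from the same source measured on a reference display."""
    extra_frames = total_frames - reference_total_frames
    return extra_frames * 1000.0 / refresh_hz + reference_display_lag_ms


# 12 frames on the plasma vs. 8 frames on the HP Z24i, from the results above.
plasma_lag_ms = estimate_display_lag_ms(12, 8)
print(f"estimated plasma lag: {plasma_lag_ms:.0f} ms")
```

    This only holds if everything else in the chain (Pi, emulator, controller) behaves the same on both displays.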

    So, RetroPie on the HP monitor responds approximately 100 ms slower than the Windows 7 laptop using the same HP monitor. That’s a very significant difference, but the Windows 7 laptop also wasn’t running an emulator.

    I love RetroPie, but this issue (which, granted, may have nothing to do with RetroPie itself) kind of kills it for me. A lot of the games I play/want to play on it are fast paced and rely on exact timing to succeed. And I have definitely noticed that they’re harder because of this.

    Is there any work being done on attempting to profile this and come up with a solution?

    [b]EDIT:[/b]

    Did some more testing with the Windows 7 machine, using RetroArch and Yoshi’s Island:

    [b]8 frames (133 ms)[/b] with Snes9x Next and video_hard_sync = false
    [b]6 frames (100 ms)[/b] with Snes9x Next and video_hard_sync = true
    [b]4-6 frames (67-100 ms)[/b] with bsnes (balanced) and video_hard_sync = true

    video_hard_sync did not seem to have any effect on RetroPie, but it did seem to shave off a couple of frames in the Windows case.

    What’s interesting here is that I could not come close to reproducing the input response from the Windows desktop (2 frames). Input response time (with video_hard_sync disabled) was actually the same as using RetroPie. Does the emulator really need that much time to output to the screen?

    [b]EDIT2:[/b]

    I just tested the input lag when typing at the command line in RetroPie. A typed character shows up in the next recorded frame, which means that input lag is 33 ms at the most.

    So, the Raspberry Pi and the PCs have similarly low input latency when just reacting to USB input (such as typing on a keyboard). All devices also seem similarly slow to react to input when running the SNES emulator, although video_hard_sync = true seems to help the PCs slightly.

    brunnis
    Participant
    Post count: 11

    [quote=118555]Well on a non-overclocked system it is a 30% processor increase.[/quote]
    No, it’s actually roughly a 60% increase. If it was just a clock frequency increase, you’d be right. However, the Pi 3 uses a different and improved CPU core (Cortex-A53), which by itself improves performance by roughly 25%. In some cases (and with recompiled code), the performance improvement should be even larger.
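
    The reasoning is that independent gains multiply rather than add. A quick sketch, assuming the Pi 2 runs at 900 MHz and the Pi 3 at 1200 MHz, and taking the rough 25% per-clock figure at face value:

```python
def compound_speedup(*factors):
    """Multiply independent speedup factors (1.25 means 25% faster)."""
    result = 1.0
    for factor in factors:
        result *= factor
    return result


pi2_mhz = 900    # assumed Pi 2 clock
pi3_mhz = 1200   # Pi 3 clock
clock_gain = pi3_mhz / pi2_mhz   # about 1.33
per_clock_gain = 1.25            # rough Cortex-A53 vs Cortex-A7 estimate

total_gain = compound_speedup(clock_gain, per_clock_gain)
print(f"combined: {(total_gain - 1) * 100:.0f}% faster")
```

    With these inputs the combined figure comes out somewhat above 60%, so "roughly 60%" is, if anything, on the conservative side.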

    brunnis
    Participant
    Post count: 11

    [quote=118514]https://www.element14.com/community/community/raspberry-pi/blog/2016/02/29/the-most-comprehensive-raspberry-pi-comparison-benchmark-ever
    [/quote]
    Very few of his comments regarding the speed improvement of the Pi 3 vs the Pi 2 seem to add up. His tables say one thing and his comments say another. How about this, taken from the article’s MemTester section:

    2 Model B 23 minutes 39.07 seconds
    3 Model B 8 minutes 37.078 seconds

    “In this case the Pi 3 is more than 50% faster than the Pi 2 at allocating and accessing RAM.”

    While technically true, the actual performance improvement is 174%. :-P
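
    For anyone who wants to check the figure, here is the arithmetic on the two MemTester times quoted above. The helper name is made up; the sketch also shows the "less time" figure, which is a different number from the speedup and easy to mix up with it.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


pi2_seconds = to_seconds(23, 39.07)    # 2 Model B
pi3_seconds = to_seconds(8, 37.078)    # 3 Model B

speedup = pi2_seconds / pi3_seconds
improvement_pct = (speedup - 1) * 100                     # how much faster
time_saved_pct = (1 - pi3_seconds / pi2_seconds) * 100    # how much less time

print(f"{speedup:.2f}x as fast, {improvement_pct:.0f}% faster, "
      f"{time_saved_pct:.0f}% less time")
```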

    brunnis
    Participant
    Post count: 11

    Official info: https://www.raspberrypi.org/blog/raspberry-pi-3-on-sale/

    Price is the same as the old one. 1.2GHz quad-core Cortex-A53 with 64-bit support, as speculated.

    brunnis
    Participant
    Post count: 11

    Well, the rumor ended up true. I just ordered my Pi 3! Should be arriving tomorrow. :-)

    No confirmation yet on the exact CPU core, but it is a 1.2GHz quad-core as rumored.

    brunnis
    Participant
    Post count: 11

    My best guess would be that they’re using a quad-core Cortex-A53 cluster. This should provide a rather sizeable performance increase over the Pi 2. The A53 is at least 30% faster than the A7 at the same frequency. Coupled with a 33% frequency increase, this would yield more than 70% better single-threaded performance. :)
