Presenting game benchmark results

The performance of a hardware configuration is often evaluated with game benchmarks, and the results are usually presented as "average frames per second" (fps) numbers. However, game players care about some frames more than others; in particular, they care more about the absence of slow frames than about speeding up already-fast frames. Therefore, some people show the minimum frame rate instead of the average frame rate. However, that reflects the time for just a single frame, which may not be very representative of the game experience either.
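A small worked example may make the difference between the two numbers concrete. The sketch below assumes the per-frame durations are available as a list in milliseconds; the function names and the sample data are hypothetical and only serve as illustration.

```python
def average_fps(frame_times_ms):
    """Average frame rate: number of frames divided by total elapsed time."""
    total_seconds = sum(frame_times_ms) / 1000.0
    return len(frame_times_ms) / total_seconds


def minimum_fps(frame_times_ms):
    """Minimum frame rate: determined by the single slowest frame."""
    return 1000.0 / max(frame_times_ms)


# Hypothetical run: 99 frames of 10 ms each and one 200 ms stall.
frame_times_ms = [10.0] * 99 + [200.0]

print(round(average_fps(frame_times_ms), 1))  # about 84 fps, the stall is barely visible
print(minimum_fps(frame_times_ms))            # 5 fps, determined by one frame only
```

Neither number says how much of the playing time was spent at an acceptable frame rate.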

What the players are probably more interested in is: What percentage of the time is the frame rate above the minimal desired frame rate? What is the frame rate that is exceeded 50% or 95% of the time?

These questions can be answered by recording the times for the individual frames (in Windows 2K/XP with Fraps), then sorting the frame times and drawing a graph of the result, like the sorted graph below. Just look at where the line crosses the minimal desired frame rate, and where it is after 50% or after 5% of the time.
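The same questions can also be answered numerically. The following is a minimal sketch, assuming the per-frame durations have already been extracted into a list in milliseconds (the recording tool's file format is not handled here). The function names are hypothetical, and the percentages are weighted by time rather than by frame count, which is one reading of "percentage of the time".

```python
def fraction_of_time_at_or_above(frame_times_ms, target_fps):
    """Fraction of the run time spent in frames at least as fast as target_fps."""
    limit_ms = 1000.0 / target_fps
    total = sum(frame_times_ms)
    fast = sum(t for t in frame_times_ms if t <= limit_ms)
    return fast / total


def fps_met_for(frame_times_ms, fraction):
    """Frame rate that is met or exceeded for at least `fraction` of the run time."""
    total = sum(frame_times_ms)
    elapsed = 0.0
    for t in sorted(frame_times_ms):  # fastest frames first
        elapsed += t
        if elapsed >= fraction * total:
            return 1000.0 / t
    return 1000.0 / max(frame_times_ms)


# Hypothetical data: half of the run time at 100 fps, half at 50 fps.
frame_times_ms = [10.0] * 50 + [20.0] * 25

print(fraction_of_time_at_or_above(frame_times_ms, 60))  # share of time at 60 fps or better
print(fps_met_for(frame_times_ms, 0.50))                 # frame rate met half of the time
print(fps_met_for(frame_times_ms, 0.95))                 # frame rate met 95% of the time
```

Weighting by time matters because a slow frame occupies more of the run than a fast one; counting frames instead would understate the slow stretches.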

This graph shows two runs of the same timedemo (ONS-Dria from nVnews) on the same machine (using a Core 2 Duo E6600 and a Radeon X850XT); as you can see, the two lines are almost identical, showing that this way of presenting the results is relatively immune to variations between runs.

What this graph does not show you is whether the slow frames occur all together or are distributed throughout the run. For this you can use a simple frametimes graph that just shows the frames in their original order (instead of sorted). Alternatively, one could produce a graph that shows the average or median across 3 or 5 frames, or across, say, 100 ms.
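The two smoothing variants mentioned above could be sketched as follows. This is an illustration under assumptions, not the method used for the graphs on this page: the input is a list of per-frame durations in milliseconds, the function names are hypothetical, and the window is simply shortened at the start and end of the run.

```python
from statistics import median


def rolling_median(frame_times_ms, window=5):
    """Median frame time over a window of frames centred on each frame."""
    half = window // 2
    smoothed = []
    for i in range(len(frame_times_ms)):
        lo = max(0, i - half)
        hi = min(len(frame_times_ms), i + half + 1)
        smoothed.append(median(frame_times_ms[lo:hi]))
    return smoothed


def fps_per_time_bucket(frame_times_ms, bucket_ms=100.0):
    """Average fps over consecutive stretches of at least bucket_ms each."""
    result = []
    frames = 0
    elapsed = 0.0
    for t in frame_times_ms:
        frames += 1
        elapsed += t
        if elapsed >= bucket_ms:
            result.append(1000.0 * frames / elapsed)
            frames, elapsed = 0, 0.0
    if frames:  # leftover frames that did not fill a whole bucket
        result.append(1000.0 * frames / elapsed)
    return result


# A single 50 ms spike among 10 ms frames disappears in the median of 3.
print(rolling_median([10.0, 10.0, 50.0, 10.0, 10.0], window=3))
# Ten fast frames followed by two slow ones give two 100 ms buckets.
print(fps_per_time_bucket([10.0] * 10 + [50.0] * 2))
```

The median suppresses isolated slow frames entirely, while the time-bucket average keeps their effect but spreads it over the bucket; which is preferable depends on whether single slow frames are considered noticeable.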

In order to see both kinds of information in a single graph, you can draw the unsorted and the sorted line in a combined graph (to avoid clutter, this graph shows only data from the red run).
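The data for such a combined graph could be prepared as in the sketch below, which leaves the actual drawing to a plotting tool of your choice. It assumes per-frame durations in milliseconds, puts the percentage of elapsed run time on the horizontal axis, and sorts the slowest frames first, so that the value at 5% is the frame rate exceeded 95% of the time. These axis choices and the function names are assumptions for illustration, not a description of how the graphs on this page were made.

```python
def time_axis_points(frame_times_ms):
    """List of (percent of run time elapsed, fps of the frame), in the given order."""
    total = sum(frame_times_ms)
    points = []
    elapsed = 0.0
    for t in frame_times_ms:
        elapsed += t
        points.append((100.0 * elapsed / total, 1000.0 / t))
    return points


def combined_graph_data(frame_times_ms):
    """Return the unsorted line and the sorted line over the same time axis."""
    unsorted_line = time_axis_points(frame_times_ms)
    # Slowest frames first: the left end of the sorted line shows the worst case.
    sorted_line = time_axis_points(sorted(frame_times_ms, reverse=True))
    return unsorted_line, sorted_line


# Hypothetical three-frame run, just to show the shape of the data.
unsorted_line, sorted_line = combined_graph_data([10.0, 40.0, 50.0])
print(unsorted_line)
print(sorted_line)
```

Because both lines share the same horizontal axis, they can be drawn directly on top of each other.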

References

Madkiller from 3DCenter has written a number of articles in German on this and similar topics: Average fps - wirklich das Maß aller Dinge?; Timedemos - Das Maß aller Dinge?; Timedemos - Das Maß aller Dinge? Part 2. One important point he makes, which is beyond the scope of this page, is that timedemos are often not representative of real gaming in the amount of CPU load they generate.
Anton Ertl
Graph files: sorted.eps, sorted.pdf; frametimes.eps, frametimes.pdf; combined.eps, combined.pdf
