How does Tom's Hardware run its game benchmarks?

Take, for instance, this chart of Battlefield 3 GPU benchmarks: how do they get the numbers? I guess only the people at Tom's Hardware can answer this, or maybe they described their methods in an earlier post that I missed.

I know you can use Fraps to get semi-accurate numbers, but these will differ on each run-through because a million variables affect the rendered graphics, even if you perform "the same" actions every time (human movement differs, AI decisions differ, etc.). This was briefly discussed in this Stack Exchange thread I posted to yesterday.

Anyway, is there a way to get 100% consistent FPS benchmark numbers in BF3 or other games that do not have in-game record/playback functions?
  1. This is indeed interesting, but needs elaboration:

    We achieve consistent results by calculating the average of three scripted runs

    How does one execute scripted runs? What tools are used?
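
To make the "average of three runs" part of that quote concrete, here is a minimal Python sketch. It does not show how the scripted runs themselves are executed, which is the open question. It assumes you already have per-frame render times in milliseconds for each run, however they were captured. The function names (`average_fps`, `summarize_runs`) and the sample numbers are hypothetical and purely illustrative.

```python
from statistics import mean


def average_fps(frame_times_ms):
    """Average FPS for one run: frames rendered divided by elapsed seconds."""
    total_seconds = sum(frame_times_ms) / 1000.0
    return len(frame_times_ms) / total_seconds


def summarize_runs(runs):
    """Average the per-run FPS values and report how far apart the runs are."""
    per_run = [average_fps(frame_times) for frame_times in runs]
    return {
        "per_run": per_run,
        "mean": mean(per_run),
        "spread": max(per_run) - min(per_run),
    }


if __name__ == "__main__":
    # Hypothetical frame times (ms) for three runs; not real benchmark data.
    runs = [
        [16.0, 17.0, 15.0, 16.0],
        [16.5, 16.5, 16.0, 17.0],
        [15.5, 16.0, 16.5, 16.0],
    ]
    summary = summarize_runs(runs)
    print("FPS per run:", [round(fps, 2) for fps in summary["per_run"]])
    print("Mean FPS:", round(summary["mean"], 2))
    print("Spread between runs:", round(summary["spread"], 2))
```

Averaging reduces run-to-run noise but does not remove it; the spread between runs gives a rough idea of how repeatable the result is.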
