Test Suite And Methodology
We had to temporarily replace Dromaeo DOM with Acid3 because the former has issues with the WebKit-based browsers under Windows.
Another CSS test was added to balance out the Maze Solver CSS3 benchmark.
The Mozilla Hardware Acceleration Stress Test was replaced with WebVizBench, which has no upper limit on its score.
We've added a general responsiveness observation to the 40-Tab Memory Usage test.
Finally, we found a security test that is still relevant and is not passed by any of the contenders.
Our tests are no longer placed into the core, observation, dated, and quarantine groups. With the massive refresh to the benchmark suite and the introduction of composite scores to cover every category of testing, this is simply no longer necessary.
Web Browser Grand Prix Test Suite v11
The table below lists all 34 benchmarks (consisting of 66 individual tests) currently in our suite, along with the version number and link (where applicable), and the number of iterations performed.
|Benchmark Name||Iterations Performed|
|Performance Benchmarks (24 Benchmarks, 56 Tests)|
|Cold Startup Time: Single Tab||3|
|Hot Startup Time: Single Tab||3|
|Cold Startup Time: Eight Tabs||3|
|Hot Startup Time: Eight Tabs||3|
|Uncached Page Load Times (8 Test Pages)||5|
|Cached Page Load Times (8 Test Pages)||5|
|Mozilla Kraken v1.1||2|
|Google SunSpider v0.9.1 Mod||2|
|Futuremark Peacekeeper 2.0||2|
|CSS Stress Test and Performance Profiling - Tom's Hardware||2|
|GUIMark 2 HTML5 (3 Tests)||3|
|HTML5 Canvas Performance Test||2|
|Facebook JSGameBench v4.1||2|
|Mozilla WebGL FishIE||2|
|WebGL Solar System||2|
|RIABench Flash (5 Tests)||3|
|RIABench Java (5 Tests)||3|
|RIABench Silverlight (5 Tests)||3|
|Efficiency Benchmarks (4 Benchmarks/Tests)|
|Memory Usage: Single Tab||3|
|Memory Usage: 40 Tabs||3|
|Memory Management: -39 Tabs||3|
|Memory Management: -39 Tabs (extra 2 minutes)||3|
|Reliability Benchmarks (1 Test)|
|Proper Page Loads||3|
|Responsiveness Benchmarks (1 Test)|
|General Responsiveness Under Load||3|
|Security Benchmarks (1 Test)|
|Conformance Benchmarks (3 Benchmarks/Tests)|
|Peacekeeper 2.0 HTML5 Capabilities||1|
We restart the computer and allow it to idle before benchmarking. Most of our final scores are an average of several iterations. We run more iterations for tests that have short durations, lower scales, and/or higher variance. Any obvious outliers (usually caused by network hiccups) are discarded, and the affected runs are repeated.
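To make the averaging and retest procedure concrete, here is a minimal sketch of how a final score could be computed from several iterations. It is an illustration only: we identify outliers by eye, so the median-based threshold (`tolerance`), the retest cap (`max_retests`), and the function names `flag_outliers`, `final_score`, and `rerun` are all assumptions made for this example and are not part of our actual tooling.

```python
from statistics import mean, median


def flag_outliers(samples, tolerance=0.5):
    """Return indices of samples that deviate from the median by more
    than `tolerance` (a fraction of the median).

    The threshold rule is an assumption for illustration; the article
    only says that "obvious" outliers are removed.
    """
    mid = median(samples)
    limit = tolerance * abs(mid)
    return [i for i, s in enumerate(samples) if abs(s - mid) > limit]


def final_score(samples, rerun, tolerance=0.5, max_retests=3):
    """Average the iterations, replacing each flagged outlier with a
    fresh result from `rerun()` (a hypothetical callable that repeats
    the benchmark once and returns its result).
    """
    samples = list(samples)
    for _ in range(max_retests):
        bad = flag_outliers(samples, tolerance)
        if not bad:
            break
        for i in bad:
            # Discard the outlier and retest that iteration.
            samples[i] = rerun()
    return mean(samples)


if __name__ == "__main__":
    # Three page-load timings in seconds; the third is a made-up hiccup.
    timings = [1.0, 1.1, 5.0]
    print(final_score(timings, rerun=lambda: 1.2))
```

A relative threshold around the median is used here because a single bad run distorts the mean far more than the median. With only two or three iterations per test, any automatic rule is crude, which is why a manual check remains the practical approach.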
Individual detailed methodologies and information regarding composite scoring are described on the corresponding benchmark pages.