Test Suite And Methodology
Changes
The Start and Page Load test pages are updated to current versions, with the exception of the Tom's Hardware Wikipedia page, which has not changed since the last update. We say goodbye to YouTube, eBay, and The Huffington Post. In their place are an About.com page on barbeque, a randomly selected popular question on Ask.com, and my own LinkedIn profile page. The Google homepage is replaced with the search results page for "Tom's Hardware", and craigslist is now the "free stuff" results page for Los Angeles. Amazon remains as the Computer Parts & Components page, though we update it. And finally, the Yahoo homepage is updated and now serves as the single-tab test page in place of the old Google homepage.
We're introducing the remaining RIABench JavaScript, Flash, Java, and Silverlight tests. RIABench JavaScript consists of eight tests and Java of seven, while Flash and Silverlight each include all ten. The final RIABench scores are now the geometric mean of the individual test results, instead of simple averages.
CSS Stress Testing & Performance Profiling is now performed on the CSS version of the CSS3 Speed Test demo page. Microsoft's Maze Solver CSS3 benchmark is being retired in favor of the bookmarklet applied to the CSS3 version of the very same CSS3 Speed Test demo page.
We're also adding The CSS3 Test to our standards conformance tests. It replaces the HTML5 Capabilities section of Futuremark Peacekeeper 2.0. HTML5Test.com is now the sole HTML5 conformance test in the standards conformance composite, providing an even split between JavaScript, HTML5, and CSS3 in our final standards conformance grade.
Last but not least, the JavaScript Composite score is also being switched to geometric mean instead of the inverse averages we used in the previous installment.
Web Browser Grand Prix Test Suite v12
The table below lists all 34 benchmarks (consisting of 77 individual tests) currently in our suite, along with the version number and link (where applicable), and the number of iterations performed.
Benchmark Name | Iterations Performed |
---|---|
Performance Benchmarks (24 Benchmarks, 67 Tests) | |
Cold Startup Time: Single Tab | 3 |
Hot Startup Time: Single Tab | 3 |
Cold Startup Time: Eight Tabs | 3 |
Hot Startup Time: Eight Tabs | 3 |
Uncached Page Load Times (Eight Test Pages) | 5 |
Cached Page Load Times (Eight Test Pages) | 5 |
RIABench JavaScript (Eight Tests) | 3 |
Mozilla Kraken v1.1 | 2 |
Google SunSpider v0.9.1 Mod | 2 |
Futuremark Peacekeeper 2.0 | 2 |
Dromaeo DOM Core | 1 |
CSS Stress Test and Performance Profiling - CSS Speed Test | 2 |
CSS Stress Test and Performance Profiling - CSS3 Speed Test | 2 |
GUIMark 2 HTML5 (Three Tests) | 3 |
Asteroids HTML5 Canvas 2D And JavaScript | 2 |
HTML5 Canvas Performance Test | 2 |
Facebook JSGameBench v4.1 | 2 |
Psychedelic Browsing | 2 |
WebVizBench | 2 |
Mozilla WebGL FishIE | 2 |
WebGL Solar System | 2 |
RIABench Flash (10 Tests) | 3 |
RIABench Java (Seven Tests) | 3 |
RIABench Silverlight (10 Tests) | 3 |
Efficiency Benchmarks (Four Benchmarks/Tests) | |
Memory Usage: Single Tab | 3 |
Memory Usage: 40 Tabs | 3 |
Memory Management: -39 Tabs | 3 |
Memory Management: -39 Tabs (extra 2 minutes) | 3 |
Reliability Benchmarks (One Test) | |
Proper Page Loads | 3 |
Responsiveness Benchmarks (One Test) | |
General Responsiveness Under Load | 3 |
Security Benchmarks (One Test) | |
BrowserScope Security | 1 |
Conformance Benchmarks (Three Benchmarks/Tests) | |
Ecma Language test262 | 1 |
HTML5Test.com | 1 |
The CSS3 Test | 1 |
Methodology
We restart the computer and allow it to idle before benchmarking. Most individual benchmark final scores are an average of several iterations. More iterations are run for tests that have short durations, lower scales, and/or higher variance. Any obvious outliers (usually network hiccups) are removed and retested.
We switched most of the composite scores from arithmetic mean (average) to geometric mean in order to ensure that every test in each category is given equal weight, regardless of absolute value. The exceptions are the Standards Conformance grade and Memory Efficiency score, which are calculated differently, as well as the Reliability, Responsiveness, and Security composites, each of which contains only a single test.
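To illustrate why the geometric mean weights tests equally regardless of scale, here is a minimal Python sketch. The browser names and scores are made up for illustration and are not results from our suite, and the helper functions are our own, not part of any benchmark.

```python
import math


def arithmetic_mean(scores):
    """Simple average: tests with large absolute values dominate."""
    return sum(scores) / len(scores)


def geometric_mean(scores):
    """nth root of the product of n scores, computed with logarithms.

    All scores must be positive.
    """
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


# Hypothetical scores on two tests with very different scales
# (higher is better on both). Browser B is twice as fast on the
# small-scale test and half as fast on the large-scale test.
browser_a = [10.0, 1000.0]
browser_b = [20.0, 500.0]

for name, scores in (("Browser A", browser_a), ("Browser B", browser_b)):
    print(name,
          "arithmetic:", round(arithmetic_mean(scores), 2),
          "geometric:", round(geometric_mean(scores), 2))
```

Under the arithmetic mean, Browser A looks far ahead because the large-scale test swamps the small one. Under the geometric mean the two browsers tie, since a 2x gain on one test exactly offsets a 2x loss on the other, whatever the absolute values.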
Detailed individual methodologies and information regarding composite scoring are described on the corresponding benchmark pages.