How We Test Smartphones And Tablets
Today, we outline the strict testing procedures used to obtain accurate data and discuss each test that we perform on smartphones and tablets.
Benchmark Suite
The mixture of synthetic and real-world tests we run is meant to give a comprehensive overview of a device’s performance. Synthetic tests, which are generally composed of several small, specialized blocks of code performing operations such as cryptography, file compression, matrix math, and alpha blending, are good at isolating the performance of different parts of a component’s architecture, including integer and floating-point math, specialized instructions, pixel shaders, and rasterization. With this information, we can compare individual hardware components such as SoCs (CPU A is faster than CPU B) or NAND flash. Because of their highly focused nature, however, it can be difficult to relate these results to the overall user experience; at best we can make generic statements that a device should be faster because its CPU wins certain benchmarks. Furthermore, synthetic tests are generally designed to push hardware to its limits, which is useful for determining maximum performance and spotting weak points in a design, but they do not represent real-world workloads.
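To illustrate what a synthetic test actually times, here is a minimal Python sketch of a micro-kernel in the alpha-blending vein. It is our own toy example, not code from any benchmark listed below, and the buffer size and iteration count are arbitrary:

```python
import time

def alpha_blend(src, dst, alpha):
    # Tight inner loop of the kind a synthetic test isolates:
    # blend two pixel buffers at a fixed opacity.
    return [int(alpha * s + (1.0 - alpha) * d) for s, d in zip(src, dst)]

def run_kernel(kernel, *args, iterations=10):
    # Time repeated runs and keep the best, mirroring how synthetic
    # suites report peak rather than typical performance.
    best = float("inf")
    for _ in range(iterations):
        start = time.perf_counter()
        kernel(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Score a 100,000-pixel blend pass (sizes here are arbitrary).
src = [200] * 100_000
dst = [50] * 100_000
print(f"Best time: {run_kernel(alpha_blend, src, dst, 0.5):.4f} s")
```

A real synthetic suite runs dozens of such kernels, each written to stress one unit of the architecture, and rolls the timings into a composite score.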
For this reason, we also try to include benchmarks that test macro-level activities you do every day, such as web browsing, composing text, editing photos, or watching a video. While these benchmarks are a better indicator of overall user experience, they are much more difficult to develop, leaving testers with few options on mobile platforms.
To truly understand the performance of a device, we need to test it at the component level and at the system level, we need to know its maximum performance and its performance in real-world scenarios, and we also need to spot deficiencies (thermal throttling) and anomalies (unsupported features). No single benchmark can do all of these things. There’s not even a single benchmark that can adequately test any one of these things (creating a good benchmark is extremely difficult and there are always compromises). This is why we run a whole suite of benchmarks, many of which have overlapping functionality.
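To make the throttling example concrete, the check can be as simple as repeating one workload back to back and watching the score decay as heat builds. The sketch below is our own illustration, not code from any suite we use, and it assumes a drop below 90% of the first-run score signals throttling:

```python
import time

def detect_throttling(benchmark, runs=20, tolerance=0.90):
    # Run the same workload back to back; under sustained load a
    # throttling device's scores sag as the SoC heats up.
    scores = []
    for _ in range(runs):
        start = time.perf_counter()
        benchmark()
        scores.append(1.0 / (time.perf_counter() - start))  # higher is better
    baseline = scores[0]  # first run starts from a cool device
    throttled = [i for i, s in enumerate(scores) if s < tolerance * baseline]
    return scores, throttled

def workload():
    # Stand-in compute kernel; a real check would rerun a benchmark scene.
    sum(x * x for x in range(200_000))

scores, throttled = detect_throttling(workload)
print(f"Runs below 90% of the cold-device score: {throttled}")
```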
By now it should be apparent that the benchmarks we use are not randomly selected. In addition to fulfilling the requirements above, our benchmark suite comes from experienced developers who are willing to openly discuss how their benchmarks work. We work closely with most of these developers, both to better understand the tests themselves and to give them feedback for improving their tests. The tables below list the benchmarks we currently use to test mobile devices.
Google Android
| Category | Benchmark | Version | Developer |
|---|---|---|---|
| CPU And System Performance | AndEBench Pro 2015 | 2.1.2472 | EEMBC |
| | Basemark OS II Full | 2.0 | Basemark Ltd |
| | Geekbench 3 | 3.3.1 | Primate Labs |
| | MobileXPRT 2013 | 1.0.92.1 | Principled Technologies |
| | PCMark | 1.1 | Futuremark |
| | TabletMark 2014 | 3.0.0.63 | BAPCo |
| | Browsermark | 2.1 | Basemark Ltd |
| | JSBench | 2013.1 | Purdue University |
| | Google Octane | 2.0 | Google |
| | Peacekeeper | - | Futuremark |
| GPU And Gaming Performance | 3DMark: Ice Storm Unlimited | 1.2 | Futuremark |
| | Basemark X | 1.1 | Basemark Ltd |
| | GFXBench 3 Corporate | 3.0.28 | Kishonti |
| | GFXBench 3.1 Corporate | 3.1.0 | Kishonti |
| | Basemark ES 3.1 | 1.0.2 | Basemark Ltd |
| Battery Life And Thermal Throttling | Basemark OS II Full | 2.0 | Basemark Ltd |
| | GFXBench 3 Corporate | 3.0.28 | Kishonti |
| | PCMark | 1.1 | Futuremark |
| | TabletMark 2014 | 3.0.0.63 | BAPCo |
Apple iOS
| Category | Benchmark | Version | Developer |
|---|---|---|---|
| CPU And System Performance | Basemark OS II Full | 2.0 | Basemark Ltd |
| | Geekbench 3 | 3.3.4 | Primate Labs |
| | TabletMark 2014 | 3.0.0.63 | BAPCo |
| | Browsermark | 2.1 | Basemark Ltd |
| | JSBench | 2013.1 | Purdue University |
| | Google Octane | 2.0 | Google |
| | Peacekeeper | - | Futuremark |
| GPU And Gaming Performance | 3DMark: Ice Storm Unlimited | 1.2 | Futuremark |
| | Basemark X | 1.1 | Basemark Ltd |
| | GFXBench 3 Corporate | 3.0.32 | Kishonti |
| | GFXBench 3.1 Corporate | 3.1.0 | Kishonti |
| | Basemark ES 3.1 | 1.0.2 | Basemark Ltd |
| Battery Life And Thermal Throttling | Basemark OS II Full | 2.0 | Basemark Ltd |
| | GFXBench 3 Corporate | 3.0.32 | Kishonti |
| | TabletMark 2014 | 3.0.0.63 | BAPCo |
Microsoft Windows Phone
| Category | Benchmark | Version | Developer |
|---|---|---|---|
| CPU And System Performance | Basemark OS II Full | 2.0 | Basemark Ltd |
| | Browsermark | 2.1 | Basemark Ltd |
| | JSBench | 2013.1 | Purdue University |
| | Google Octane | 2.0 | Google |
| | Peacekeeper | - | Futuremark |
| GPU And Gaming Performance | Basemark X | 1.1 | Basemark Ltd |
| | GFXBench 3 DirectX | 3.0.4 | Kishonti |
| Battery Life And Thermal Throttling | Basemark OS II Full | 2.0 | Basemark Ltd |
| | GFXBench 3 DirectX | 3.0.4 | Kishonti |
Comments
blackmagnum: Thank you for clearing this up, Matt. I am sure we readers will show our approval with our clicks and regular site visits.
falchard: My testing methods amount to looking for the Windows Phone and putting the trophy next to it.
WyomingKnott: It's called a phone. Did I miss something? Phones should be tested for call clarity, for volume and distortion, for call drops. This is a set of tests for a tablet.
MobileEditor (quoting WyomingKnott): "It's called a phone. Did I miss something? Phones should be tested for call clarity, for volume and distortion, for call drops. This is a set of tests for a tablet."
It's ironic that the base function of a smartphone is the one thing that we cannot test. There are simply too many variables in play: carrier, location, time of day, etc. I know other sites post recordings of call quality and bandwidth numbers in an attempt to make their reviews appear more substantial and "scientific." All they're really doing, however, is feeding their readers garbage data. Testing the same phone at the same location but at a different time of day will yield different numbers. And unless you work in the same building where they're performing these tests, how is this data remotely relevant to you?
In reality, only the companies designing the RF components and making the smartphones can afford the equipment and special facilities necessary to properly test wireless performance. This is the reason why none of the more reputable sites test these functions; we know it cannot be done right, and no data is better than misleading data.
Call clarity and distortion, for example, have a lot to do with the codec used to encode the voice traffic. Most carriers still use the old AMR codec, which is strictly a voice codec rather than an audio codec and is relatively low quality. Some carriers are rolling out AMR wideband (HD Voice), which improves call quality, but this is not a universal feature; even carriers that support it do not support it in all areas.
What about dropped calls? In my many years of using a cell phone, I can count on one hand the number of dropped calls I've had (that were not the result of driving into a tunnel or stepping into an elevator). How do we test something that occurs randomly and infrequently? And if we do get a dropped call, is it the phone's fault or the network's? With only the signal strength at the handset to go on, it's impossible to tell.
If there's one thing we like doing, it's testing stuff, but we're not going to do it if we cannot do it right.
- Matt Humrick, Mobile Editor, Tom's Hardware
WyomingKnott: The reply is much appreciated.
Not just Tom's (I like the site), but everyone has stopped rating phones on calls. It's been driving me nuts.
KenOlson: Matt,
First, I think your reviews are very well done!
Question: is there any way of testing cell phone low-signal performance?
To date I have not found any English-language reviews doing this.
Thanks,
Ken
MobileEditor (quoting KenOlson): "First, I think your reviews are very well done! Question: is there any way of testing cell phone low-signal performance?"
Thanks for the compliment :)
To test the low-signal performance of a phone, we would need control of both ends of the connection. For example, you could be sitting right next to a cell tower and have an excellent signal, but still have a very slow connection. The problem is that you're sharing access to the tower with everyone else in range, so you can have a strong signal yet poor performance because the tower is overloaded. Without control of the tower, we would have no idea whether the phone or the network is at fault.
You can test this yourself by finding a cell tower near a freeway off-ramp. Perform a speed test around 10 a.m. while sitting at the stoplight: you'll have five bars and get excellent throughput. Now do the same thing at 5 p.m. You'll still have five bars, but you'll probably be getting closer to dialup speeds, because the people in those hundreds of cars stopped on the freeway are all passing the time by talking, texting, browsing, and probably even watching videos.
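If you want to log the effect yourself, a trivially simple throughput probe is enough. Here's a minimal Python sketch; the URL is a placeholder for any large, consistently hosted file, and the 5 MB cap is arbitrary:

```python
import time
import urllib.request

# Placeholder URL; substitute any large, consistently hosted file.
TEST_URL = "https://example.com/100MB.bin"

def measure_throughput(url=TEST_URL, chunk=64 * 1024, limit=5 * 1024 * 1024):
    # Download up to `limit` bytes and return (local time, Mbit/s),
    # so runs at 10 a.m. and 5 p.m. can be compared directly.
    start = time.perf_counter()
    received = 0
    with urllib.request.urlopen(url) as resp:
        while received < limit:
            data = resp.read(chunk)
            if not data:
                break
            received += len(data)
    elapsed = time.perf_counter() - start
    mbps = received * 8 / elapsed / 1_000_000
    return time.strftime("%H:%M"), mbps

stamp, mbps = measure_throughput()
print(f"{stamp}: {mbps:.1f} Mbit/s")
```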
- Matt Humrick, Mobile Editor, Tom's Hardware