How We Test Smartphones And Tablets
Today, we outline the strict testing procedures used to obtain accurate data and discuss each test that we perform on smartphones and tablets.
Benchmark Suite
The mixture of synthetic and real-world tests we run is meant to give a comprehensive overview of a device’s performance. Synthetic tests, which are generally composed of several small, specialized blocks of code performing operations such as cryptography, file compression, matrix math, and alpha blending, are good at isolating the performance of different parts of a component’s architecture, including integer and floating-point math, specialized instructions, pixel shaders, and rasterization. With this information, we can compare individual hardware components such as SoCs (CPU A is faster than CPU B) or NAND flash. Because of their highly focused nature, however, it can be difficult to relate these results to the overall user experience; at best we can make generic statements that a device should be faster because its CPU wins certain benchmarks. Furthermore, synthetic tests are generally designed to push hardware to its limits, which is useful for determining maximum performance and spotting weak points in a design, but they do not represent real-world workloads.
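To illustrate what a synthetic test actually times, here is a minimal Python sketch of a micro-kernel in the alpha-blending vein. It is our own toy example, not code from any benchmark listed below, and the buffer size and iteration count are arbitrary:

```python
import time

def alpha_blend(src, dst, alpha):
    # Tight inner loop of the kind a synthetic test isolates:
    # blend two pixel buffers at a fixed opacity.
    return [int(alpha * s + (1.0 - alpha) * d) for s, d in zip(src, dst)]

def run_kernel(kernel, *args, iterations=10):
    # Time repeated runs and keep the best, mirroring how synthetic
    # suites report peak rather than typical performance.
    best = float("inf")
    for _ in range(iterations):
        start = time.perf_counter()
        kernel(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Score a 100,000-pixel blend pass (sizes here are arbitrary).
src = [200] * 100_000
dst = [50] * 100_000
print(f"Best time: {run_kernel(alpha_blend, src, dst, 0.5):.4f} s")
```

A real synthetic suite runs dozens of such kernels, each written to stress one unit of the architecture, and rolls the timings into a composite score.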
For this reason, we also try to include benchmarks that test macro-level activities you do every day, such as web browsing, composing text, editing photos, or watching a video. While these benchmarks are a better indicator of overall user experience, they are much more difficult to develop, leaving testers with few options on mobile platforms.
To truly understand the performance of a device, we need to test it at the component level and at the system level, we need to know its maximum performance and its performance in real-world scenarios, and we also need to spot deficiencies (thermal throttling) and anomalies (unsupported features). No single benchmark can do all of these things. There’s not even a single benchmark that can adequately test any one of these things (creating a good benchmark is extremely difficult and there are always compromises). This is why we run a whole suite of benchmarks, many of which have overlapping functionality.
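To make the throttling example concrete, the check can be as simple as repeating one workload back to back and watching the score decay as heat builds. The sketch below is our own illustration, not code from any suite we use, and it assumes a drop below 90% of the first-run score signals throttling:

```python
import time

def detect_throttling(benchmark, runs=20, tolerance=0.90):
    # Run the same workload back to back; under sustained load a
    # throttling device's scores sag as the SoC heats up.
    scores = []
    for _ in range(runs):
        start = time.perf_counter()
        benchmark()
        scores.append(1.0 / (time.perf_counter() - start))  # higher is better
    baseline = scores[0]  # first run starts from a cool device
    throttled = [i for i, s in enumerate(scores) if s < tolerance * baseline]
    return scores, throttled

def workload():
    # Stand-in compute kernel; a real check would rerun a benchmark scene.
    sum(x * x for x in range(200_000))

scores, throttled = detect_throttling(workload)
print(f"Runs below 90% of the cold-device score: {throttled}")
```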
By now it should be apparent that the benchmarks we use are not randomly selected. In addition to fulfilling the requirements above, our benchmark suite comes from experienced developers who are willing to openly discuss how their benchmarks work. We work closely with most of these developers, both to better understand the tests themselves and to give them feedback for improving their tests. The tables below list the benchmarks we currently use to test mobile devices.
Google Android
| Category | Benchmark | Version | Developer |
|---|---|---|---|
| CPU And System Performance | AndEBench Pro 2015 | 2.1.2472 | EEMBC |
| | Basemark OS II Full | 2.0 | Basemark Ltd |
| | Geekbench 3 | 3.3.1 | Primate Labs |
| | MobileXPRT 2013 | 1.0.92.1 | Principled Technologies |
| | PCMark | 1.1 | Futuremark |
| | TabletMark 2014 | 3.0.0.63 | BAPCo |
| | Browsermark | 2.1 | Basemark Ltd |
| | JSBench | 2013.1 | Purdue University |
| | Google Octane | 2.0 | Google |
| | Peacekeeper | - | Futuremark |
| GPU And Gaming Performance | 3DMark: Ice Storm Unlimited | 1.2 | Futuremark |
| | Basemark X | 1.1 | Basemark Ltd |
| | GFXBench 3 Corporate | 3.0.28 | Kishonti |
| | GFXBench 3.1 Corporate | 3.1.0 | Kishonti |
| | Basemark ES 3.1 | 1.0.2 | Basemark Ltd |
| Battery Life And Thermal Throttling | Basemark OS II Full | 2.0 | Basemark Ltd |
| | GFXBench 3 Corporate | 3.0.28 | Kishonti |
| | PCMark | 1.1 | Futuremark |
| | TabletMark 2014 | 3.0.0.63 | BAPCo |
Apple iOS
| Category | Benchmark | Version | Developer |
|---|---|---|---|
| CPU And System Performance | Basemark OS II Full | 2.0 | Basemark Ltd |
| | Geekbench 3 | 3.3.4 | Primate Labs |
| | TabletMark 2014 | 3.0.0.63 | BAPCo |
| | Browsermark | 2.1 | Basemark Ltd |
| | JSBench | 2013.1 | Purdue University |
| | Google Octane | 2.0 | Google |
| | Peacekeeper | - | Futuremark |
| GPU And Gaming Performance | 3DMark: Ice Storm Unlimited | 1.2 | Futuremark |
| | Basemark X | 1.1 | Basemark Ltd |
| | GFXBench 3 Corporate | 3.0.32 | Kishonti |
| | GFXBench 3.1 Corporate | 3.1.0 | Kishonti |
| | Basemark ES 3.1 | 1.0.2 | Basemark Ltd |
| Battery Life And Thermal Throttling | Basemark OS II Full | 2.0 | Basemark Ltd |
| | GFXBench 3 Corporate | 3.0.32 | Kishonti |
| | TabletMark 2014 | 3.0.0.63 | BAPCo |
Microsoft Windows Phone
| Category | Benchmark | Version | Developer |
|---|---|---|---|
| CPU And System Performance | Basemark OS II Full | 2.0 | Basemark Ltd |
| | Browsermark | 2.1 | Basemark Ltd |
| | JSBench | 2013.1 | Purdue University |
| | Google Octane | 2.0 | Google |
| | Peacekeeper | - | Futuremark |
| GPU And Gaming Performance | Basemark X | 1.1 | Basemark Ltd |
| | GFXBench 3 DirectX | 3.0.4 | Kishonti |
| Battery Life And Thermal Throttling | Basemark OS II Full | 2.0 | Basemark Ltd |
| | GFXBench 3 DirectX | 3.0.4 | Kishonti |
Comments
blackmagnum: Thank you for clearing this up, Matt. I am sure we readers will show our approval with our clicks and regular site visits.
falchard: My testing methods amount to looking for the Windows Phone and putting the trophy next to it.
WyomingKnott: It's called a phone. Did I miss something? Phones should be tested for call clarity, for volume and distortion, for call drops. This is a set of tests for a tablet.
MobileEditor (quoting WyomingKnott): "It's called a phone. Did I miss something? Phones should be tested for call clarity, for volume and distortion, for call drops. This is a set of tests for a tablet."
It's ironic that the base function of a smartphone is the one thing that we cannot test. There are simply too many variables in play: carrier, location, time of day, etc. I know other sites post recordings of call quality and bandwidth numbers in an attempt to make their reviews appear more substantial and "scientific." All they're really doing, however, is feeding their readers garbage data. Testing the same phone at the same location but at a different time of day will yield different numbers. And unless you work in the same building where they're performing these tests, how is this data remotely relevant to you?
In reality, only the companies designing the RF components and making the smartphones can afford the equipment and special facilities necessary to properly test wireless performance. This is the reason why none of the more reputable sites test these functions; we know it cannot be done right, and no data is better than misleading data.
Call clarity and distortion, for example, have a lot to do with the codec used to encode the voice traffic. Most carriers still use the old AMR codec, which is strictly a voice codec rather than an audio codec and is relatively low quality. Some carriers are rolling out AMR wideband (HD Voice), which improves call quality, but this is not a universal feature; even carriers that support it do not support it in all areas.
What about dropped calls? In my many years of using a cell phone, I can count on one hand the number of dropped calls I've had (that were not the result of driving into a tunnel or stepping into an elevator). How do we test something that occurs randomly and infrequently? And if we do get a dropped call, is it the phone's fault or the network's? With only the signal strength at the handset to go on, it's impossible to tell.
If there's one thing we like doing, it's testing stuff, but we're not going to do it if we cannot do it right.
- Matt Humrick, Mobile Editor, Tom's Hardware
WyomingKnott: The reply is much appreciated.
Not just Tom's (I like the site), but everyone has stopped rating phones on calls. It's been driving me nuts.
KenOlson: Matt,
First, I think your reviews are very well done!
Question: is there any way of testing cell phone low-signal performance?
To date I have not found any English-language reviews doing this.
Thanks,
Ken
MobileEditor (quoting KenOlson): "First, I think your reviews are very well done! Question: is there any way of testing cell phone low-signal performance?"
Thanks for the compliment :)
To test the low-signal performance of a phone, we would need control of both ends of the connection. For example, you could be sitting right next to a cell tower and have an excellent signal, but still have a very slow connection. The problem is that you're sharing access to the tower with everyone else in range, so you can have a strong signal yet poor performance because the tower is overloaded. Without control of the tower, we would have no idea whether the phone or the network is at fault.
You can test this yourself by finding a cell tower near a freeway off-ramp. Perform a speed test around 10 a.m. while sitting at the stoplight: you'll have five bars and get excellent throughput. Now do the same thing at 5 p.m. You'll still have five bars, but you'll probably be getting closer to dialup speeds, because the people in those hundreds of cars stopped on the freeway are all passing the time by talking, texting, browsing, and probably even watching videos.
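If you want to log the effect yourself, a trivially simple throughput probe is enough. Here's a minimal Python sketch; the URL is a placeholder for any large, consistently hosted file, and the 5 MB cap is arbitrary:

```python
import time
import urllib.request

# Placeholder URL; substitute any large, consistently hosted file.
TEST_URL = "https://example.com/100MB.bin"

def measure_throughput(url=TEST_URL, chunk=64 * 1024, limit=5 * 1024 * 1024):
    # Download up to `limit` bytes and return (local time, Mbit/s),
    # so runs at 10 a.m. and 5 p.m. can be compared directly.
    start = time.perf_counter()
    received = 0
    with urllib.request.urlopen(url) as resp:
        while received < limit:
            data = resp.read(chunk)
            if not data:
                break
            received += len(data)
    elapsed = time.perf_counter() - start
    mbps = received * 8 / elapsed / 1_000_000
    return time.strftime("%H:%M"), mbps

stamp, mbps = measure_throughput()
print(f"{stamp}: {mbps:.1f} Mbit/s")
```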
- Matt Humrick, Mobile Editor, Tom's Hardware