How We Test Smartphones And Tablets

Testing Methodology

Collecting accurate, repeatable, and fair results from such a noisy environment requires a strict testing methodology. Here at Tom’s Hardware, we’ve used our knowledge and experience to develop a procedure that minimizes background tasks and strives to create a level playing field for all devices—as well as we can anyway, since there are some variables beyond our control.

Before discussing the details, however, we should answer a more basic question: Where do our review units come from? In some cases, we purchase the products ourselves, but most of the units we review are retail products provided by OEMs.

While Tom’s Hardware attracts readers from all over the world, the main site is based in the United States (there are other sites in the Tom’s family focusing on different regions). Therefore, the devices we test are models intended for sale in the North American market. The increasing importance of markets outside of the US, however, means many OEMs are launching products in these other regions first. Because of the media’s unhealthy obsession with being first to post, many tech news sites now review international or even pre-production units running different software and often exhibiting different performance characteristics than the North American retail models. Many of these sites do not even disclose this. We feel this potentially misleading practice is not in our readers’ best interest. If we do test an international or pre-production unit outside of a full review, we will always disclose this within the article.

Configuration

So, after acquiring our North American retail units, what’s next? The first thing we do is install operating system and software updates. Next, we perform a factory reset to make sure we’re starting from a clean slate. The final step in preparing a device for testing involves diving into the settings menu. We’ll spare you our full list of configuration settings (we go through every possible menu), which are meant to minimize background tasks and keep comparisons as fair as possible, and just show you the most important ones in the table below.

Setting | State | Notes

Wireless & Data
Airplane Mode | On | This is an uncontrollable variable because signal strength (and thus power draw) varies by location, carrier, time of day, etc. The cellular radio is powered down to keep device testing fair.
Wi-Fi | On |
Bluetooth | Off |
NFC | On |
Location Services | Off | Reduces background activity.
Data Collection | Off | Options for improving customer experience and sending diagnostic data, usage statistics, etc. to Google, OEMs, or cellular providers are disabled to reduce background activity.

Display
Auto or Adaptive Brightness | Off |
Display Brightness | 200 nits | The screens are calibrated to 200 nits, keeping results comparable between devices.
Special display settings | Off |
Screen Mode | | Set to basic, native, standard, sRGB, or device default. When testing the display, each mode is tested separately.
Wallpaper | Default |

Battery
Battery saving modes | Off | Devices implement different techniques. Turning these off shows battery life without sacrificing performance.
Turn on automatically | Never |

User Accounts
Google, iCloud, Facebook, Twitter, etc. | Inactive | All cloud-based accounts (or any account that accesses the internet in the background) are deleted after initial setup. This reduces background activity that interferes with measurements. The only exception is the Microsoft account for Windows Phone, which cannot be removed.
Auto-sync Data | Off |

In order to keep testing fair and results comparable, we strive to make each device’s configuration as similar as possible. Due to differences in operating systems and features, however, there will always be some small differences. In these situations, we leave the settings at their default values.
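
On Android devices, much of this configuration can be applied from a PC over adb, which makes it easy to reapply after every factory reset. The snippet below is only a rough sketch of the idea, not our actual tooling: the settings keys shown vary between Android versions and devices, some require newer builds or elevated permissions, and the raw brightness value that corresponds to 200 nits must be measured per panel with a light meter (128 is just a placeholder).

```python
import subprocess

def adb(*args: str) -> str:
    """Run a command on the attached device via adb shell and return its output."""
    result = subprocess.run(["adb", "shell", *args],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Disable adaptive brightness and pin the backlight to a fixed level.
# The 0-255 value that produces 200 nits differs for every panel.
adb("settings", "put", "system", "screen_brightness_mode", "0")
adb("settings", "put", "system", "screen_brightness", "128")  # placeholder value

# Reduce background activity; key names are illustrative and version dependent.
adb("settings", "put", "secure", "location_mode", "0")

# Verify the state before testing begins.
print("brightness:", adb("settings", "get", "system", "screen_brightness"))
```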

Devices may also contain pre-installed software from cellular providers and OEMs, introducing more variability. When present, we do not remove or disable this software—it’s usually not possible and the average user will likely leave it running anyway. To help mitigate this issue, we try to get either unlocked devices or devices from carriers with the least amount of “bloatware.”

Testing Procedure

A consistent testing procedure is just as important to the data collection process as our device configuration profile. Since power level and temperature both affect a device’s performance, tests are performed in a controlled manner that accounts for these factors. Below are the main points of our testing procedure (a rough sketch of the pre-test checks follows the list):

  • The ambient room temperature is kept between 70 °F (21 °C) and 80 °F (26.5 °C). We do not actively cool the devices during testing. While this would further reduce the possibility of thermal throttling affecting the results, it’s not a realistic condition. After all, none of us carry around bags of ice, fans, or thermoelectric coolers in our pockets to cool our devices.
  • Smartphones lie flat on a wood table (screen facing up) during testing, with tests conducted in portrait mode unless forced to run landscape by the app. Tablets are propped up in a holder in landscape mode, so that the entire backside of the device is exposed to air. This is to better simulate real-world usage.
  • Devices are allowed to sit for a specified length of time after they are turned on to allow initial network syncing and background tasks to complete before starting a test.
  • Devices are not touched or moved while tests are running.
  • Devices are allowed to cool between test runs so that subsequent tests are not affected by heat buildup.
  • All tests are performed while running on battery power. The battery charge level is not allowed to drop below a specific value while performing any performance measurement other than battery life.
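
To make the cooldown and battery-level rules concrete, here is a minimal sketch of the kind of pre-test gate that can sit in front of a benchmark run, assuming an Android device reachable over adb. The 40 °C and 30% thresholds are illustrative assumptions rather than our exact values; `dumpsys battery` reports temperature in tenths of a degree Celsius.

```python
import re
import subprocess
import time

MIN_BATTERY_PERCENT = 30   # assumed floor for any test other than battery life
MAX_BATTERY_TEMP_C = 40.0  # assumed "cooled down" threshold between runs

def battery_stats() -> tuple[int, float]:
    """Read charge level (%) and battery temperature (deg C) from dumpsys."""
    out = subprocess.run(["adb", "shell", "dumpsys", "battery"],
                         capture_output=True, text=True, check=True).stdout
    level = int(re.search(r"level: (\d+)", out).group(1))
    temp = int(re.search(r"temperature: (\d+)", out).group(1)) / 10.0
    return level, temp

def wait_until_ready(poll_seconds: int = 60) -> None:
    """Block until the device is cool enough and sufficiently charged."""
    while True:
        level, temp = battery_stats()
        if level >= MIN_BATTERY_PERCENT and temp <= MAX_BATTERY_TEMP_C:
            return
        print(f"waiting: battery {level}%, {temp:.1f} C")
        time.sleep(poll_seconds)

wait_until_ready()
# ...launch the benchmark run here...
```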

Benchmarks are run at least two times and the results are averaged to get a final score. The minimum and maximum values from each benchmark run must not vary from the computed average value by more than 5%. If the variance threshold is exceeded, all of the benchmark scores for that run are discarded and the benchmark is run again. This ensures that the occasional outlier caused by random background tasks, cosmic rays, spooky quantum effects, interference from technologically advanced aliens, disturbances in the space-time continuum, or other unexplainable phenomena does not skew the final results.
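
In code, this rule reduces to a simple stability check on the collected scores. The sketch below assumes a hypothetical run_benchmark callable that launches the app and returns a single score; everything else follows the description above.

```python
def scores_are_stable(scores: list[float], tolerance: float = 0.05) -> bool:
    """True if every score lies within +/- 5% of the computed average."""
    average = sum(scores) / len(scores)
    return all(abs(score - average) <= tolerance * average for score in scores)

def final_score(run_benchmark, runs: int = 2, max_attempts: int = 5) -> float:
    """Average of `runs` scores, discarding and repeating the batch on any outlier."""
    for _ in range(max_attempts):
        scores = [run_benchmark() for _ in range(runs)]
        if scores_are_stable(scores):
            return sum(scores) / len(scores)
        # Variance threshold exceeded: throw away the whole batch and rerun.
    raise RuntimeError("benchmark never stabilized within the variance threshold")
```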

Our testing procedure also includes several methods for detecting benchmark cheats.

  • blackmagnum
    Thank you for clearing this up, Matt. I am sure us readers will show approval with our clicks and regular site visits.
  • falchard
    My testing methods amount to looking for the Windows Phone and putting the trophy next to it.
  • WyomingKnott
    It's called a phone. Did I miss something? Phones should be tested for call clarity, for volume and distortion, for call drops. This is a set of tests for a tablet.
  • MobileEditor
    It's called a phone. Did I miss something? Phones should be tested for call clarity, for volume and distortion, for call drops. This is a set of tests for a tablet.

    It's ironic that the base function of a smartphone is the one thing that we cannot test. There are simply too many variables in play: carrier, location, time of day, etc. I know other sites post recordings of call quality and bandwidth numbers in an attempt to make their reviews appear more substantial and "scientific." All they're really doing, however, is feeding their readers garbage data. Testing the same phone at the same location but at a different time of day will yield different numbers. And unless you work in the same building where they're performing these tests, how is this data remotely relevant to you?

    In reality, only the companies designing the RF components and making the smartphones can afford the equipment and special facilities necessary to properly test wireless performance. This is the reason why none of the more reputable sites test these functions; we know it cannot be done right, and no data is better than misleading data.

    Call clarity and distortion, for example, have a lot to do with the codec used to encode the voice traffic. Most carriers still use the old AMR codec, which is strictly a voice codec rather than an audio codec, and is relatively low quality. Some carriers are rolling out AMR wide-band (HD-Voice), which improves call quality, but this is not a universal feature. Even carriers that support it do not support it in all areas.

    What about dropped calls? In my many years of using a cell phone, I can count the number of dropped calls I’ve had on one hand (that were not the result of driving into a tunnel or stepping into an elevator). How do we test something that occurs randomly and infrequently? If we do get a dropped call, is it the phone’s fault or the network’s? With only signal strength at the handset, it’s impossible to tell.

    If there's one thing we like doing, it's testing stuff, but we're not going to do it if we cannot do it right.

    - Matt Humrick, Mobile Editor, Tom's Hardware
  • WyomingKnott
    The reply is much appreciated.

    Not just Tom's (I like the site), but everyone has stopped rating phones on calls. It's been driving me nuts.
  • KenOlson
    Matt,

    1st I think your reviews are very well done!

    Question: is there anyway of testing cell phone low signal performance?

    To date I have not found any English speaking reviews doing this.

    Thanks

    Ken
  • MobileEditor
    1st I think your reviews are very well done!

    Question: is there anyway of testing cell phone low signal performance?

    Thanks for the compliment :)

    In order to test the low signal performance of a phone, we would need control of both ends of the connection. For example, you could be sitting right next to the cell tower and have an excellent signal, but still have a very slow connection. The problem is that you're sharing access to the tower with everyone else who's in range. So you can have a strong signal, but poor performance because the tower is overloaded. Without control of the tower, we would have no idea if the phone or the network is at fault.

    You can test this yourself by finding a cell tower near a freeway off-ramp. Perform a speed test around 10am while sitting at the stoplight. You'll have five bars and get excellent throughput. Now do the same thing at 5pm. You'll still have five bars, but you'll probably be getting closer to dialup speeds. The reason being that the people in those hundreds of cars stopped on the freeway are all passing the time by talking, texting, browsing, and probably even watching videos.

    - Matt Humrick, Mobile Editor, Tom's Hardware