How We Test Smartphones And Tablets

Battery Life

The defining feature of a mobile device is its ability to function untethered, with no physical connection required for data or power, making battery life a critical performance metric. Knowing which device will last the longest on a full charge is complicated, however, since it’s affected by many different factors, some of which can be gleaned from a spec sheet and some only through testing.

The battery’s storage capacity plays the biggest role, since this limits the total amount of energy available to the system. With only minimal gains in energy density, the only reasonable way to increase capacity is by increasing size. A bigger battery, however, means a bigger and heavier device, so compromises must be made.

The other half of the battery life story is power consumption, which is influenced by hardware, software, and, ultimately, how you actually use your device. From a hardware perspective, a number of different components drain power, including wireless radios, cameras, speakers, and sensors, but the two biggest culprits are the screen and SoC. Screen size and resolution (more pixels generally use more power), panel technology (AMOLED uses less power than LCD to display black), and panel self-refresh (local RAM caches the frame buffer so the GPU and memory bus are not required for static images) all influence display power. The display’s brightness level also affects battery life, which is why we calibrate all screens to 200 nits, removing this variable from our results. The SoC’s power consumption is influenced by process technology, the number and type of transistors, power gating, and maximum core frequencies. Dynamic voltage and frequency scaling (DVFS), a system of software drivers that adjusts core and bus frequencies, also has a significant impact on battery life.
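
On most Android devices (which run a Linux kernel), DVFS state is visible through the kernel's cpufreq interface in sysfs. The following is a minimal Python sketch, assuming shell access to a device that exposes the standard cpufreq entries; it is illustrative only, not part of our test suite.

    # Read standard cpufreq sysfs entries to observe DVFS state (Linux/Android).
    # Requires a shell on the device; entry availability varies by kernel/vendor.
    from pathlib import Path

    CPUFREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq")

    def read_entry(name: str) -> str:
        return (CPUFREQ / name).read_text().strip()

    print("governor:", read_entry("scaling_governor"))
    print("current frequency (kHz):", read_entry("scaling_cur_freq"))
    print("hardware range (kHz):", read_entry("cpuinfo_min_freq"),
          "-", read_entry("cpuinfo_max_freq"))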

Designing battery life tests that account for all of a system’s hardware and software influences is difficult enough without considering different usage scenarios. Do you use your phone to play 3D games or to just occasionally check email? Is the screen powered on for hours at a time or just a few minutes several times a day? Do you get a flood of notifications that keep turning the screen on or have apps constantly running in the background? Since everyone uses their devices differently, we cannot tell you how long your battery will last. Instead, we run some worst-case tests to put a lower bound on battery life, and another test modeled after more real-world usage.

PCMark

PCMark measures system-level performance and battery life by running real-world workloads. The battery life test starts with a full charge and loops the Work performance benchmark (see description in the CPU and System Performance section) until the battery charge reaches 20%. The reported battery life figure then extrapolates the measured 100%-to-20% runtime to a 95% duty cycle (100% to 5% charge remaining). In addition to showing the battery life in minutes, the overall work performance score (the same value reported in the CPU and System Performance section) is shown again for reference.
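
The scaling itself is simple proportion, assuming the battery drains at a roughly constant rate. A sketch of the arithmetic (our reconstruction from the numbers above; not published PCMark code):

    def extrapolate_battery_life(measured_minutes: float) -> float:
        # Scale the measured 100%-to-20% runtime to the reported 100%-to-5%
        # estimate, assuming a constant drain rate (our reconstruction, not
        # published PCMark code).
        measured_span = 100 - 20   # percentage points actually tested
        reported_span = 100 - 5    # the 95% duty cycle used for reporting
        return measured_minutes * reported_span / measured_span

    # Example: 400 minutes measured from 100% to 20% is reported as
    # 400 * 95 / 80 = 475 minutes.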

As a system-level test, the power consumption of the CPU, GPU, RAM, and screen all factor into the final battery life number. By running realistic workloads, the DVFS functions just as it would when running common apps, providing a more accurate representation of battery life.

TabletMark 2014

This benchmark is similar to PCMark in that it measures system-level performance and battery life by running real-world workloads. Its 7-inch or larger screen requirement limits it to tablets only, however.

The battery life test uses three different scenarios: Web and Email and Photo and Video Sharing, both of which are explained in the CPU and System Performance section, and Video Playback, which loops three one-minute, 1080p H.264 video clips (~60 MB each) three times for nine minutes of total playback time.

The device starts with a full charge and loops the following 50-minute script, sketched in code after the list, until the battery dies:

  • Web and Email workload
  • Idle for 3 minutes (screen on)
  • Photo and Video Sharing workload
  • Idle for 3 minutes (screen on)
  • Video Playback workload for 9 minutes
  • Idle to the 50-minute mark (~8 minutes)
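
Expressed as Python-style pseudocode, one iteration of the script looks like this; the workload callables are hypothetical stand-ins for the benchmark's internal scenarios, and only the timings come from the description above.

    import time

    def run_rundown_iteration(web_and_email, photo_and_video_sharing,
                              video_playback, idle):
        # One 50-minute iteration of the rundown script. The four callables
        # are hypothetical stand-ins; only the timings are documented.
        start = time.monotonic()
        web_and_email()
        idle(minutes=3)                 # screen stays on
        photo_and_video_sharing()
        idle(minutes=3)                 # screen stays on
        video_playback(minutes=9)
        # Idle out the remainder so every iteration spans exactly 50 minutes.
        elapsed_min = (time.monotonic() - start) / 60
        idle(minutes=max(0.0, 50 - elapsed_min))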

Basemark OS II: Battery

Basemark OS II by Basemark Ltd. includes a battery rundown test in addition to the performance tests discussed earlier in the CPU and System Performance section. The largely synthetic battery test runs a multi-core workload similar to the CPU performance test, and provides a worst-case battery life primarily based on CPU, memory, and display power consumption.

The test tracks the battery percentage consumed per minute, calculating drain-rate ratios and their standard deviations. The final score is based on the arithmetic average of these values plus a bonus score based on CPU usage.
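
Basemark does not document the exact formula, but per-minute drain statistics of the kind it describes can be computed from periodic battery readings, roughly as follows (a reconstruction, not the benchmark's actual code):

    import statistics

    def drain_rate_stats(samples):
        # samples: chronological (minutes_elapsed, battery_pct) readings.
        # Returns the mean and standard deviation of the per-minute drain
        # rate, the kind of values the score is built from (reconstruction).
        rates = [(p0 - p1) / (t1 - t0)
                 for (t0, p0), (t1, p1) in zip(samples, samples[1:])]
        return statistics.mean(rates), statistics.stdev(rates)

    # Example: readings taken every ten minutes during the workload.
    mean_rate, stdev_rate = drain_rate_stats([(0, 100), (10, 93), (20, 85), (30, 78)])
    print(f"{mean_rate:.2f} %/min (stdev {stdev_rate:.2f})")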

GFXBench 3.0: Battery Life

The GFXBench battery life test loops the T-Rex game simulation benchmark (detailed in the GPU and Gaming Performance section) continuously for 60 minutes, starting from a full 100% charge. This provides a worst-case battery life based primarily on GPU, memory, and display power consumption, and is indicative of what you might see while playing an intense 3D game.
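
The extrapolation is again proportional: whatever fraction of the battery the 60-minute run consumes sets the projected full-charge runtime. A sketch, assuming linear discharge (our reading of the benchmark's "extrapolated" result, not Kishonti's published method):

    def gfxbench_battery_estimate(pct_drained: float, run_minutes: float = 60) -> float:
        # Project full-charge runtime from a fixed-length rundown, assuming
        # a linear discharge rate (our assumption, not a documented formula).
        return run_minutes * 100.0 / pct_drained

    # Example: dropping from 100% to 70% during the 60-minute T-Rex loop
    # extrapolates to 60 * 100 / 30 = 200 minutes.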

Test results are displayed in two different charts: the extrapolated battery life in minutes and the average performance during the test in frames per second. It’s important to see both charts, because looking at only the battery life chart can be misleading; thermal throttling will cause the GPU to run at a lower frequency, leading to better battery life but lower performance.

Running at a higher clock frequency requires a higher voltage, which generates more heat. This heat moves from the core to the SoC package and, eventually, finds its way to the device’s chassis, where it dissipates into the surrounding environment. If the cores produce heat faster than it can be dissipated, or the external chassis reaches a temperature that makes it uncomfortable to hold, the system reduces clock frequency to satisfy its thermal constraints. This is what we mean by thermal throttling. Because it reduces performance and can negatively affect the user experience, it’s something our performance testing needs to account for.
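
As a concrete illustration of that feedback loop, here is a toy throttling policy in Python. Real governors are kernel drivers with vendor-tuned trip points; every constant below is illustrative, not taken from any actual device.

    def next_frequency(temp_c: float, freq_mhz: int) -> int:
        # Toy thermal governor: shed frequency while over the trip point,
        # recover it while under. All constants are illustrative.
        TRIP_C, STEP_MHZ = 45.0, 100
        MIN_MHZ, MAX_MHZ = 300, 2400
        if temp_c > TRIP_C:
            return max(freq_mhz - STEP_MHZ, MIN_MHZ)   # throttle down
        return min(freq_mhz + STEP_MHZ, MAX_MHZ)       # ramp back up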

  • blackmagnum
    Thank you for clearing this up, Matt. I am sure us readers will show approval with our clicks and regular site visits.
  • falchard
    My testing methods amount to looking for the Windows Phone and putting the trophy next to it.
  • WyomingKnott
    It's called a phone. Did I miss something? Phones should be tested for call clarity, for volume and distortion, for call drops. This is a set of tests for a tablet.
  • MobileEditor
    Quoting WyomingKnott: "It's called a phone. Did I miss something? Phones should be tested for call clarity, for volume and distortion, for call drops. This is a set of tests for a tablet."

    It's ironic that the base function of a smartphone is the one thing that we cannot test. There are simply too many variables in play: carrier, location, time of day, etc. I know other sites post recordings of call quality and bandwidth numbers in an attempt to make their reviews appear more substantial and "scientific." All they're really doing, however, is feeding their readers garbage data. Testing the same phone at the same location but at a different time of day will yield different numbers. And unless you work in the same building where they're performing these tests, how is this data remotely relevant to you?

    In reality, only the companies designing the RF components and making the smartphones can afford the equipment and special facilities necessary to properly test wireless performance. This is the reason why none of the more reputable sites test these functions; we know it cannot be done right, and no data is better than misleading data.

    Call clarity and distortion, for example, have a lot to do with the codec used to encode the voice traffic. Most carriers still use the old AMR codec, which is strictly a voice codec rather than an audio codec, and is relatively low quality. Some carriers are rolling out AMR wideband (HD Voice), which improves call quality, but this is not a universal feature. Even carriers that support it do not support it in all areas.

    What about dropped calls? In my many years of using a cell phone, I can count the number of dropped calls I've had on one hand (that were not the result of driving into a tunnel or stepping into an elevator). How do we test something that occurs randomly and infrequently? If we do get a dropped call, is it the phone's fault or the network's? With only signal strength at the handset, it's impossible to tell.

    If there's one thing we like doing, it's testing stuff, but we're not going to do it if we cannot do it right.

    - Matt Humrick, Mobile Editor, Tom's Hardware
  • WyomingKnott
    The reply is much appreciated.

    Not just Tom's (I like the site), but everyone has stopped rating phones on calls. It's been driving me nuts.
  • KenOlson
    Matt,

    1st I think your reviews are very well done!

    Question: is there any way of testing cell phone low-signal performance?

    To date I have not found any English speaking reviews doing this.

    Thanks

    Ken
  • MobileEditor
    Quoting KenOlson: "1st I think your reviews are very well done! Question: is there any way of testing cell phone low-signal performance?"

    Thanks for the compliment :)

    In order to test the low signal performance of a phone, we would need control of both ends of the connection. For example, you could be sitting right next to the cell tower and have an excellent signal, but still have a very slow connection. The problem is that you're sharing access to the tower with everyone else who's in range. So you can have a strong signal, but poor performance because the tower is overloaded. Without control of the tower, we would have no idea if the phone or the network is at fault.

    You can test this yourself by finding a cell tower near a freeway off-ramp. Perform a speed test around 10am while sitting at the stoplight. You'll have five bars and get excellent throughput. Now do the same thing at 5pm. You'll still have five bars, but you'll probably be getting closer to dialup speeds. The reason is that the people in those hundreds of cars stopped on the freeway are all passing the time by talking, texting, browsing, and probably even watching videos.

    - Matt Humrick, Mobile Editor, Tom's Hardware