Not to put too fine a point on it, but the future of Google’s Tango AR smartphone capabilities rests squarely on the shoulders of Lenovo and its Phab 2 Pro smartphone (well, and a little bit on Asus and its ZenFone AR). The Phab 2 Pro is presently the only Tango-enabled device on the market (the ZenFone AR is not yet available). It’s unclear exactly why that is, but so far Lenovo and Asus are the only device makers bold enough to release a consumer device equipped with these particular augmented reality capabilities. They deserve as much acclaim for that as they bear the responsibility of not screwing it all up.
We spent a stretch with the Phab 2 Pro, which we detail below, but to spoil it up front: Tango on this (or possibly any) smartphone is a half measure. But it’s a necessary half measure.
Augmented reality and mixed reality hold immense promise for consumers, and Tango is one of the more promising projects aimed at delivering it--if for no other reason than you can buy it today. Having it on a smartphone is a tremendous step forward, but, to be blunt, when it comes to AR/MR, a smartphone is a terrible viewer and a worse user interface.
What Is Tango?
Tango is computer vision technology that creates augmented reality experiences. It has three pillars: motion tracking, area learning, and depth perception. The way that Tango knits these pieces together into an augmented reality quilt is both fascinating and a bit tedious-sounding, with a number of components that have to play nicely together.
The motion tracking is handled by a wide-angle fisheye camera, an accelerometer, and a gyroscope. The camera captures live images from which the software can “see” things like edges and corners. It compares those landmarks frame to frame to frame and can do the math to suss out how far the camera moves between each frame. Then the accelerometer and gyroscope step in to do their thing: measuring how fast the device is moving and how it’s oriented in space. (Note well that all three of those items already exist on essentially all mid-range and higher smartphones.)
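To make the frame-to-frame comparison concrete, here is a minimal sketch, not Tango's actual implementation. It assumes the landmarks have already been detected and matched between two frames (the hypothetical `prev_pts` and `curr_pts` lists of pixel coordinates are made up for illustration) and estimates the apparent image shift as the median displacement, so that a few bad matches don't skew the result.

```python
from statistics import median


def estimate_image_shift(prev_pts, curr_pts):
    """Estimate how far matched landmarks moved between two frames, in pixels."""
    if len(prev_pts) != len(curr_pts) or not prev_pts:
        raise ValueError("need equal-length, non-empty lists of matched points")
    # Per-landmark displacement along each image axis.
    dx = [c[0] - p[0] for p, c in zip(prev_pts, curr_pts)]
    dy = [c[1] - p[1] for p, c in zip(prev_pts, curr_pts)]
    # The median ignores a minority of wildly wrong matches.
    return median(dx), median(dy)


# Hypothetical matched corners; the last pair is a deliberately bad match.
prev_pts = [(100, 50), (220, 80), (310, 200), (40, 160)]
curr_pts = [(95, 52), (215, 82), (305, 202), (90, 10)]
shift = estimate_image_shift(prev_pts, curr_pts)
print(shift)
```

Turning a pixel shift like this into a physical distance takes more information, such as camera calibration and scene depth, which this sketch deliberately leaves out.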
The area learning component is really just a logical extension of motion tracking; it’s another way of saying that the software “remembers” what the motion tracking components “see,” both in the immediate and in the long term. That is to say, you can quickly scan a room, and the device will “know it” long enough for you to play your game or what have you, but it can also store that information and recall it the next time you visit that same room. It could be 10 minutes later or a year later. If anything has changed in the room, Tango can combine stored area learning data with new, real-time motion tracking data to achieve greater accuracy.
Think of it like the connection between your eyes (motion tracking) and your brain (area learning): You walk into a room, and your eyes see all the objects therein and note how far away the walls and ceiling are. As you move about the room, you maintain this sense of the location of the things you’ve already seen in the room, even if you’re no longer looking at them. You have learned the area. If you return to the same room later, and nothing has been moved, you could close your eyes and navigate it safely, because your brain will recall that stored area learning data. If you return to the same room later, and a bunch of things have been moved, added, or removed, your previous knowledge of the space will combine with the new data your eyeballs acquire so you can stride confidently into the space without smacking into anything.
But if your brain never jotted down that data in the first place, you would forget what was in the room as soon as you stopped looking at those objects. In the same way, motion tracking without area learning is sorely lacking.
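The "remembering" idea can be sketched in a few lines. This is an illustrative toy, not how Tango stores or matches its area data: the landmark names, the `relocalize` function, and the `tolerance` value are all hypothetical, and it assumes that stored and fresh coordinates are already expressed in the same room frame, something a real system has to work out for itself.

```python
def relocalize(stored_map, observed, tolerance=0.25):
    """Match freshly observed landmarks against a stored map of a room.

    Returns (matches, new_landmarks): the names of stored landmarks seen
    again within `tolerance` metres, and observations that match nothing.
    """
    matches, new_landmarks = [], []
    for point in observed:
        best_name, best_dist = None, tolerance
        for name, saved in stored_map.items():
            # Straight-line distance between the observation and the saved spot.
            dist = sum((a - b) ** 2 for a, b in zip(point, saved)) ** 0.5
            if dist <= best_dist:
                best_name, best_dist = name, dist
        if best_name is None:
            new_landmarks.append(point)  # something changed in the room
        else:
            matches.append(best_name)    # recognized from the earlier visit
    return matches, new_landmarks


# Hypothetical map saved on a previous visit, plus two fresh observations.
stored_map = {"door_corner": (0.0, 0.0, 2.0), "window_edge": (3.0, 1.0, 1.5)}
observed = [(0.1, 0.0, 2.0), (1.5, 0.5, 0.0)]
matches, new_landmarks = relocalize(stored_map, observed)
print(matches, new_landmarks)
```

Recognized landmarks anchor the device to what it learned before, while the unmatched ones are the "things that have been moved" that get folded in as new data.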
The third piece of the Tango puzzle is depth perception, and this part is key to unlocking most of the AR capabilities of the device. Motion tracking is, in a way, almost like making assumptions about the space and where the device is within it. It’s grabbing an edge and then measuring relative distances by comparing frames. A depth sensor, though, actually measures where everything is in relation to the device with a time-of-flight IR emitter. The infrared light bounces off of everything it can find; then it bounces back to the device, laden with information; then in a flash (Lenovo said “a few nanoseconds” in its materials), that data is measured, and an RGB camera creates a stereo image. Poof, depth perception. Once the device has that, it knows where to place virtual objects in space and can also enable them to interact with real objects and spaces.
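The arithmetic behind time of flight is simple enough to sketch. The function names here are our own, and real sensors often infer the delay indirectly (for example, from the phase shift of modulated light), so treat this as the underlying principle, not a description of the Phab 2 Pro's sensor.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def tof_distance_m(round_trip_seconds):
    """Distance to a surface, given the round-trip time of a light pulse."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    # The light travels out and back, so halve the total path length.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2


def tof_round_trip_ns(distance_m):
    """Round-trip time, in nanoseconds, for a surface `distance_m` away."""
    return 2 * distance_m / SPEED_OF_LIGHT_M_PER_S * 1e9


wall_ns = tof_round_trip_ns(1.0)  # a wall one metre from the phone
print(f"{wall_ns:.2f} ns")
```

Light returning from a wall one metre away arrives in under 7 nanoseconds, which squares with Lenovo's "a few nanoseconds" for room-scale distances.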
Lenovo noted that it had to get creative to fit all of the above into the small (well, relatively small) smartphone form factor. It partnered with Infineon and PMD to make a custom depth sensor that put the IR emitter and RGB camera onto one unit to save space, and in order to get all three camera modules to fit within the Phab 2 Pro’s z-height, Lenovo had to nestle them in holes in the PCB.
In its materials, the company also stated that solving the balance issues of having so many modules in a device was difficult. (Though we would assert that, although the balance is by no means terrible, “solved” might be a strong word.)
Lenovo further described some of the challenges it had to overcome in designing and building the Phab 2 Pro:
Naturally, with so many sensors operating simultaneously, the challenges included mitigating RF interference, optimizing power consumption, and improving heat dissipation. The Phab 2 Pro has a custom thermal pipe that dissipates heat away from the processor. Lenovo also added thermal pastes between the shielding cover and heat pipe to drive dissipation. Other materials like graphite, aluminum, and copper were used to manage efficient heat dissipation throughout the phone.
Although certain components must be present on any Tango-enabled device for the technology to work, there appears to be no standard way to arrange them. Note, for example, the back of the Phab 2 Pro compared to the Asus ZenFone AR (below). The two manufacturers took quite different approaches to placement. On the Phab 2 Pro, the whole assembly is in a vertical alignment, whereas the ZenFone AR has the camera modules aligned horizontally.
Ginormity, And Other Issues
Someone mentioned to me that the Phab 2 Pro isn’t much of a phone. We didn’t test the Phab 2 Pro as a phone, but anecdotally, after using it as my primary smartphone for many weeks, I would agree with the aforementioned sentiment.
The Phab 2 Pro is enormous. (Anecdote: The first time I had the phone out, my 7-year-old literally gasped at its great size.) It’s really a phablet--and even though I have (at least) average-sized man hands, I found that using it without both of my hands was difficult in normal situations and impossible in others. I even had to stop keeping it in my pants pocket because it slid out everywhere I sat--at my desk, on the couch, in the car, you name it--so to avoid losing it or having it fall to the ground and become damaged, I had to leave it on my desk or a side table, or tucked into a larger coat pocket when I was out and about. (Note: Eventually it did fall out of my pocket and onto a hard floor, and the screen cracked. Sigh.)
The battery life wasn’t an issue, unless I used it for AR, but even when I did use the AR capabilities for several minutes at a time, the battery didn’t drain so much that I had to charge the device back up before the end of the day. However, whenever I used AR apps, the Phab 2 Pro got hot, and fast.
Also, the headphone jack is, as they say, jacked. Whether using it to listen to podcasts or while on calls, headsets and ear buds often didn’t work.
We should note, though, that storage was not an issue. The device has 64GB of capacity, and even with dozens of apps installed (not to mention the area learning data they stored) and after shooting many videos and snapping numerous screenshots, we used up just over 20GB.
Even so, you probably don’t want to buy a Phab 2 Pro if you’re just looking for a nice, higher-end smartphone; only splurge if you’re keen on the AR capabilities Tango affords.
| Specification | Lenovo Phab 2 Pro |
|---|---|
| Display | 6.4-inch Quad HD (2560x1440), 2K IPS Assertive, 2.5D curved glass |
| CPU | Qualcomm Snapdragon 652 Processor (Built for Tango) |
| Storage | 64 GB (up to 128 GB via microSD) |
| Camera | Rear: 16 MP PDAF Fast-Focus, Depth Sensor and Motion Tracking; Front: 8 MP Fixed-Focus (F2.2 aperture) |
| Operating System | Android 6.0.1 (Marshmallow) |
| Battery | 4050 mAh Li-ion + Fast-charge; Standby time: over 13 days; Talk time: 18 hours |
| Dual SIM | Nano SIM & microSD (Up to 128 GB) |
| Connectivity | 802.11a/b/g/n/ac, 2.4 GHz / 5 GHz Wi-Fi; Bluetooth 4.0 |
| Sound | Triple array mic w/ Active Noise Cancellation; Dolby Atmos + Dolby Audio Capture 5.1; 3.5mm audio jack |
| Sensors | G-Sensor, P-Sensor, L-Sensor, E-Compass, Gyroscope, Hall Sensor, Vibrator |
| Colors | Champagne Gold, Gunmetal Gray |
| Body | Aluminum alloy (unibody) |
| Price | $499, unlocked (at Lenovo, Amazon, and B&H) |
Apps And Performance
Short of being able to benchmark Tango in any way, we sought to spend time using every Tango app we could to get a clear sense of what the technology can do. There are dozens in the Play Store, but the easiest way to find them is to use the Tango app that comes installed on the phone. It links to an area of the Play Store with featured Tango apps. We found a few more just by digging around; some seemed promising, whereas others--like the one that misspelled “augmented” in its description--we avoided. Suffice it to say that although there are some compelling apps and games available for Tango already, others fell a little flat.
One of the issues with Tango at present--stop me if you’ve heard this one before--is that the app ecosystem is quite small. However, considering that the Phab 2 Pro is the only shipping device that has Tango on board, one could assert that the dozens of Tango apps already in the Play Store are a good sign.
There are essentially three types of apps you’ll find at present: Mostly, there are games, but there are also apps for the home and shopping, and some dev tools. Some feel quite complete, whereas others were clearly built as experiments that explore what Tango can do. Quality varies significantly.
First, though, some notes on the experience as a whole: Tango is fickle. Finding a suitable area to use the apps was difficult. It can’t be too bright or too dark, you have to avoid shiny surfaces and glass because otherwise the camera can’t track, and there mustn't be any clutter. Tango craves simple, clear edges. Basically, then, to get the best results, you should be in a sealed, square room with few or no contents and one overhead light. How many rooms in your home or office fit that description?
My cluttered office--yes, it’s always cluttered, don’t judge--was unusable for anything Tango-related. The best I could do was my dining room, which is currently empty because it’s being renovated (which you may have noticed in the video above). However, this actually turned out to be serendipitous for some of the apps we tested. We’ll get to that in a bit.
We experienced numerous issues when using Tango apps, generally speaking. When we used a couple of Tango apps in a row, we almost always had to reboot the phone. The apps would crash, or the camera wouldn’t work, or the apps would just act glitchy. That last bit was immensely frustrating, because we would have to evaluate: Is this just a bad UI? Is the app itself glitching out, or is it Tango? Sometimes restarting the app solved the problem, but usually we had to reboot the phone. That didn’t always solve the problem, so we would reboot again; in some cases we had to just write off a given app as hopelessly broken and move on.
Further, in virtually every app we used, there was some drift. Typically, you pick a spot and drop whatever item the game or app has to offer, and it’s supposed to stay rooted to that spot. This is how you’re able to walk around it and view it in 3D, from any angle. Other apps had you scan the whole room as a first step, using the area learning data to more accurately place objects. In no instance did the objects, furniture, playspaces, etc. stick precisely to their spots. It was most grating when we, for example, tried to place multiple pieces of furniture in a room. Quickly, the space became a jumble of misbehaving virtual furniture.
None of the games we played were what you would call revolutionary, but some certainly showed how 3D and area learning capabilities make basic games that much better. For example, Fury of the Gods is just a tower defense game, but the thing you’re defending is 3D, and once you place the world on a tabletop, you can run all around and zap your enemies as they attempt to scale a mountain and destroy your temple. Slingshot Island is essentially a 3D version of Angry Birds, in that you use a slingshot to knock stuff over. It’s significantly less charming, though, and the physics are painfully slow. (However, I sat there for 10 minutes trying to knock down pillars before realizing that I could stand up and walk around the room to take shots at different angles. What a luddite.)
One of the better games, Domino World, is also one of the simplest. You pick an area, tap and drag your finger all around to create your domino line, and then you click a button to start the, er, domino effect. The physics are about as true to life as you can get, although like most Tango apps, the line of dominoes we set didn’t always acknowledge other objects on the tabletop, and as we walked around the room, they didn’t “stick” precisely where we laid them.
The same is true of the racing games, which include Car Racing For Tango (CRT), Wild Wild Race, and Hot Wheels Track Builder. The first two offer basically the same thing: You pick a spot, build a track, and then race cars on it. It’s all a top-down, tabletop perspective. We found it difficult to place the pieces of track--the device threw up constant warnings that it couldn’t detect the surface, or we were too close to the previous piece of track, or what have you. Once the tracks are built, though, it’s fun to race your car, and because you have a 3D track, you can walk around it to see things at different angles even as you drive.
Hot Wheels was a different animal, and it was one of the games we enjoyed most. Instead of superimposing a track on an existing surface, the app has its own 3D environment. It doesn’t learn your room, so you have to be aware of where you’re walking, but it does give you a big workspace in which to build tracks. The app gives you little flexibility--the tracks are mostly preset, but you can alter them as you earn more types of track pieces, like loop-the-loops and accelerators--but that helps keep the experience clean and precise. In each little stage, you have to launch your Hot Wheels car from one end of the stunt track to the finish line. It’s addictive.
Titles like Ghostly Mansion took the same approach to 3D that Hot Wheels does; the game puts you into a creepy 3D room--but it’s not generated based on your surroundings. So, even though you can walk all around the virtual room, poking at bookshelves and poring through drawers to find a handful of objects you need to complete the level, you also have to avoid your real-world furniture, walls, and so on. Again, here, drift is a problem. You’ll end up traipsing through several physical rooms as you attempt to navigate the lone virtual one.
Several games were clearly just meant to entertain children, like:
- Crayola Color Blaster, in which you splash color onto black and white “zombies”
- Bubbles, in which you blow into the phone speaker to...blow bubbles
- Dinosaurs Among Us, in which you place holograms of various dinos in your physical space and can view details about them
- Bugs!Bugs!Bugs!, which is a sort of tower defense game in which you tap the screen to squish bugs
- Danger Dodgers, in which you direct a--llama, I think it is?--away from falling rocks
...and so on.
There are, of course, a number of apps that are neither games nor dev tools. Most of them are aimed at shopping and home improvement, which in my case, as mentioned above, turned out to be ideal, because I just moved into an older house that requires a full-scale renovation. Thus, there are reno projects aplenty, and new spaces that are begging for furniture, flooring, etc. that I don’t yet have.
The Lowe's Vision app was among the most handy for me. Back to the dining room: The trim and walls are painted, but the flooring needs to be replaced, and I'm thinking of putting up a chair rail. Also, there’s not much furniture in there yet. Using the app, I was able to browse Lowe's catalog of materials to superimpose different types of flooring, chair rail samples, and even furniture onto the physical room.
It should be noted, though, that even this more or less solid app is arduous to use at times. As we noted above, Tango wants your space to be clean, clear, and well lit, with solid lines. Although I wanted to lay down some virtual flooring and see what the chair rail might look like and also drop in a piece of furniture, it took me multiple attempts to get just those three items lined up. Even then, the trio didn’t want to persist in their places or stay related to one another in the space.
The Lowe's app was ideal for me because it has construction materials, but there are several apps that do essentially the same thing for furniture and home goods. Mostly, they also have a web store attached, and you can select products and see how they look in your space.
Gap Dressing Room, Amazon Product, and Apollo Box fall into this category somewhat, although they’re more focused on clothing, TVs (yeah, just TVs), and tchotchkes, respectively.
The Gap app doesn’t actually need area learning, though; it just lets you pick mannequins that should more or less approximate your body size and type, and then you can see how different sizes of an item of clothing might look on you--or, rather, on a mannequin that’s roughly your size.
The Amazon Product app seems like a test for Amazon; although the etailer behemoth has a deep, vast stable of products, the app is only for TVs. However, this narrow application makes sense; you want to be able to see what a given TV might look like in a given space, and this spares you the need to measure your room and dig up the TV's specs to compare.
In sum: We never did get a few of the apps to run properly--crashes and bugs abounded on those--and others just didn’t quite deliver on their promise. Yet others, though, performed admirably and were quite impressive.
Overall, the implementation feels like it has plenty of room for improvement. There needs to be less heat, better tracking persistence, fewer crashes, and fewer phone reboots.
I’d be remiss if I didn’t also note that over the course of months, using the Phab 2 Pro as my daily smartphone, I barely ever used any of the AR features unless I was testing them out for the purpose of crafting this article. Granted, I’m a Very Boring Old Person who almost never plays any mobile games; certainly, other users could get sucked into some of these Tango games and play them all the way through. I did make occasional use of some of the home renovation apps, but even so, 1) I didn’t use them often and 2) I’ll only be renovating my home for so long (I hope).
What I, personally, really want is more productivity apps. And that may be a reason to decouple the Tango tech from the phone.
As we said at the top of this article, Tango on a smartphone generally, and the Lenovo Phab 2 Pro specifically, represents a necessary half measure in the consumer-ready AR (and MR) market. A smartphone offers what amounts to a tiny and grating field of view (FoV), and in order to interact with anything, your big fingers occlude far too much of the screen. Worse, the phone is just so heavy that it’s difficult to hold in one hand and tap or swipe or what-have-you with the other.
What Tango needs is AR glasses that offer hand and finger tracking for input, or a Daydream-like controller at the very least. (Gaze and voice input would be a huge bonus.) In our opinion, a far better use of a smartphone-plus-Tango would be to mount the Tango hardware on AR glasses, connect the glasses to the phone with a cable (or wirelessly, possibly), and use the phone’s powerful processing capabilities.
This perhaps represents a step backward, because in that paradigm, you have to carry around a second device--the glasses--and those glasses would have to have the slightly bulky Tango camera assembly on board.
Looked at another way, though, such a setup could be a spiritual sibling to Google’s Daydream. In fact, hardware makers could sell a smartphone with a Daydream VR HMD and Tango AR glasses, together or separately, and you could use the same smartphone to power one or the other depending on your use case.
There is an argument to be made that part of the allure of AR for consumers is that it’s available to you at any time if it’s embedded in your smartphone. Need to measure a room? Want to play a fun game while you’re at the park? At the store and want to check something about a product? Just whip out the phone that’s already in your pocket. Further, we’ve seen from the smartwatch industry that having to remember and bring along yet another device has serious drawbacks from a consumer adoption standpoint.
It’s a fair point; but considering the cumbersome nature and sub-optimal user experience of using AR on a smartphone, we would posit that the tradeoff isn’t worth it--yet. If the Tango portion of the camera assembly can be shrunk down sufficiently that it doesn’t produce so much heat and can be mounted on a much smaller handset, sure, there’s value in the embedded approach. In the meantime, though, we believe that experiencing Tango’s AR capabilities on a pair of glasses that are tethered to a phone (in your pocket) is the next necessary step in Tango’s evolution.
This is not to impugn Lenovo. The company (along with Asus) should be lauded for having the courage to produce a device with no guarantee of success. Now that we’ve had a chance to see what a Tango-enabled smartphone like the Phab 2 Pro can and can’t do, the road forward looks clearer. Tango on the Phab 2 Pro is a half measure, and in some ways so are the apps, but they’re all necessary ones, pointing towards and leading up to new and compelling technology and experiences.
I would point out that the depth sensor aids in tracking & area learning more than you probably think. Neither of those functions has been demonstrated to work as robustly on a standard smartphone.
I also wanted to point out that it's standard for SLAM systems to use the IMU (compass, accelerometer, and gyroscope) for coarse tracking, and then refine the estimate by registering with features seen by the camera(s).
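As a rough illustration of that predict-then-refine pattern (not how Tango or any particular SLAM library does it), here is a one-dimensional sketch. The function name, the `gain` value, and the sample readings are all hypothetical, and a simple fixed-gain blend stands in for the filtering or optimization a real system would use.

```python
def fuse_step(position, velocity, accel, dt, camera_position=None, gain=0.8):
    """One predict-then-correct step for a 1D position estimate."""
    # Coarse prediction: integrate the IMU's acceleration reading.
    velocity = velocity + accel * dt
    position = position + velocity * dt
    # Refinement: pull the estimate toward the camera-derived position,
    # when the camera managed to register features in this frame.
    if camera_position is not None:
        position = position + gain * (camera_position - position)
    return position, velocity


pos, vel = 0.0, 0.0
pos, vel = fuse_step(pos, vel, accel=1.0, dt=0.1)  # IMU only
pos, vel = fuse_step(pos, vel, accel=1.0, dt=0.1, camera_position=0.02)
print(pos, vel)
```

With no camera fix, the estimate simply follows the integrated IMU readings; when a fix arrives, most of the accumulated error is removed in one step.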
Anyhow, I'd be interested in knowing whether your tracking stability could be improved by hanging some posters and putting down some other landmarks in your dining room. While it might like clean, flat surfaces, it doesn't actually want or need them to be featureless. The old tablet actually had a fair amount of difficulty tracking in an empty, white room.
IMO, they need to fund some game devs to make more AR games, or to utilize Tango features in existing hit games. Ultimately, it might be Facebook & Microsoft that make AR happen. It feels like Google still isn't throwing a huge amount of weight behind this.
Again, this is spot on. They deserve a lot of credit for braving the waters and being first to take the plunge.
I think you're correct that AR smartphones are ultimately just a transitional phase that I hope will pass quickly. Maybe, once AR apps & tech are sufficiently refined, the phone-based AR experience won't be quite so painful, and it'll be a usable substitute for an HMD.
Relying only on the ARM SoC means lots of heat and power consumed. Any serious AR effort (at least with CURRENT mobile SoCs) should probably include custom hardware to greatly offload work from the CPU/GPU blocks.
I'm still not convinced the HPU in MS' Hololens is really better than a mobile GPU with whatever custom instructions they put in its DSP blocks. Clearly, they needed more horsepower than Cherry Trail's GPU could supply, but I think GPUs generally offer a good solution. Again, perhaps with the addition of a few custom instructions or custom blocks.