Kinect In Depth
Microsoft Kinect, which uses machine vision technology to create a new class of game peripheral, sold 8 million units in the sixty days after launch. That makes Kinect one of the fastest-selling consumer electronics gizmos of all time. Whereas Nintendo simplified and extended the game controller, Kinect removed the controller entirely, allowing a gamer to control a title with gestures and body movements. But Kinect goes beyond gestures, incorporating the gamer’s own body into the game itself.
The Kinect hardware is surprisingly simple. Inside the Kinect sensor pod are an RGB camera, an IR depth sensor, a microphone array, and a tilt sensor. In addition, a motor is built into the pod, allowing Kinect to tilt plus or minus 27 degrees.
The depth sensor consists of an IR emitter and an IR camera. The camera knows where the emitter is located, and can sense the pattern of similarities and differences based on the overlap with the camera’s viewing angle.
Kinect can track two simultaneous players, but watches six “player proposals,” which help Kinect-enabled titles know when a new person has moved into the tracking area. Body tracking is done via a skeletal tracking system, which assigns joint data to the motion information captured. Data from the depth sensor builds a depth map, which is used to map both moving and fixed objects.
If you’ve ever played or watched people play the Kinect dance game Dance Central, you’ve seen the animated depth map, which is a tiny window that gives cues to the player about their moves.
The depth map itself is fairly low-resolution (only 320x240 pixels), but each pixel carries 16 bits. The first 13 bits track depth in millimeters, from 800 mm to 4000 mm, which corresponds to the Kinect play area. The last three bits hold the segmentation index, which identifies the skeleton being tracked.
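To make the 13/3 bit split concrete, here is a minimal Python sketch of how one 16-bit depth pixel could be taken apart. It assumes the 13 depth bits are the high-order bits and the three segmentation bits are the low-order bits, and that an index of 0 means no player; the session did not spell out either detail, and the function names are made up for illustration.

```python
from typing import Tuple

SEGMENTATION_BITS = 3                             # last three bits of each pixel
SEGMENTATION_MASK = (1 << SEGMENTATION_BITS) - 1  # 0b111
MIN_DEPTH_MM = 800                                # near edge of the play area
MAX_DEPTH_MM = 4000                               # far edge of the play area


def unpack_depth_pixel(pixel: int) -> Tuple[int, int]:
    """Split a 16-bit pixel into (depth in mm, segmentation index).

    Assumes depth sits in the high 13 bits and the index in the low 3 bits.
    """
    if not 0 <= pixel <= 0xFFFF:
        raise ValueError("pixel must fit in 16 bits")
    depth_mm = pixel >> SEGMENTATION_BITS
    segmentation_index = pixel & SEGMENTATION_MASK
    return depth_mm, segmentation_index


def pack_depth_pixel(depth_mm: int, segmentation_index: int) -> int:
    """Inverse of unpack_depth_pixel, handy for building test data."""
    if not 0 <= depth_mm < (1 << 13):
        raise ValueError("depth must fit in 13 bits")
    if not 0 <= segmentation_index <= SEGMENTATION_MASK:
        raise ValueError("segmentation index must fit in 3 bits")
    return (depth_mm << SEGMENTATION_BITS) | segmentation_index


def in_play_area(depth_mm: int) -> bool:
    """True when the depth falls inside the 800 mm to 4000 mm range."""
    return MIN_DEPTH_MM <= depth_mm <= MAX_DEPTH_MM


if __name__ == "__main__":
    pixel = pack_depth_pixel(2260, 1)
    print(unpack_depth_pixel(pixel))
```

Thirteen bits can hold values up to 8191, which comfortably covers the 4000 mm far limit, and three bits give eight possible index values.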
Kinect Play Space
One of the challenges with Kinect is tracking the entire play space. The play area ranges from 0.8 m (2.6 ft.) to 4 m (13.1 ft.), but the “sweet spot” is right at 2.26 m (7.4 ft.) for single players and 2.5 m (8.2 ft.) for two players. Within that area, Kinect needs to keep track of one or two moving bodies. The trick on the part of the developer is to frame the player, as this Microsoft slide shows.
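As a rough illustration of those distances, a title could turn a player's measured distance into a positioning hint. The sketch below is not taken from any Kinect SDK; the function name, the hint strings, and the 0.25 m tolerance around the sweet spot are all assumptions chosen for the example.

```python
PLAY_AREA_MIN_M = 0.8   # near limit of the play area
PLAY_AREA_MAX_M = 4.0   # far limit of the play area
SWEET_SPOT_M = {1: 2.26, 2: 2.5}  # ideal distance by number of players


def distance_advice(distance_m: float, players: int = 1,
                    tolerance_m: float = 0.25) -> str:
    """Return a positioning hint for a player standing distance_m away.

    tolerance_m is a hypothetical margin around the sweet spot.
    """
    if players not in SWEET_SPOT_M:
        raise ValueError("Kinect tracks one or two players")
    if distance_m < PLAY_AREA_MIN_M:
        return "too close"
    if distance_m > PLAY_AREA_MAX_M:
        return "too far"
    offset = distance_m - SWEET_SPOT_M[players]
    if abs(offset) <= tolerance_m:
        return "sweet spot"
    # Positive offset: the player is beyond the sweet spot.
    return "step closer" if offset > 0 else "step back"


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.3, 3.5, 4.5):
        print(d, distance_advice(d, players=1))
```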
The tilt motor can be controlled by the application, and judicious use of tilt helps Kinect games keep track of players as they move closer to or farther from the camera.
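One way to picture that tilt adjustment is as simple trigonometry: aim the sensor at a target height on the player's body, then clamp the result to the motor's 27-degree limit. This is only a sketch under stated assumptions. The sensor height, the target height, and both function names are hypothetical, and nothing here reflects how a shipping title or the Kinect SDK actually drives the motor.

```python
import math

MAX_TILT_DEG = 27.0  # the motor tilts plus or minus 27 degrees


def clamp_tilt(angle_deg: float) -> float:
    """Limit a requested tilt to the range the motor supports."""
    return max(-MAX_TILT_DEG, min(MAX_TILT_DEG, angle_deg))


def tilt_to_frame(distance_m: float, sensor_height_m: float,
                  target_height_m: float) -> float:
    """Tilt angle in degrees that points the sensor at target_height_m.

    Positive values tilt up, negative values tilt down. All three
    inputs are assumed to be known or estimated by the application.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    raw_deg = math.degrees(
        math.atan2(target_height_m - sensor_height_m, distance_m))
    return clamp_tilt(raw_deg)


if __name__ == "__main__":
    # Sensor on a low shelf, aiming at chest height as the player approaches.
    for d in (4.0, 2.26, 0.8):
        print(d, round(tilt_to_frame(d, 0.5, 1.3), 1))
```

The closer the player stands, the steeper the angle needed to keep the same point in frame, which is why the clamp matters most near the 0.8 m end of the play area.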
Kinect Launch Title Notes: Dance Central
I attended a postmortem given by Matt Boch and Dean Tate of Harmonix.
Dance Central has proven to be one of the most popular launch titles for Kinect. More impressive, Harmonix developed Dance Central in just 17 months, with full production taking only 12 of them. Much of the early prototyping took place before Kinect was even announced. The announcement of Kinect (then called Project Natal) gave Harmonix the technology it needed to implement its vision.
One of the most interesting parts of the game is what, in any other game, would be called the tutorial. Known as “Break it Down mode,” the learning section of Dance Central is a game in its own right, ramping up challenges very slowly and walking players through new moves at their own pace, but in a far more engaging manner than a dry tutorial. All of the moves in Dance Central were designed by professional choreographers, and tested by both experienced dancers and “old, fat white guys” to see how well they’d work in a real situation.
While Boch and Tate didn’t discuss Kinect in much detail, they did note that they never had to dumb down any dance moves in order to make the game work with Kinect. As long as the players were in the Kinect tracking area, dance moves could be complex and fast.
Other Kinect sessions at GDC included Kinect audio (especially the array microphone), Kinect gesture detection, and how Kinect can ID particular players. I wasn’t able to attend those sessions, but video and slides of those talks will likely go up at the GDC site at a later date.