
Microsoft Taught AI How To Beat 'Ms. Pacman'

Ms. Pacman doesn't seem like a hard game. All you have to do is guide its titular character through mazes to gobble up pellets and avoid a group of ghosts. Underneath that veneer of simplicity, however, is a complicated game that's surprisingly well-suited to helping AI learn how to think like humans do. Or at least that's what Microsoft said today when it announced that it made an AI capable of getting the game's highest possible score.

AI is no stranger to playing games. (And no, we aren't referring to the AI from "WarGames.") Perhaps the most famous example is Google's AlphaGo, which defeated grand master Go players before it retired in May. Topping a decades-old arcade game's scoreboard doesn't seem quite as exciting--it's just a machine playing against a machine, after all--but Microsoft still hailed the AI's achievement.

That's because the company took a novel approach to teaching its AI how to master Ms. Pacman. It gave each of 150 "agents" a specific task, such as eating a certain pellet or avoiding a ghost, and then created a "top agent" that decided how to move based on the other agents' feedback. The AI balanced each individual agent's desire to accomplish a certain goal with the top agent's mission to get the maximum score of 999,990.

Microsoft described that decision making in its announcement:

The top agent took into account how many agents advocated for going in a certain direction, but it also looked at the intensity with which they wanted to make that move. For example, if 100 agents wanted to go right because that was the best path to their pellet, but three wanted to go left because there was a deadly ghost to the right, it would give more weight to the ones who had noticed the ghost and go left.

That's similar to how many of us think. We collect information, decide how important it is, and then act based on that judgment. That skill is key to many games, including classics like Ms. Pacman. There's often so much happening on-screen at any given time that responding to each stimulus would be impossible. Instead, you have to make quick decisions based on the information at hand and hope that you made the right choice.
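The weighted vote in Microsoft's description can be illustrated with a few lines of Python. This is only a minimal sketch, not Microsoft's implementation: the function name `choose_direction` and the intensity values (1.0 for a pellet agent, 50.0 for a ghost agent) are made up for illustration, and the real system's numbers are not given in the announcement.

```python
from collections import defaultdict


def choose_direction(preferences):
    """Pick the direction with the largest total intensity.

    `preferences` is a list of (direction, intensity) pairs, one per agent.
    Both the data layout and the intensity scale are assumptions for this sketch.
    """
    totals = defaultdict(float)
    for direction, intensity in preferences:
        totals[direction] += intensity
    # The top agent follows the summed intensity, not the head count.
    return max(totals, key=totals.get)


# 100 pellet agents mildly prefer going right (hypothetical intensity 1.0),
# while 3 ghost-avoiding agents strongly prefer going left (hypothetical 50.0).
votes = [("right", 1.0)] * 100 + [("left", 50.0)] * 3
print(choose_direction(votes))  # left: 150.0 total beats 100.0
```

With equal intensities the majority would win and the character would go right; the ghost agents only prevail because their preference carries more weight.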

The system was also taught via reinforcement learning. This means it was given a positive or negative response for every action, then told to figure out how to get the most positive responses. Instead of teaching the AI by showing it how pro players approach Ms. Pacman--a process called supervised learning--the AI had to figure things out for itself. (It's kind of like having your kids solve a problem instead of solving it for them.)
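To make the idea of learning from positive and negative responses concrete, here is a toy sketch using tabular Q-learning, a standard textbook reinforcement learning method. It is not the method from Microsoft's paper. Everything in it is an assumption for illustration: a five-cell corridor with a ghost at one end and a pellet at the other, the reward values, and the learning parameters.

```python
import random


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values for a toy corridor by trial and error.

    Cell 0 holds a ghost (reward -1), cell 4 holds a pellet (reward +1).
    The layout, rewards, and parameters are hypothetical.
    """
    rng = random.Random(seed)
    n_cells = 5
    actions = (-1, +1)  # step left or step right
    q = {(s, a): 0.0 for s in range(n_cells) for a in actions}

    for _ in range(episodes):
        state = 2  # start in the middle of the corridor
        while state not in (0, n_cells - 1):
            # Mostly take the best-known action, sometimes explore at random.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])

            nxt = state + action
            if nxt == n_cells - 1:
                reward = 1.0   # positive response: ate the pellet
            elif nxt == 0:
                reward = -1.0  # negative response: ran into the ghost
            else:
                reward = 0.0

            terminal = nxt in (0, n_cells - 1)
            best_next = 0.0 if terminal else max(q[(nxt, a)] for a in actions)
            # Nudge the estimate toward reward plus discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


q_values = train()
for cell in (1, 2, 3):
    preferred = max((-1, +1), key=lambda a: q_values[(cell, a)])
    print(cell, "right" if preferred == +1 else "left")
```

Nobody tells the agent that the pellet is to the right. It discovers that on its own because moves in that direction eventually earn higher values.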

Just like Google's work on AlphaGo, teaching AI to become a Ms. Pacman whiz isn't Microsoft's end goal. Instead, the company said that the approach developed in this experiment could be used to train AI to handle other tasks, such as managing schedules or improving natural language processing. The company shared what it learned in a paper titled "Hybrid Reward Architecture for Reinforcement Learning."

This effort highlights the value that games offer to AI research. Many of them come naturally to us, but teaching computers how to play them as well as AlphaGo played Go or Microsoft's AI played Ms. Pacman is much harder. The potential for mastery is there, as both of these AIs proved, but the journey is the most exciting part. Games also make it easy for us to understand how far AI has come. Knowing an AI can recognize cats isn't impressive--watching it dominate a game shortly after it first learned how to play is much cooler.