Reading a Microsoft blog post about a new artificial intelligence designed to play the perfect game of Ms. Pac-Man, I couldn’t help but think about playing a tabletop version of Atari’s classic video game as a kid in the entertainment centre of a hotel in Brandon, Manitoba, which happened to be next to the pool.
My sister and I would swim in the pool, then get out and run across the deck into the games room, dripping wet. We’d plug quarters into the machine, sit down to play, and – surprise, surprise – start receiving tiny electrical shocks whenever our wet fingers touched the metal part of the joystick just below the red knob.
The funny thing is, that didn’t keep us from playing. Every time our hands slipped down we’d get a little jolt of pain, but we’d keep going. Our brains registered the negative feedback, but categorized it as minor (i.e. non-life-threatening) and overrode the impulse to stop in favour of the pleasure of playing and the desire to win.
A similar type of human-like intelligence seems to be at work in an AI created by Maluuba – a deep-learning start-up based in Waterloo, Ontario that was acquired by Microsoft earlier this year – which has been designed to achieve the maximum score of 999,990 playing the Atari 2600 version of Ms. Pac-Man.
The AI uses a “divide and conquer” method that involves some 150 separate AI agents working on distinct tasks. Each one is assigned a specific objective, such as eating a particular pellet. But all of the agents’ objectives – and their strategies to achieve them – are routed through a single top agent that takes a broader view and chooses a course of action. The top agent doesn’t choose how to act based on the majority of agents’ recommendations – it’s not going to steer Ms. Pac-Man into an approaching ghost just because it’s the shortest route to a bunch of pellets – but rather what’s most likely to achieve the ultimate objective of winning the game.
In my mind it’s kind of like a prime minister listening to advisors with different viewpoints before determining how to proceed based on a broader understanding of multiple problems.
This type of AI learning is called reinforcement learning, in which individual agents receive positive and negative feedback based on the choices they make. They’re programmed to try to achieve more positive feedback than negative, but the top agent makes the final call – meaning some agents’ desires go temporarily unfulfilled in pursuit of a grander goal.
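For readers who like to see the idea in code, here is a rough Python sketch of the difference between a majority vote and a top agent that weighs how strongly each agent feels. Everything in it is an assumption made for illustration: the function names, the four-action list, the made-up scores, and the choice to add the agents' scores together are mine, not details confirmed by the Microsoft post, and the real system involves some 150 learned agents rather than four hand-written ones.

```python
from collections import Counter

# Hypothetical action set; the real game's controls may be modelled differently.
ACTIONS = ["up", "down", "left", "right"]

def majority_vote(agent_scores):
    """Pick the action that the most agents rank first (what the top agent does NOT do)."""
    votes = Counter(max(ACTIONS, key=lambda a: scores[a]) for scores in agent_scores)
    return votes.most_common(1)[0][0]

def top_agent(agent_scores):
    """Assumed aggregation: add up every agent's score per action, take the best total."""
    totals = {a: sum(scores[a] for scores in agent_scores) for a in ACTIONS}
    return max(ACTIONS, key=lambda a: totals[a])

# Three pellet agents mildly prefer going left (made-up numbers).
pellet_agents = [
    {"up": 0.0, "down": 0.0, "left": 1.0, "right": 0.2},
    {"up": 0.0, "down": 0.0, "left": 1.0, "right": 0.2},
    {"up": 0.0, "down": 0.0, "left": 1.0, "right": 0.2},
]
# One ghost agent strongly objects to going left (made-up numbers).
ghost_agent = {"up": 0.0, "down": 0.0, "left": -10.0, "right": 0.0}

agent_scores = pellet_agents + [ghost_agent]

print(majority_vote(agent_scores))  # the pellet agents outvote the ghost agent
print(top_agent(agent_scores))      # the ghost agent's strong objection wins out
```

In this toy setup the majority would send Ms. Pac-Man left, straight toward the ghost, while the summed scores steer her right, leaving the pellet agents' wishes temporarily unfulfilled.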
Games are used in AI research because they tend to require human-style decision making, and Ms. Pac-Man is particularly well suited for AI learning due to the unpredictability of its constantly changing game situations. But the potential applications for AI trained on games go far beyond achieving high scores. As the Microsoft post points out, an AI that cuts its teeth on Ms. Pac-Man could go on to make complex decisions in business environments – such as coming up with call lists for sales executives by prioritizing clients based on known information about their histories and schedules – that end up saving human resources valuable time.
As for me, I’m keen to know what decision Maluuba’s AI master agent would have made in the position I found myself in as a kid receiving negative feedback in the form of electric shocks. What would it have deemed more important: avoiding minor physical discomfort, or enduring the pain to gobble up that last pellet to get to the next level?
I didn’t end up scoring anywhere close to 999,990 that day back in the 1980s, but I like to think Maluuba’s Ms. Pac-Man-crushing AI would nonetheless have approved of my choice to endure.