Deep Reinforcement Learning: Are We Sure How It Works?

By: Daniel Newman

Photo Credit: martinlouis2212 Flickr via Compfight cc

Short answer: No. Just weeks ago, we witnessed Facebook shut down two AI chatbots after they created a language even their operators didn’t understand. The bots were part of a project that was supposed to help us learn more about bot negotiation skills. What we really learned: even the best minds in the industry still aren’t quite sure how to ensure that artificial intelligence (AI) develops as they intend. But that doesn’t mean we should stop trying.

Some of the greatest tech minds in the world, including those at Google DeepMind and IBM, are gaining ground in the deep reinforcement learning field. With its help, AI bots of the future will no longer be limited to programmed knowledge and skill sets. They’ll be able to learn as they go, on their own, with much smaller amounts of information.

In the past, machine learning has been the mainstay of AI research. For instance, Facebook and Google regularly use machine learning to teach programs how to recognize faces, objects, voices, and words—so much so that programmed bots have become part of our daily lives. They help recommend music on Spotify. They help alert us to new releases we’ll enjoy on Netflix. They help us process the growing amounts of information we are faced with every single day. The downside of machine learning? It takes tons of time—both on the part of the programmer and the part of the machine as it works through upwards of 15 million pieces of data to learn any one specific association.

Deep reinforcement learning, on the other hand, goes far beyond classifying information. If mastered, it can help a bot make decisions, adjust its behavior based on its operator’s moods, and even anticipate how to make our lives easier based on external factors. The only challenge: getting to that next level. In some cases, it’s worked. The often-cited (perhaps too often) example is DeepMind’s AlphaGo, which beat a high-ranking Go player; it is just one bot that has used reinforcement learning to master a complex game. But games are one thing; mastering tasks and decision-making in a world filled with irrational human behavior is another.

So, is it even possible to give bots these “human” skills? Experts believe it is. One example from IBM puts it into perspective: it would seem impossible to make a human with bird characteristics. But somehow, with the help of technology, we have learned to fly. It is hoped that the power of reinforcement learning could help transfer human qualities to non-human machines. The challenges?

First, for any bot to be useful for a human—such as the long-term goal of utilizing an AI bot assistant—the bot would need to be able to function accurately amidst changing minds and emotions. That isn’t easy. Despite the progress we’ve seen in self-driving cars, for instance, they struggle when driving amongst humans, rather than one another.

Second, the amount of data we experience daily continues to grow. Developers hope reinforcement learning will solve this problem by allowing bots to process information via smaller data sets, which will save time and operator support. But as more data is created, will reinforcement learning be able to keep up with the influx, especially when it comes to unstructured data?

One thing is for sure: digital transformation has created a need for many new technologies, and AI is one of them. Machine learning helped answer the call for processing large amounts of structured information, but real life isn’t structured. It’s messy and busy and complicated, and, as I wrote in June on Converge, the next level of reinforcement learning needs to be smarter to keep up. If it can, we will experience numerous benefits, not just in our personal lives but in customer service, security, and nearly every other aspect of human life. Perhaps that’s why so many companies are in the AI game, as my colleague Shelly Kramer shared last year in her piece, “The Booming AI Market: Who’s In? Everybody.”

In the end, however, we all have a lot to learn when it comes to reinforcement learning. In a way, we are just like the bots, using trial and error to find our way through uncharted territory. Time will tell how far we are able to go.

About The Author

Daniel Newman

Founder and President, Broadsuite, Inc.

After 12 years of running technology companies including a CEO appointment at the age of 28, I traded the corner office for a chance to drive the discussion on how the digital economy is going to forever change the way business is done. I'm an MBA, adjunct business professor and 4x author of best-selling business...