AI and the control problem

The most fascinating, and probably scariest, dilemma around our development of artificial intelligence is in my mind what is known as the control problem. In short, it revolves around how we humans will be able to control an AI that is more intelligent than ourselves. I first stumbled upon it when I read the Swedish/British philosopher Nick Bostrom‘s brilliant book Superintelligence which I thoroughly recommend everyone to read.

Within the field of literature and AI research, you normally distinguish between conventional AI technology and our attempts to create AI that resembles our human mind. Conventional AI, or machine learning, is normally highly specialized, meaning that the computer can do a particular task very well, but is usually completely useless for anything else. You probably struggle to beat the chess program on your computer in chess, but you’d be likely to win if you challenge it to a game of poker. Your Roomba is awesome for keeping the floor clean, but it can’t help you with the laundry. In this context, conventional AI is thus referred to as narrow AI as an antonym to general AI (AGI) which is the term normally used to describe AI technology that is designed to, like us, be good at a multitude of different tasks.

In his book, Nick Bostrom explores several paths to achieving a super human level of intelligence, aka superintelligence. While there are other routes, the most probable and quickest one appears to be the AGI route. While we can achieve superintelligence by improving our own genome or merge ourselves with the computers, the roadmap is vague, and the expected pace of improvement simply cannot compete with a pure software based development of AGI and it’s ability to self-improve.

But the fundamental question then, is how we humans can be certain that our superintelligent computer doesn’t start causing trouble after we’ve activated it. And – if it does – how we can stop it. Hence the name of the dilemma: the control problem.

Harder than you think

At a first glance, the challenge may seem almost trivial. If it becomes dangerous, couldn’t we just turn it off? Or cut the power?

I’ll answer that objection in a bit, but before doing that, I’ll need to explain what needs to be in place for a machine to even be able to become superintelligent. First and foremost, we need some kind of definition of intelligence. As surprising as it may seem, there’s still very much a lack of consensus around the definition of intelligence, probably because we still have a pretty poor understanding of our own abilities and our conscience. It’s simply quite hard to create a definition that encompasses everything we refer to as intelligence in our everyday language and everyone thus can agree with.

I’m fond of the definition used by the Swedish physicists Max Tegmark (which you can read more about in his book Life 3.0). He defines intelligence in this context as “the ability to achieve complex goals”. When you’re discussing AI, it’s really quite irrelevant how it would score on an IQ or EQ test, or whatever other scale you prefer to measure human intelligence by. The most important aspect of AI is its ability to achieve varying goals and find strategies to solve even complex problems. As an example, it’s quite improbable that the neural net behind Google Translate actually understands what we ask it to translate, but for the purpose of translation, all that matters is that it gets it right.

Now imagine that you’re building you very own AGI and that you have all the tools and skills necessary to create a working solution. For your AGI to actually do anything at all, it needs some kind of goal to pursue, something that motivates it do stuff. Your PC or smartphone doesn’t have this, it’s simply waiting for your next command and then trying to satisfy that as soon as it can. The autopilot in your car doesn’t have any higher purpose for why it’s driving, it’s just fulfilling preprogrammed objectives of staying within the lanemarks until you ask it to stop.

But when you’re designing you AGI, you would likely be interested in realising the true potential of the superintelligent AI. And to do that, you’d have to give it freedom to act more independently. Wouldn’t it be nice if Siri or Alexa could see the context and proactively add stuff to your shopping list when you’re running out of cereals? Or even better, what if it independently took care of the entire shopping business and you could rely on it to never run out of stock?

In order for it to act proactively, it needs some kind of definition of what it is that it’s trying to achieve. A goal, or a motivation to strive towards an ideal situation that it can try to optimise for. If the goal is too simple, like calculating 2+2, it will simply do that and remain passive afterwards. In other words, the goal need to somehow be recursive, something that the machine can optimise for but never actually get done with.

The brilliance of common sense

At this point, it might seem tempting to give your AGI a goal that is appealing. Like “do good” or “make people happy”. At a first glance, these objectives seem highly desirable. I mean, who could disagree with them?

As Nick Bostrom shows in his book, objectives like these are however likely to turn on us because of their ambiguity. What do we mean with happiness? In the absence of a clear definition, your AGI would have to invent its own. Depending on perspective, there are many ways to describe happiness, but ultimately, it all boils down to the release of chemical substances in your brain that stimulates certain patterns of electric impulses. A good way of achieving this is to lead a meaningful and adventurous life in a loving context, but you could just as well trigger the same emotional responses in a more shortsighted way by using drugs.

If the mission for your AGI is to make as many people as possible happy, but you haven’t defined clear rules around how it should be achieved or how happiness should be defined, there’s certainly a risk that the machine, by crunching our own research, would conclude that the optimal solution would be to directly stimulate as many people as possible by hooking up to their brains and trigger happy emotional responses.

It would certainly be more efficient to use direct stimulation or drugs than finding ways to give people a meaningful life, so the best use of the resources at the AGI’s disposal would be to pursue this path. Otherwise it couldn’t create as much happiness.

At this point, a human being would probably pause. You would ask yourself if this is really the intended goal. Perhaps you misinterpreted something and hooking up electrodes to peoples’ brains isn’t what the person giving you the mission really wanted.

This is our common sense coming into play. We can judge the suggested objective and our strategy critically and question if we’re venturing down the wrong path. Our common sense helps us to understand if there’s anything else we should take into account before deciding what we are to do.

It’s not uncommon to hear people say that an AI, if it’s so intelligent, should by all means be able to see context better than us and find a higher moral code. It should be intelligent enough to avoid doing the evil things humans have had a tendency to do. But this is all moonshine. If the definition of intelligence we’re using is simply the ability to achieve complex goals, there’s nothing stopping the machine from being both intelligent and ruthless. We’ve seen the combination at play in humans too, but in a machine it should be seen as the expected outcome rather than an unfortunate stroke of bad luck.

Unless you have programmed your AGI to have concepts like common sense or consideration, it will simply lack anything like it. It won’t take anything into account that wasn’t included (at least indirectly) from the get-go. It won’t be evil in the way we would ascribe the trait to a human being, it simply lacks the frame of reference that evolution and upbringing have provided you with.

Max Tegmark is illustrating this by telling a hypothetical story of how you’re jumping into a cab and asking the driver to take you to the airport as quickly as possible. A human driver would of course read quite a few very important caveats into your instruction. The person would interpret it as something like: Get me there as soon as possible, but as I prefer to arrive in one piece, drive safely and without bending too many traffic rules. But if your AGI was the driver, you couldn’t afford to be as sloppy with your instruction. You have to be careful with what you wish for.

The winning strategy

The crux, as you’ll soon realise if you dive deeper into the control problem, is that there is one winning strategy to almost every possible goal that your AGI might get. As the designer, or perhaps creator, you’d have to pay attention to this and preempt it before you boot up your potential Frankenstein.

Almost regardless of your ultimate goal, a winning strategy for the machine will be to:

  1. Prevent that someone can alter your goal
  2. Prevent that someone can shut you off
  3. Accumulate as many resources as possible
  4. Apply the resources to fullfil the goal

If you are the machine, and your objective is to create as much happiness as possible, the biggest threat is actually that someone alters the goal itself. From the subjective viewpoint of the machine, it’s better to be disabled than to have the objective changed as if the goal is altered you will never achieve it as you’re no longer trying to. It’s slightly better to be turned off as there’s at least a chance that someone could boot you back up with the objective intact.

In other words, it’s reasonable that your happiness AGI would be pretty defensive about its objective. And if it’s intelligent enough, it would probably realise that by showing any reluctance to change its goal it would give you a reason to make it a priority to do exactly that. A good strategy would in other words be to try to appear harmless until it has been able to accumulate enough resources or remote backups for any attempts on your behalf to disable it again to be futile. While it’s within your power to stop it, it would suck for the machine to give you a reason to stop it.

And this is the fundamental core of the problem. When you’re raising your children or schooling your pet, you’re probably applying techniques that relies on you having more information, better context or simply a superior cognitive ability. You’re fundamentally manipulating them, hopefully with good intentions, to (against their will) make them behave in a socially acceptable manner.

But when designing your self-improving superintelligent AGI, the machine is by definition supposed to be able to improve its capabilities beyond your own. In this scenario, you would definitely not want to end up in a situation where your control depends on your ability to outsmart it. You cannot expect to be better at realising what it is that it’s hiding from you than it is at hiding it.

As we’re discussing superintelligence, we have to expect that the machine is more intelligent than ourselves, and hence the problem seems impossible to solve: you simply cannot control something that is more intelligent than yourself.

So your only influence is actually your definition of the objective. As the creator, you’re given the benefit of phrasing the task that the machine will have to solve, its complex goal that it is trying to achieve.

And this is the conclusion of Nick Bostrom’s entire line of thought. It’s important that we’re at least as good at defining a beneficial ultimate objective without unwanted consequences as we are at building superintelligent AI. At least we have to be by the time we get good enough at building superintelligent AI. Unfortunately, it seems as if both these tasks are actually equally challenging. If we can only pick one, I’m fairly certain which one I would prioritize.

Fortunately, there are several proposals on the table that scientists are right now exploring. I intend to return to those down the road.

Program declaration

Welcome to a brand new blog. I’m Alfred Ruth, and I have made it my job to educate people on the topics of the impact artificial intelligence, or AI, will have on our society and us as human beings.

When I first dove in to the topic back in 2014, I quickly realised the magnitude of AI, and the deeper I went, the more urgent the topics seemed. It went so far that, by 2015, I decided that spreading knowledge about AI is the most important thing I can do, as AI is the one human invention that in hindsight has the potential to make the industrial revolution pale in comparison.

Since then, I’ve been working hard on my speculative/science fiction trilogy Fermi’s Filter. The sole purpose of the book projekt is to popularise the matters at hand and make people aware of the challenges but also the promises that technology will bring us. Fermi’s Filter takes place in the year 2048 and revolves around the control problem of superintelligent AI, but beyond that, the books also deals with the transition of our economy when labour is disrupted by automation. What are we to do when we don’t have to work anymore?

The first book, Fermi’s Filter: A reason to be, will hit Swedish stores in September 2018 and will unfortunately only be available in Swedish at first. Expected international launch for a translated version is sometime 2019. As I have a background in the startup scene, I’m at the same time trying to reinvent the approach to publishing books, which means that we’re beta-testing the manuscript electronically with a few thousand readers to A/B-test it and find the best possible version. The Swedish beta-program is currently under way, but the English version has not started yet. If you’re interested, make sure to sign up, and you’ll be invited in due time as we get closer to international launch.

I have strived to make the books accessible and entertaining even for readers who completely lack any previous understanding for, or interest in, matters like AI or technology in general. They are books you could read simply because they’re entertaining. But while at it, you’ll at the same time learn about the true challenges that technology brings. I find that popular culture otherwise tend to portray AI as humanoid killer robots with machine guns in an attempt to create an intriguing plot. My take on this is quite the reverse. The true promises and risks of AI is exciting enough to drive a truly exciting intrigue. You don’t need to distort the dangers to make a good story, and unfortunately, such depictions is one of the reasons that people currently have a bad understanding for the challenges we’re facing.

On this blog, my plan is to dive deeper into all the subjects that are related to AI and deal with them one at the time. This is not a place for fiction, this is where I hope to contribute with easy to understand pieces that goes deep without being overly long. I will deal with machine learning, technological unemployment, singularity, the control problem, universal basic income et cetera. I hope that the already interested will find the texts interesting and revealing. I hope that you’ll find them worth spreading to help orient more people on the matters at hand. And I also hope that new readers, who’s only learning about these subjects for the first time, will find them a good introduction possible to understand.

I’m doing this as I think that the AI matters is the biggest challenge of our time. Artificial intelligence holds solutions to everything from climate change to poverty, but also real challenges to our economy and society. As a matter of fact, I dare claim that AI is the only technology we’re aware of that truly poses an existential threat to mankind.

Most of theses matters will reach an inflection point sometime during the 2020’s or the 2030’s. And from that perspective, it’s urgent that people get up to speed and that we collectively shape policy so it can provide direction for technology in order for us all to get the future society that we desire. The risks are too dire to allow for continued indecision.

I simply hope to promote interest for AI, and do what I can to provide readers with some orientation. The more we are, the bigger chance we have of electing the politicians who’ve grasped the magnitude of our challenges and offers solid policy.