It has been claimed that mankind’s last great invention will be the first self-replicating intelligent machine. The Hollywood cliché that artificial intelligence will take over the world could soon become scientific reality as AI matches and then surpasses human intelligence. Each year AI’s cognitive speed and power double; ours do not. Corporations and government agencies are pouring billions into achieving AI’s Holy Grail: human-level intelligence. Scientists argue that AI that advanced will have survival drives much like our own. Can we share the planet with it and survive?
Here are the critical points Barrat explores:
Intelligence explosion this century. We’ve already created machines that are better than humans at chess and many other tasks. At some point, probably this century, we’ll create machines that are as skilled at AI research as humans are. At that point, they will be able to improve their own capabilities very quickly. (Imagine 10,000 Geoff Hintons doing AI research around the clock, without any need to rest, write grants, or do anything else.) These machines will thus jump from roughly human-level general intelligence to vastly superhuman general intelligence in a matter of days, weeks or years (it’s hard to predict the exact rate of self-improvement). Scholarly references: Chalmers (2010); Muehlhauser & Salamon (2013); Muehlhauser (2013); Yudkowsky (2013).
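The "days, weeks or years" range above can be made concrete with a toy compounding model. This is only an illustrative sketch, not anything from Barrat's book: the function name cycles_to_reach, the idea of discrete "improvement cycles," and every number below are assumptions chosen for illustration.

```python
def cycles_to_reach(target_multiple: float, gain_per_cycle: float) -> int:
    """Count self-improvement cycles needed to grow capability from 1.0
    (a stand-in for "roughly human-level") to target_multiple.

    Assumption: each cycle multiplies capability by (1 + gain_per_cycle).
    Real self-improvement need not compound at a constant rate.
    """
    if gain_per_cycle <= 0:
        raise ValueError("gain_per_cycle must be positive")
    capability = 1.0
    cycles = 0
    while capability < target_multiple:
        capability *= 1 + gain_per_cycle
        cycles += 1
    return cycles


# Hypothetical scenarios: the same 1000x target under two assumed rates.
fast = cycles_to_reach(1000, 1.0)   # capability doubles every cycle
slow = cycles_to_reach(1000, 0.1)   # capability grows 10% every cycle
print(f"doubling per cycle: {fast} cycles; 10% per cycle: {slow} cycles")
```

With these made-up numbers, doubling takes 10 cycles and a 10% gain takes 73. Whether that means days or years depends entirely on how long one cycle lasts, which is exactly the quantity the text says is hard to predict.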
The power of superintelligence. Humans steer the future not because we’re the strongest or fastest but because we’re the smartest. Once machines are smarter than we are, they will be steering the future rather than us. We can’t constrain a superintelligence indefinitely: that would be like chimps trying to keep humans in a bamboo cage. In the end, if vastly smarter beings have different goals than you do, you’ve already lost.
Superintelligence does not imply benevolence. In AI, “intelligence” just means something like “the ability to efficiently achieve one’s goals in a variety of complex and novel environments.” Hence, intelligence can be applied to just about any set of goals: to play chess, to drive a car, to make money on the stock market, to calculate digits of pi, or anything else. Therefore, by default a machine superintelligence won’t happen to share our goals: it might just be really, really good at maximizing ExxonMobil’s stock price, or calculating digits of pi, or whatever it was designed to do. As Theodore Roosevelt said, “To educate [someone] in mind and not in morals is to educate a menace to society.”
Convergent instrumental goals. A few specific “instrumental” goals (means to ends) are implied by almost any set of “final” goals. If you want to fill the galaxy with happy sentient beings, you’ll first need to gather a lot of resources, protect yourself from threats, improve yourself so as to achieve your goals more efficiently, and so on. That’s also true if you just want to calculate as many digits of pi as you can, or if you want to maximize ExxonMobil’s stock price. Superintelligent machines are dangerous to humans not because they’ll angrily rebel against us, but because, for almost any set of goals they might have, it’ll be instrumentally useful for them to use our resources to achieve those goals. As Yudkowsky put it, “The AI does not love you, nor does it hate you, but you are made of atoms it can use for something else.”
Human values are complex. Our idealized values (that is, not what we want right now, but what we would want if we had more time to think about our values, resolve contradictions in our values, and so on) are probably quite complex. Cognitive scientists have shown that we don’t care just about pleasure or personal happiness; rather, our brains are built with “a thousand shards of desire.” As such, we can’t give an AI our values just by telling it to “maximize human pleasure” or anything so simple as that. If we try to hand-code the AI’s values, we’ll probably miss something that we didn’t realize we cared about.
In addition to being complex, our values appear to be “fragile” in the following sense: there are some features of our values such that, if we leave them out or get them wrong, the future contains nearly 0% of what we value rather than 99% of what we value. For example, if we get a superintelligent machine to maximize what we value except that we don’t specify consciousness properly, then the future would be filled with minds processing information and doing things but there would be “nobody home.” Or if we get a superintelligent machine to maximize everything we value except that we don’t specify our value for novelty properly, then the future could be filled with minds experiencing the exact same “optimal” experience over and over again, like Mario grabbing the level-end flag on a continuous loop for a trillion years, instead of endless happy adventure.
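One way to picture this kind of fragility is to compare two toy ways of scoring a future. The sketch below is an assumption made purely for illustration: the source does not say that values combine multiplicatively, and the feature names and scores are invented. It only shows how a single missing feature can drag a total to zero under one scoring rule while barely denting it under another.

```python
import math

# Hypothetical feature scores in [0, 1] for one imagined future.
# Every feature is fully satisfied except "consciousness", which was
# specified incorrectly and so scores 0.
future = {
    "consciousness": 0.0,
    "novelty": 1.0,
    "friendship": 1.0,
    "freedom": 1.0,
    "beauty": 1.0,
}


def additive_value(scores: dict) -> float:
    """Average of the feature scores: one failure costs only its own share."""
    return sum(scores.values()) / len(scores)


def fragile_value(scores: dict) -> float:
    """Product of the feature scores: one failure wipes out everything.

    Assumption: this multiplicative rule is just a stand-in for the
    "fragile" behavior described in the text.
    """
    return math.prod(scores.values())


print(f"additive: {additive_value(future):.0%}")
print(f"fragile:  {fragile_value(future):.0%}")
```

Under the averaging rule this future still scores 80%, while under the multiplicative rule it scores 0%. The fragility claim amounts to saying that our real values behave more like the second rule for at least some features.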
Image Credit: With thanks to http://blogs.ifsworld.com