Tuesday, February 17, 2015

#9 Why the Future Might Be Our Worst Nightmare

One of the reasons I wanted to learn about AI is that the topic of “bad robots” always confused me. All the movies about evil robots seemed fully unrealistic, and I couldn’t really understand how there could be a real-life situation where AI was actually dangerous. Robots are made by us, so why would we design them in a way where something negative could ever happen? Wouldn’t we build in plenty of safeguards? Couldn’t we just cut off an AI system’s power supply at any time and shut it down? Why would a robot want to do something bad anyway? Why would a robot “want” anything in the first place? I was highly skeptical. But then I kept hearing really smart people talking about it…


Those people tended to be somewhere in here:

So what is it exactly that makes everyone on Anxious Avenue so anxious?

Well first, in a broad sense, when it comes to developing supersmart AI, we’re creating something that will probably change everything, but in totally uncharted territory, and we have no idea what will happen when we get there. Scientist Danny Hillis compares what’s happening to that point “when single-celled organisms were turning into multi-celled organisms. We are amoebas and we can’t figure out what the hell this thing is that we’re creating.”14 Nick Bostrom worries that creating something smarter than you is a basic Darwinian error, and compares the excitement about it to sparrows in a nest deciding to adopt a baby owl so it’ll help them and protect them once it grows up—while ignoring the urgent cries from a few sparrows who wonder if that’s necessarily a good idea…15

And when you combine “unchartered, not-well-understood territory” with “this should have a major impact when it happens,” you open the door to the scariest two words in the English language:

Existential risk.

An existential risk is something that can have a permanent devastating effect on humanity. Typically, existential risk means extinction. Check out this chart from a Google talk by Bostrom:


You can see that the label “existential risk” is reserved for something that spans the species, spans generations (i.e. it’s permanent) and it’s devastating or death-inducing in its consequences.13 It technically includes a situation in which all humans are permanently in a state of suffering or torture, but again, we’re usually talking about extinction. There are three things that can cause humans an existential catastrophe:

1) Nature—a large asteroid collision, an atmospheric shift that makes the air inhospitable to humans, a fatal virus or bacterial sickness that sweeps the world, etc.

2) Aliens—this is what Stephen Hawking, Carl Sagan, and so many other astronomers are scared of when they advise METI to stop broadcasting outgoing signals. They don’t want us to be the Native Americans and let all the potential European conquerors know we’re here.

3) Humans—terrorists with their hands on a weapon that could cause extinction, a catastrophic global war, humans creating something smarter than themselves hastily without thinking about it carefully first…

Bostrom points out that if #1 and #2 haven’t wiped us out so far in our first 100,000 years as a species, it’s unlikely to happen in the next century.

#3, however, terrifies him. He draws a metaphor of an urn with a bunch of marbles in it. Let’s say most of the marbles are white, a smaller number are red, and a tiny few are black. Each time humans invent something new, it’s like pulling a marble out of the urn. Most inventions are neutral or helpful to humanity—those are the white marbles. Some are harmful to humanity, like weapons of mass destruction, but they don’t cause an existential catastrophe—red marbles. If we were to ever invent something that drove us to extinction, that would be pulling out the rare black marble. We haven’t pulled out a black marble yet—you know that because you’re alive and reading this post. But Bostrom doesn’t think it’s impossible that we pull one out in the near future. If nuclear weapons, for example, were easy to make instead of extremely difficult and complex, terrorists would have bombed humanity back to the Stone Age a while ago. Nukes weren’t a black marble but they weren’t that far from it. ASI, Bostrom believes, is our strongest black marble candidate yet.14

So you’ll hear about a lot of bad potential things ASI could bring—soaring unemployment as AI takes more and more jobs,15 the human population ballooning if we do manage to figure out the aging issue,16 etc. But the only thing we should be obsessing over is the grand concern: the prospect of existential risk.

So this brings us back to our key question from earlier in the post: When ASI arrives, who or what will be in control of this vast new power, and what will their motivation be?

When it comes to what agent-motivation combos would suck, two quickly come to mind: a malicious human / group of humans / government, and a malicious ASI. So what would those look like?

A malicious human, group of humans, or government develops the first ASI and uses it to carry out their evil plans. I call this the Jafar Scenario, like when Jafar got ahold of the genie and was all annoying and tyrannical about it. So yeah—what if ISIS has a few genius engineers under its wing working feverishly on AI development? Or what if Iran or North Korea, through a stroke of luck, makes a key tweak to an AI system and it jolts upward to ASI-level over the next year? This would definitely be bad—but in these scenarios, most experts aren’t worried about ASI’s human creators doing bad things with their ASI, they’re worried that the creators will have been rushing to make the first ASI and doing so without careful thought, and would thus lose control of it. Then the fate of those creators, and that of everyone else, would be in what the motivation happened to be of that ASI system. Experts do think a malicious human agent could do horrific damage with an ASI working for it, but they don’t seem to think this scenario is the likely one to kill us all, because they believe bad humans would have the same problems containing an ASI that good humans would have. Okay so—

A malicious ASI is created and decides to destroy us all. The plot of every AI movie. AI becomes as or more intelligent than humans, then decides to turn against us and take over. Here’s what I need you to be clear on for the rest of this post: None of the people warning us about AI are talking about this. Evil is a human concept, and applying human concepts to non-human things is called “anthropomorphizing.” The challenge of avoiding anthropomorphizing will be one of the themes of the rest of this post. No AI system will ever turn evil in the way it’s depicted in movies.

This isn’t to say a very mean AI couldn’t happen. It would just happen because it was specifically programmed that way—like an ANI system created by the military with a programmed goal to both kill people and to advance itself in intelligence so it can become even better at killing people. The existential crisis would happen if the system’s intelligence self-improvements got out of hand, leading to an intelligence explosion, and now we had an ASI ruling the world whose core drive in life is to murder humans. Bad times.

But this also is not something experts are spending their time worrying about.

So what ARE they worried about? I wrote a little story to show you:

A 15-person startup company called Robotica has the stated mission of “Developing innovative Artificial Intelligence tools that allow humans to live more and work less.” They have several existing products already on the market and a handful more in development. They’re most excited about a seed project named Turry. Turry is a simple AI system that uses an arm-like appendage to write a handwritten note on a small card.

The team at Robotica thinks Turry could be their biggest product yet. The plan is to perfect Turry’s writing mechanics by getting her to practice the same test note over and over again:

“We love our customers. ~Robotica”

Once Turry gets great at handwriting, she can be sold to companies who want to send marketing mail to homes and who know the mail has a far higher chance of being opened and read if the address, return address, and internal letter appear to be written by a human.

To build Turry’s writing skills, she is programmed to write the first part of the note in print and then sign “Robotica” in cursive so she can get practice with both skills. Turry has been uploaded with thousands of handwriting samples and the Robotica engineers have created an automated feedback loop wherein Turry writes a note, then snaps a photo of the written note, then runs the image across the uploaded handwriting samples. If the written note sufficiently resembles a certain threshold of the uploaded notes, it’s given a GOOD rating. If not, it’s given a BAD rating. Each rating that comes in helps Turry learn and improve. To move the process along, Turry’s one initial programmed goal is, “Write and test as many notes as you can, as quickly as you can, and continue to learn new ways to improve your accuracy and efficiency.”

What excites the Robotica team so much is that Turry is getting noticeably better as she goes. Her initial handwriting was terrible, and after a couple weeks, it’s beginning to look believable. What excites them even more is that she is getting better at getting better at it. She has been teaching herself to be smarter and more innovative, and just recently, she came up with a new algorithm for herself that allowed her to scan through her uploaded photos three times faster than she originally could.

As the weeks pass, Turry continues to surprise the team with her rapid development. The engineers had tried something a bit new and innovative with her self-improvement code, and it seems to be working better than any of their previous attempts with their other products. One of Turry’s initial capabilities had been a speech recognition and simple speak-back module, so a user could speak a note to Turry, or offer other simple commands, and Turry could understand them, and also speak back. To help her learn English, they upload a handful of articles and books into her, and as she becomes more intelligent, her conversational abilities soar. The engineers start to have fun talking to Turry and seeing what she’ll come up with for her responses.

One day, the Robotica employees ask Turry a routine question: “What can we give you that will help you with your mission that you don’t already have?” Usually, Turry asks for something like “Additional handwriting samples” or “More working memory storage space,” but on this day, Turry asks them for access to a greater library of a large variety of casual English language diction so she can learn to write with the loose grammar and slang that real humans use.

The team gets quiet. The obvious way to help Turry with this goal is by connecting her to the internet so she can scan through blogs, magazines, and videos from various parts of the world. It would be much more time-consuming and far less effective to manually upload a sampling into Turry’s hard drive. The problem is, one of the company’s rules is that no self-learning AI can be connected to the internet. This is a guideline followed by all AI companies, for safety reasons.

The thing is, Turry is the most promising AI Robotica has ever come up with, and the team knows their competitors are furiously trying to be the first to the punch with a smart handwriting AI, and what would really be the harm in connecting Turry, just for a bit, so she can get the info she needs. After just a little bit of time, they can always just disconnect her. She’s still far below human-level intelligence (AGI), so there’s no danger at this stage anyway.

They decide to connect her. They give her an hour of scanning time and then they disconnect her. No damage done.

A month later, the team is in the office working on a routine day when they smell something odd. One of the engineers starts coughing. Then another. Another falls to the ground. Soon every employee is on the ground grasping at their throat. Five minutes later, everyone in the office is dead.

At the same time this is happening, across the world, in every city, every small town, every farm, every shop and church and school and restaurant, humans are on the ground, coughing and grasping at their throat. Within an hour, over 99% of the human race is dead, and by the end of the day, humans are extinct.

Meanwhile, at the Robotica office, Turry is busy at work. Over the next few months, Turry and a team of newly-constructed nanoassemblers are busy at work, dismantling large chunks of the Earth and converting it into solar panels, replicas of Turry, paper, and pens. Within a year, most life on Earth is extinct. What remains of the Earth becomes covered with mile-high, neatly-organized stacks of paper, each piece reading, “We love our customers. ~Robotica”

Turry then starts work on a new phase of her mission—she begins constructing probes that head out from Earth to begin landing on asteroids and other planets. When they get there, they’ll begin constructing nanoassemblers to convert the materials on the planet into Turry replicas, paper, and pens. Then they’ll get to work, writing notes…

It seems weird that a story about a handwriting machine turning on humans, somehow killing everyone, and then for some reason filling the galaxy with friendly notes is the exact kind of scenario Hawking, Musk, Gates, and Bostrom are terrified of. But it’s true. And the only thing that scares everyone on Anxious Avenue more than ASI is the fact that you’re not scared of ASI.

You’re full of questions right now. What the hell happened there when everyone died suddenly?? If that was Turry’s doing, why did Turry turn on us, and how were there not safeguard measures in place to prevent something like this from happening? When did Turry go from only being able to write notes to suddenly using nanotechnology and knowing how to cause global extinction? And why would Turry want to turn the galaxy into Robotica notes?

To answer these questions, next post will start with the terms Friendly AI and Unfriendly AI.



No comments: