Singularity: AI Morality

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Sun Dec 06 1998 - 11:13:25 MST


Nick Bostrom wrote:
>
> > Eliezer Yudkowsky wrote:
> >
> > Not at all! If that is really and truly and objectively the moral thing to
> > do, then we can rely on the Post-Singularity Entities to be bound by the same
> > reasoning. If the reasoning is wrong, the PSEs won't be bound by it. If the
> > PSEs aren't bound by morality, we have a REAL problem
>
> Indeed. And this is another point where I seem to disagree with you.
> I am not at all certain that being superintelligent implies being
> moral. Certainly there are very intelligent humans that are also very
> wicked; I don't see why, once you pass a certain threshold of
> intelligence, it is no longer possible to be morally bad. What I
> might agree with is that once you are sufficiently intelligent you
> should be able to recognize what's good and what's bad. But
> whether you are motivated to act in accordance with these moral
> convictions is a different question. What weight you give to moral
> imperatives in planning your actions depends on how altruistic/moral
> you are. We should therefore make sure that we build strong moral
> drives into the superintelligences. (Presumably, we would also want
> to link these moral drives to a moral system that places a great
> value on human survival; because that way we would increase our own
> chances of survival.)

This, in my opinion, is exactly the wrong answer. (See particularly the
"Prime Directive of AI" in "Coding a Transhuman AI".) But think about what
you just said. First you say that sufficient intelligence should be able to
recognize good and bad. Then you say that we should build in a moral system
with a particular set of values. What if we get it wrong? What if the values
we build in conflict with what the AI recognizes as good? Would you really
want to be around the AI when that happened? I would prefer to be very far
away, somewhere like the Magellanic Clouds. It
might try to remove the source of the conflict.

Once again, we have a conflict between the self-propelled trajectory and the
convergence to truth. Even putting on my "human allegiance" hat, I think
self-propelled trajectories would be a terrible idea because I have no goddamn
idea where they would wind up. Do you really know all the logical
consequences of placing a large value on human survival? Would you care to
define "human" for me? Oops! Thanks to your overly rigid definition, you
will live for billions and trillions and googolplexes of years, prohibited
from uploading, prohibited even from ameliorating your own boredom, endlessly
screaming, until the soul burns out of your mind, after which you will
continue to scream. I would prefer a quick death to creating our own hells,
and that is what we would inevitably do.

It's not just the particular example. It's not even that we can't predict all
the consequences of a particular system. It's the trajectory. I hope and
pray that the trajectory will converge to the (known) correct answers in any
case. I really do. Because if it doesn't converge there, I don't have the
goddamndest idea where it will end up. Any errors we make in the initial
formulation will either cancel out or amplify. If they cancel out, the AI
makes the correct moral choices. If they build up, you have positive feedback
into intelligence and insanity. If you're lucky, you'll wind up in a world of
incomprehensible magic, a world of twisted, insane genies obeying every order.
If you're not lucky, you'll wind up in a hell beyond the ability of any of us
to imagine.
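
To make the "cancel out or amplify" point concrete, here's a toy sketch in
Python; everything in it is an illustrative assumption, not a model of any
real AI. Treat a flaw in the initial formulation as a single number, and
suppose each round of self-revision multiplies it by a fixed gain:

    # Toy illustration only: a residual error in the initial goal system,
    # rescaled by a fixed gain on each round of self-revision.
    def drift(initial_error, gain, rounds):
        """Size of the residual error after the given number of rounds."""
        error = initial_error
        for _ in range(rounds):
            error *= gain
        return error

    # gain < 1: the error cancels out and the trajectory converges.
    print(drift(0.1, 0.5, 30))   # roughly 9e-11
    # gain > 1: the same error amplifies; positive feedback takes over.
    print(drift(0.1, 1.5, 30))   # roughly 1.9e4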

Of course, I could always be wrong. I'd say there's a 1% chance that AI
coercions could get us into "paradise" where the alternative is extermination.
You'll have to prohibit human intelligence enhancement, however, or put the
same coercions on us. Imposing lasting coercions on the pre-existing human
goal system, even given that AI coercions work, takes another factor of 100
off the probability. If you can synchronize everyone's intelligence
enhancement perfectly, then eventually we'll probably coalesce into a
singleton indistinguishable from that resulting from an AI Transcend. And the
Amish will go kicking and screaming, so even the element of noncoercion is absent.

Look, these forces are going to a particular place, and they are way, way,
waaaaaayyy too big for any of us to divert. Think of the Singularity as this
titanic, three-billion-ton truck heading for us. We can't stop it, but I
suppose we could manage to get run over trying to slow it down.

> >, but I don't see any way
> > of finding this out short of trying it.
>
> How to control an SI? Well, I think it *might* be possible through
> programming the right values into the SIs,

We should program the AI to seek out *correct* answers, not a particular set
of answers.

> but let's not go into that
> now.

Let's. Please. Now.

> > There's a far better chance that delay makes things much, much worse.
>
> I think it will all depend on the circumstances at the time. For
> example, what the state of art of nanotechnology is then. But you
> can't say that sooner is *always* better, although it may be a good
> rule of thumb. Clearly there are cases where it's more prudent to
> take more precautions before launch. And in the case of the
> singularity, we'd seem to be well advised to take as many precautions
> as we have time for.

I think that the amount of delay-caused deterioration depends on
circumstances, but not the sign. Let's substitute "intelligence enhancement"
for "Singularity" and reconsider. Is there really any circumstance under
which it is better to be stupid than smart, with the world at stake? If
you're second-guessing the transhumans, maybe, but we know where that leads.

> > Why not leave the moral obligations to the SIs, rather than trying (futilely
> > and fatally) to impose your moral guesses on them?
>
> Because, as I said above, if we build them in the wrong way they may
> not be moral.

I hope not. But "building in the wrong way" seems to me to imply a sloppy,
inelegant set of arbitrary, unsupported, ill-defined, and probably
self-contradictory assertions, rather than a tight chain of pure logic seeking
out the correct answers.

> Plus: whether it's moral or not, we would want to make
> sure that they are kind to us humans and allow us to upload.

No, we would NOT want to make sure of that. It would be immoral. Every bit
as immoral as torturing little children to death, but with a much higher
certainty of evil.

-- 
        sentience@pobox.com         Eliezer S. Yudkowsky
         http://pobox.com/~sentience/AI_design.temp.html
          http://pobox.com/~sentience/sing_analysis.html
Disclaimer:  Unless otherwise specified, I'm not telling you
everything I think I know.

