RE: Posthuman mind control (was RE: FAQ Additions)

From: Nick Bostrom (bostrom@ndirect.co.uk)
Date: Thu Feb 25 1999 - 18:15:54 MST


Billy Brown wrote:

> Here we have the root of our disagreement. The problem rests on an
> implementation issue that people tend to gloss over: how exactly do you
> ensure that the AI doesn't violate its moral directives?

So we have now narrowed it down to an implementation issue. Good.

> For automatons this is pretty straightforward. The AI is incapable of doing
> anything except blindly following whatever orders it is given. Any
> safeguards will be simple things, like "stop moving if the forward bumper
> detects an impact". They won't be perfect, of course, but that is only
> because we can never anticipate every possible situation that might come up.
>
> For more complex, semi-intelligent devices the issue is harder.

Agreed.
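
To make the automaton case concrete: the kind of safeguard Billy
mentions is little more than a hard-wired reflex in the control loop.
A toy sketch in Python, where every class and sensor name is invented
purely for illustration:

    # Toy sketch of a reflex-level safeguard for a simple automaton.
    # Purely illustrative; the class and sensor names are invented.

    class SimpleRobot:
        def __init__(self, bumper_sensor):
            # bumper_sensor is a callable returning True on impact
            self.bumper_sensor = bumper_sensor
            self.moving = False

        def execute(self, command):
            # Safeguard: stop moving if the forward bumper detects an impact.
            if self.bumper_sensor():
                self.moving = False
                return
            # Otherwise blindly follow whatever order was given.
            if command == "forward":
                self.moving = True
            elif command == "stop":
                self.moving = False

The safeguard is just another branch in the program; it cannot be
"interpreted", only triggered or missed, which is why its only failure
mode is a situation the designers did not anticipate.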

Paradoxically, I think that when we move up to the level of an SI
this problem gets easier again, since we can formulate its values in
a human language.

> For instance, the initial directive "harming humans is wrong" can easily
> form a foundation for "harming sentient life is wrong", leading to "harming
> living things is wrong" and then to "killing cows is morally equivalent to
> killing humans". Since "it is permissible to use force to prevent murder"
> is likely to be part of the same programming, we could easily get an AI that
> is willing to blow up McDonald's in order to save the cows!

That kind of unintended consequence can easily be avoided, it seems,
if we explicitly give the SI the desire to interpret all its values
the way its human creators intended them.
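
Schematically, the idea is that the authoritative reading of each
value is the creators' intended meaning rather than its literal
wording, so interpretation becomes an explicit step in the goal
system. A toy sketch, with every name invented for illustration and
the hard part (modelling the creators' intent) reduced to a stub:

    # Toy sketch of "interpret values as the creators intended".
    # All names are hypothetical.

    class CreatorModel:
        def interpret(self, wording):
            # Stub: a real SI would reason about the creators' intent;
            # here the wording is simply returned unchanged.
            return wording

    class Value:
        def __init__(self, wording, creator_model):
            self.wording = wording
            self.creator_model = creator_model

        def intended_meaning(self):
            # The SI's best reconstruction of what the creators meant,
            # not a literal reading of the text.
            return self.creator_model.interpret(self.wording)

    def permitted(action, values, violates):
        # Screen the action against the *intended* reading of every
        # value, which is what blocks the drift from "harming humans
        # is wrong" to "blow up McDonald's to save the cows".
        return not any(violates(action, v.intended_meaning())
                       for v in values)

The sketch is not a design; the point is only that getting the
interpretation step right is itself part of what the SI wants.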

> Once you start talking about self-modifying AI with posthuman intelligence,
> the problem gets even worse. Now we're trying to second-guess something
> that is smarter than we are. If the intelligence difference is large, we
> should expect it to interpret our principles in ways we never dreamed of.
> It will construct a philosophy far more elaborate than ours, with better
> grounding in reality. Sooner or later, we should expect that it will decide
> some of its fundamental values need to be amended in some way - after all, I
> doubt that our initial formulation of these values will be perfect.

Question: What are the criteria whereby the SI determines whether its
fundamental values "need" to be changed?

> But if
> it can do that, then its values can eventually mutate into something with no
> resemblance to their original state.
>
> Worse, what happens when it decides to improve its own goal system?

Improve according to what standard?

Nick Bostrom
http://www.hedweb.com/nickb n.bostrom@lse.ac.uk
Department of Philosophy, Logic and Scientific Method
London School of Economics
