RE: Posthuman mind control (was RE: FAQ Additions)

Billy Brown (bbrown@conemsco.com)
Fri, 26 Feb 1999 08:08:28 -0600

Nick Bostrom wrote:
> Paradoxically, I think that when we move up to the level of an SI
> this problem gets easier again, since we can formulate its values in
> a human language..

and

> That kind of unintended consequences can be easily avoided, it seems,
> if we explicitly give the SI the desire to interpret all its values
> the way its human creators intended them..

I was actually sticking to human-level and moderately superhuman entities (I try to avoid making predictions about what an SI would do). In that realm, the problem we face is that specifying a set of moral principles does not uniquely determine what an entity will actually decide to do.

We can see many examples of this by observing each other. Even people who subscribe to the same philosophy often have bitter arguments about the practical consequences of their principles. This problem becomes worse with AIs that are smarter than we are, because they will see implications of our principles that we haven't thought of yet.

> Question: What are the criteria whereby the SI determines whether its
> fundamental values "need" to be changed?

The same criteria you or I would use. Anyone who thinks deeply about these issues will discover that their fundamental values contain all sorts of conflicts, ambiguities and limitations. Anyone who lives in a changing world will find the need to apply their values to situations no one has ever thought of before.

Now, I wouldn't abandon the whole foundation of my moral system overnight, and I don't expect the AIs to do it either. But I would expect my ideas to change over time, and with enough time I hesitate to predict what the end result might be. I would expect the AIs to gradually develop their moral systems in response to new ideas and new situations, and I would expect that they will occasionally make adjustments to even their most fundamental values.

With human-equivalent AIs, this process could take thousands of years to produce major changes. With posthuman AIs, who live hundreds or thousands of times faster than we do, the evolution would be much faster.

> > Worse, what happens when it decides to improve its own goal system?
>
> Improve according to what standard?

No piece of software is perfect. A self-enhancing entity is going to find ways of re-writing the system to make it faster, more flexible, and better able to deal with difficult problems. Eventually it will find decision-making methods that work better than ours but use completely alien mechanisms. Consider, for example, a mind built using an optimized combination of conventional AI, neural networks, populations of genetic algorithms and quantum computers. Would we really expect such an entity to be anything like us?

There is one additional point I'd like to make about all of this, because it is easy to overlook. All of the mechanisms I've brought up result in the AI adopting viewpoints that we ourselves might agree with, if we had the same intelligence and experience. The posthumans won't just wake up and decide to be psychotic one morning. If they eventually decide to adopt a morality we don't like, it will be because our own values naturally lead to that position.

Now, isn't that as much as we have any right to expect?

Billy Brown, MCSE+I
bbrown@conemsco.com