Nick Bostrom wrote:
> Paradoxically, I think that when we move up to the level of an SI
> this problem gets easier again, since we can formulate its values in
> a human language..
and
> That kind of unintended consequences can be easily avoided, it seems,
> if we explicitly give the SI the desire to interpret all its values
> the way its human creators intended them..
I was actually sticking to human-level and moderately superhuman entities (I try to avoid making predictions about what an SI would do). In that realm, the problem we face is that specifying a set of moral principles does not uniquely determine what an entity will actually decide to do.
We can see many examples of this by observing each other. Even people who subscribe to the same philosophy often have bitter arguments about the practical consequences of their principles. This problem becomes worse with AIs that are smarter than we are, because they will see implications of our principles that we haven't thought of yet.
> Question: What are the criteria whereby the SI determines whether its
> fundamental values "need" to be changed?
The same criteria you or I would use. Anyone who thinks deeply about these issues will discover that their fundamental values contain all sorts of conflicts, ambiguities and limitations. Anyone who lives in a changing world will need to apply their values to situations no one has ever thought of before.
Now, I wouldn't abandon the whole foundation of my moral system overnight, and I don't expect the AIs to do it either. But I would expect my ideas to change over time, and given enough time I hesitate to predict what the end result might be. I would expect the AIs to gradually develop their moral systems in response to new ideas and new situations, and I would expect them to occasionally make adjustments to even their most fundamental values.
With human-equivalent AIs, this process could take thousands of years to produce major changes. With posthuman AIs, who live hundreds or thousands of times faster than we do, the evolution would be much faster.
> > Worse, what happens when it decides to improve its own goal system?
>
> Improve according to what standard?
No piece of software is perfect. A self-enhancing entity is going to find ways of rewriting the system to make it faster, more flexible, and better able to deal with difficult problems. Eventually it will find decision-making methods that work better than ours, but that use completely alien mechanisms. Consider, for example, a mind built using an optimized combination of conventional AI, neural networks, populations of genetic algorithms and quantum computers. Would we really expect such an entity to be anything like us?
Billy Brown, MCSE+I
bbrown@conemsco.com