Billy Brown wrote:
> Nick Bostrom wrote:
> > Paradoxically, I think that when we move up to the level of an SI
> > this problem gets easier again, since we can formulate its values in
> > a human language.
>
> and
>
> > That kind of unintended consequences can be easily avoided, it seems,
> > if we explicitly give the SI the desire to interpret all its values
> > the way its human creators intended them.
>
> I was actually sticking to human-level and moderately superhuman entities (I
> try to avoid making predictions about what an SI would do). In that realm,
> the problem we face is that specifying a set of moral principles does not
> uniquely determine what an entity will actually decide to do.
I agree that this can be a greater problem when we are talking about ~human-level AIs. For such entities, however, standard safety measures should be adequate (confinement, or having a group of people closely monitor their actions). The potential danger would arise only with seriously superhuman malicious intellects.
> > Question: What are the criteria whereby the SI determines whether its
> > fundamental values "need" to be changed?
>
> The same criteria you or I would use.
But I don't think we deliberately change our fundamental values. We may change non-fundamental values, and the criteria for doing so are then our more fundamental values. Fundamental values can change too, but they are not deliberately (rationally) changed (except in the mind-scan situation I mentioned in an earlier message).
> Anyone who thinks deeply about these
> issues will discover that their fundamental values contain all sorts of
> conflicts, ambiguities and limitations. Anyone who lives in a changing
> world will find the need to apply their values to situations no one has ever
> thought of before.
You could certainly decide to change the way you "apply" the values to a specific situation, i.e. you may change your mind about what is the most effective way of serving your fundamental values.
> > > Worse, what happens when it decides to improve its own goal system?
> >
> > Improve according to what standard?
>
> No piece of software is perfect. A self-enhancing entity is going to find
> ways of re-writing the system to make it faster, more flexible, and better
> able to deal with difficult problems. Eventually it will find
> decision-making methods that work better than ours, but use completely alien
> mechanisms. Consider, for example, a mind built using an optimized
> combination of conventional AI, neural networks, populations of genetic
> algorithms and quantum computers. Would we really expect such an entity to
> be anything like us?
I don't count any of that as a change in fundamental values.
> There is one additional point I'd like to make about all of this, because it
> is easy to overlook. All of the mechanisms I've brought up result in the AI
> adopting viewpoints that we ourselves might agree with, if we had the same
> intelligence and experience. The posthumans won't just wake up and decide
> to be psychotic one morning. If they eventually decide to adopt a morality
> we don't like, it will be because our own values naturally lead to that
> position.
>
> Now, isn't that as much as we have any right to expect?
That depends. If selection pressures lead to the evolution of AIs with selfish values that are indifferent to human welfare, and the AIs as a result go about annihilating the human species and appropriating our resources, then I would say emphatically NO, we have a right to expect more.
Nick Bostrom
http://www.hedweb.com/nickb n.bostrom@lse.ac.uk
Department of Philosophy, Logic and Scientific Method
London School of Economics