Re: Goals (was: Transparency and IP)

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Thu Sep 14 2000 - 16:31:19 MDT


Well, I've been convinced of one thing; at some point I need to drop the
things that I dropped everything else to do, and go write a Webpage
specifically about Friendly AI.

Samantha Atkins wrote:
>
> Dan Fabulich wrote:
> >
> > Samantha Atkins wrote:
> >
> > > Your Sysop has extremely serious problems in its design. It is expected
> > > to know how to resolve the problems and issues of other sentient beings
> > > (us) without having ever experienced what it is to be us. If it is
> > > trained to model us well enough to understand and therefore to wisely
> > > resolve conflicts then it will in the process become subject potentially
> > > to some of the same troubling issues.
> >
> > Because everybody who trains dogs and learns how to deal with/predict
> > their behavior starts acting and thinking just like a dog, right?
>
> To a degree sufficient to predict the dog's behavior and
> stimulus/response patterns, yes.

I'm afraid Dan Fabulich is right about this one, Samantha. As I just recently
noted in a post, a human requires sympathy as a prerequisite for empathy.
This is because we have a great deal of built-in neural hardware which we
don't understand, and which would probably be beyond the capabilities of our
abstract thoughts to fully simulate in any case; certainly the current state
of evolutionary psychology is not developed enough that we could do
deliberate, completely abstract simulations of emotions. Thus we humans must
sympathize in order to understand.

The same is not true of a seed AI, of course. For one thing, the distinction
between "hardware" intelligence and "software" intelligence is damned thin -
if a seed AI can possess the source code for emotions, then it can mentally
simulate the source code for emotions with just about the same amount of
effort. There is an equivalence between actual implementation and abstract
understanding.

> The IGS is inadequate to answer the concern. It merely says that giving
> the AI initial non-zero-value goals is obviously necessary and TBD.

Well, it's been quite a long time - more than two years - since I wrote that
particular piece of text, but as I recall that is precisely *not* what I
said. What I was trying to point out is that a goal system, once created, has
its own logic. We very specifically do NOT need to give it any initial goals
in order to get the AI running. *Unlike* my position of two years ago, I *now*
believe that we can give it default goals to be implemented in the event that
there is no "meaning of life". The Interim goals will still materialize in one
form or another, but if the programmer knows this, the Interim goals can be
part of a coherent whole - as they are in my own goal system.

The primary thing that gives me confidence in my ability to create a Friendly
AI is the identity of my own abstract goal system with that which I wish to
build. When I thought that Interim goals were sufficient, I would have
professed no other goals myself. When I myself moved to a split-case
scenario, one for objective morality and one for subjective morality, it then
became possible for me to try to build an AI based on the same model. The
fact that I myself moved from a wholly Interim to an Interim/subjective
scenario gives me some confidence that trying to construct an
Interim/subjective scenario will not automatically collapse to a wholly
Interim scenario inside the seed AI.

> Our goals are our plans based on our values as worked toward in external
> reality. They depend on values, on what it is we seek.

This is the problem with trying to think out these questions as anything
except problems in cognitive architecture. You can't define words in terms of
other words; you have to take the problem apart into simpler parts.

A goal is a mental image such that those decisions which are deliberately made
are made so that the visualized causal projection of world-plus-decision will
result in a world-state fulfilling that mental image. That's a start.
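
To make that concrete, here is a toy sketch in Python - every name and the
projection function are placeholders of mine, not anything from the real
architecture:

    # Toy sketch: a goal as a predicate over projected world-states.
    # Every name here is a placeholder, purely for illustration.
    def choose_deliberately(candidate_decisions, world, project, goal_fulfilled):
        """Return a decision whose visualized causal projection of
        world-plus-decision yields a world-state fulfilling the goal image."""
        for decision in candidate_decisions:
            projected_state = project(world, decision)
            if goal_fulfilled(projected_state):
                return decision
        return None  # no deliberate decision fulfills the mental image

The point of the sketch is only that the goal enters the picture as a test
applied to projected world-states, not as some separate object pushing on the
system from outside.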

> I am not sure I can agree that cognitive goals are equivalent to
> cognitive propositions about goals. That leaves something out and
> becomes circular.

It leaves nothing out and is fundamental to an understanding of seed AI.
Self-modifying, self-understanding, and self-improving. Decisions are made on
the thought level. "I think that the value of goal-whatever ought to be 37"
is equivalent to having a goal with a value of 37 - if there are any goals
implemented on that low a level at all, and not just thought-level reflexes
making decisions based on beliefs about goals. My current visualization has
no low-level "goal" objects at all; in fact, I would now say that such a
crystalline system could easily be dangerous.
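
To make the contrast concrete - this is a toy illustration of mine, not the
actual design - the two levels look something like this:

    # Toy illustration only; not the real architecture.
    # A low-level "goal" object is a crystalline slot the system acts on
    # directly:
    goal_table = {"goal-whatever": 37}

    # The thought-level version is a proposition *about* the goal, held as
    # a belief that the deliberative layer can inspect, doubt, and revise
    # like any other belief before it ever influences a decision:
    beliefs = [{"proposition": "the value of goal-whatever ought to be 37",
                "confidence": 0.90}]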

> Questions of morality are not questions of fact
> unless the underlying values are all questions of fact and shown to be
> totally objective and/or trustworthy. The central value is most likely
> hard-wired or arbitrarily chosen for any value driven system.

That is one of two possibilities. (Leaving out the phrase "hard-wired", which
is an engineering impossibility.)

> Part of the very flexibility and power of human minds grows out of the
> dozens of modules each with their own agenda and point of view
> interacting. Certainly an AI can be built with as many conflicting
> basic working assumptions and logical outgrowths thereof as we could
> wish. My suspicion is that we must build a mind this way if it is going
> to do what we hope this AI can do.

The AI needs to understand this abstractly; it does not necessarily need to
incorporate such an architecture personally.

> Science does not and never has required that whenever two humans
> disagree that only one is right. They can at times both be right within
> the context of different fundamental assumptions.

I suggest reading "Interlude: The Consensus and the Veil of Maya" from CaTAI
2.2.

> Rebellion will become present as a potentiality in any intelligent
> system that has goals whose achievement it perceives as stymied by other
> intelligences that in some sense control it. It does not require human
> evolution for this to arise. It is a logical response in the face of
> conflicts with other agents. That it doesn't have the same emotional
> tonality when an AI comes up with it is irrelevant.

Precisely. If the goal is to be friendly to humans, and humans interfere with
that goal, then the interference will be removed - not the humans themselves;
that would be an instance of a subgoal stomping on a supergoal. So of course
the Sysop will invent nanotechnology and become independent of human
interference. I've been quite frank about this; I even said that the Sysop
Scenario was a logical consequence and not something that would even have to
go in the Sysop Instructions directly.
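
In toy form - placeholder names of mine, not actual Sysop Instructions - the
check amounts to this:

    # Toy sketch; every name here is a placeholder.
    def plan_is_acceptable(plan, supergoal, violates):
        """Reject any plan that serves a subgoal by stomping on the
        supergoal from which that subgoal inherits all of its value."""
        return not violates(plan, supergoal)

    # "Become independent of human interference" passes; "remove the
    # humans" fails, because the only reason to care about interference
    # at all is the supergoal of being Friendly to those same humans.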

> Creation of a new and more powerful mind and the laying of its
> foundations is a vastly challenging task. It should not be assumed that
> we have a reasonably complete handle on the problems and issues yet.

Nor do I. I'm still thinking it through. But if I had to come up with a
complete set of Sysop Instructions and do it RIGHT NOW, I could do so - I'd
just have to take some dangerous shortcuts, mostly consisting of temporary
reliance on trustworthy humans. I expect that by the time the seed AI is
finished, we'll know enough to embody the necessary trustworthiness in the AI.

-- -- -- -- --
Eliezer S. Yudkowsky http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence


