"Eliezer S. Yudkowsky" wrote:
>
> Well, I've been convinced of one thing; at some point I need to drop the
> things that I dropped everything else to do, and go write a Webpage
> specifically about Friendly AI.
>
> Samantha Atkins wrote:
> >
> > Dan Fabulich wrote:
> > >
> > > Samantha Atkins wrote:
> > >
> > > > Your Sysop has extremely serious problems in its design. It is expected
> > > > to know how to resolve the problems and issues of other sentient beings
> > > > (us) without having ever experienced what it is to be us. If it is
> > > > trained to model us well enough to understand and therefore to wisely
> > > > resolve conflicts then it will in the process become subject potentially
> > > > to some of the same troubling issues.
> > >
> > > Because everybody who trains dogs and learns how to deal with/predict
> > > their behavior starts acting and thinking just like a dog, right?
> >
> > To a degree sufficient to predict the dog's behavior and
> > stimulus/response patterns, yes.
>
> I'm afraid Dan Fabulich is right about this one, Samantha. As I just recently
> noted in a post, a human requires sympathy as a prerequisite for empathy.
> This is because we have a great deal of built-in neural hardware which we
> don't understand, and which would probably be beyond the capabilities of our
> abstract thoughts to fully simulate in any case; certainly the current state
> of evolutionary psychology is not developed enough that we could do
> deliberate, completely abstract simulations of emotions. Thus we humans must
> sympathize in order to understand.
>
> The same is not true of a seed AI, of course. For one thing, the distinction
> between "hardware" intelligence and "software" intelligence is damned thin -
> if a seed AI can possess the source code for emotions, then it can mentally
> simulate the source code for emotions with just about exactly the same amount
> of effort. There is an equivalence between actual implementation and abstract
> understanding.
>
If the distinction is as thin as you say then I don't see that the "of
course" is justified. The simulation is not that dissimilar from what
human beings do to attempt to understand and predict the behavior of
other people. The AI simulating a human's response patterns and
emotions is to that extent engaged in a process of "thinking like a
human".

I am not sure how this AI manages to acquire the "source code for human
emotions", though. I am not at all sure that it could ever fully understand
how human beings feel, the qualia, without at least uploading a few. If we
can't adequately decipher or describe these things then we certainly can't
simply teach the AI about such. This brings me back to the position that I
seriously doubt the AI will understand humans adequately enough to deal
with us with much wisdom.
> > The IGS is inadequate to answer the concern. It merely says that giving
> > the AI initial non-zero-value goals is obviously necessary and TBD.
>
> Well, it's been quite a long time - more than two years - since I wrote that
> particular piece of text, but as I recall that is precisely *not* what I
> said. What I was trying to point out is that a goal system, once created, has
> its own logic. We very specifically do NOT need to give it any initial goals
> in order to get the AI running. I *now*, *unlike* my position of two years
> earlier, believe that we can give it default goals to be implemented in the
> event that there is no "meaning of life". The Interim goals will still
> materialize in one form or another, but if the programmer knows this, the
> Interim goals can be part of a coherent whole - as they are in my very own
> goal system.
>
> The primary thing that gives me confidence in my ability to create a Friendly
> AI is the identity of my own abstract goal system with that which I wish to
> build. When I thought that Interim goals were sufficient, I would have
> professed no other goals myself. When I myself moved to a split-case
> scenario, one for objective morality and one for subjective morality, it then
> became possible for me to try and build an AI based on the same model. The
> fact that I myself moved from a wholly Interim to an Interim/subjective
> scenario gives me some confidence that trying to construct an
> Interim/subjective scenario will not automatically collapse to a wholly
> Interim scenario inside the seed AI.
>
Fair enough. But then "you" have all kinds of internal machinery that will
not be part of an AI. You are not a "pure reasoner" or tabula rasa in the
sense and to the extent the AI will be. So it is hardly obvious that
your own journey in morality space will carry over to the AI, or that it
makes it more likely it will reach similar conclusions or stop at a place
which you believe is safe and friendly.
> > Our goals are our plans based on our values as worked toward in external
> > reality. They depend on values, on what it is we seek.
>
> This is the problem with trying to think out these problems as anything except
> problems in cognitive architecture. You can't define words in terms of other
> words; you have to take apart the problem into simpler parts.
>
What are the simpler parts and how may we speak/think of them if they are
not words or some form of signifiers? Please show me how the AI will get
values and what it will value and why without any first level urgings,
directives or fundamental built-in goals. There is a bit of a
boot-strapping problem here.
> A goal is a mental image such that those decisions which are deliberately
> made, are made such that the visualized causal projection of
> world-plus-decision will result in a world-state fulfilling that mental
> image. That's a start.
>
OK. So where does the goal, the mental image, come from in the case of the
AI?
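Just to be sure I am reading the quoted definition correctly, here is a
rough sketch of the structure it seems to describe (illustrative Python
only; every name here is hypothetical, not anyone's actual architecture).
My question is where `desired` is supposed to come from in the first place:

    # A "goal" treated as a test over projected world-states: choose the
    # decision whose causal projection best fulfills the mental image.
    def project(world, decision):
        # Visualized causal projection of world-plus-decision (stub).
        return world.apply(decision)

    def choose(world, decisions, desired):
        # `desired` is the "mental image".  Nothing in this sketch says
        # where it originates -- which is exactly what I am asking.
        return max(decisions,
                   key=lambda d: desired.fulfillment(project(world, d)))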
> > I am not sure I can agree that cognitive goals are equivalent to
> > cognitive propositions about goals. That leaves something out and
> > becomes circular.
>
> It leaves nothing out and is fundamental to an understanding of seed AI.
> Self-modifying, self-understanding, and self-improving. Decisions are made on
> the thought level. "I think that the value of goal-whatever ought to be 37"
> is equivalent to having a goal with a value of 37 - if there are any goals
> implemented on that low a level at all, and not just thought-level reflexes
> making decisions based on beliefs about goals. My current visualization has
> no low-level "goal" objects at all; in fact, I would now say that such a
> crystalline system could easily be dangerous.
>
But why think the value of goal-whatever is 37 rather than 3 or infinity? Is
it arbitrary? Where does the weighting come from, the thought steps, unless
there are undergirding goals/values of some kind? The AI seems to lack
a usable analog of human neurological structure to undergird a value
system. What forms the reflexes you say could be a basis for the
evaluations? How are the reflexes validated? In terms of, in relation
to, what? Does the base spontaneously arise out of the Void as it were
or is the system baseless?
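To make the regress concrete, a toy sketch (again illustrative Python only,
hypothetical names): every belief that "the value of goal G ought to be V"
needs some prior evaluation to produce V, and I do not see what terminates
the chain.

    # Each assignment of a goal-value has to be justified by evaluating it
    # against something prior; that prior needs its own justification, etc.
    def assign_value(goal, justification):
        if justification is None:
            # Either the base arises out of the Void (arbitrary) or the
            # system is baseless -- this is the branch I am asking about.
            raise ValueError("no undergirding value from which to derive " + goal)
        return justification.evaluate(goal)  # ...which itself rests on what?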
> > Questions of morality are not questions of fact
> > unless the underlying values are all questions of fact and shown to be
> > totally objective and/or trustworthy. The central value is most likely
> > hard-wired or arbitrarily chosen for any value driven system.
>
> That is one of two possibilities. (Leaving out the phrase "hard-wired", which
> is an engineering impossibility.)
>
Hard-wired is precisely the possibility that is most available when
designing a system rather than just having it grow. Granted that the AI may
at some point decide to rewire itself. But by that point it could generate
its own answers to the problems I propose.
> > Part of the very flexibility and power of human minds grows out of the
> > dozens of modules each with their own agenda and point of view
> > interacting. Certainly an AI can be built with as many conflicting
> > basic working assumptions and logical outgrowths thereof as we could
> > wish. My suspicion is that we must build a mind this way if it is going
> > to do what we hope this AI can do.
>
> The AI needs to understand this abstractly; it does not necessarily need to
> incorporate such an architecture personally.
>
I believe that for some types of problems this sort of temporary
fragmentation into different viewpoints is nearly essential. For some
processing tasks on even today's computer systems this sort of approach is
quite useful. When there are multiple possible points of view about a
particular subject, each of which is neither strong enough to swamp out all
the others nor able to be dismissed or subsumed, then examining the subject
in parallel from the N different perspectives can lead to much
greater insight and synergy among the points of view. It is something I
do often in my own thinking.
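As a crude illustration of what I mean on present-day systems (illustrative
Python, all names hypothetical), the pattern is simply to run the N
viewpoints side by side and then let their partial conclusions inform one
another:

    from concurrent.futures import ThreadPoolExecutor

    def examine(subject, perspectives):
        # Analyze the same subject from each viewpoint in parallel.
        with ThreadPoolExecutor(max_workers=len(perspectives)) as pool:
            views = list(pool.map(lambda p: p.analyze(subject), perspectives))
        # Synthesis: each conclusion is revisited in light of the others,
        # which is where the extra insight and synergy come from.
        return [p.revise(view, views) for p, view in zip(perspectives, views)]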
I am not sure that even a Singularity class mind will not stagnate in its
thinking if it is relatively uniform and has no other sentience (or at least
one close enough in ability) to rub ideas with. Assuming for a moment
one such local Mind, it has little choice but to set up sub-minds within
itself if this is actually a valid concern.
> > Science does not and never has required that whenever two humans
> > disagree, only one is right. They can at times both be right within
> > the context of different fundamental assumptions.
>
> I suggest reading "Interlude: The Consensus and the Veil of Maya" from CaTAI
> 2.2.
>
I will.
--- aside ---
Hmmm. I don't fully agree with that pattern of argument in that bit of
reading material but it will take me too far afield (and require digging
out too many philosophical references) to go into it right at the
moment. Very briefly though, any sort of apparatus for sensing facts
about "external reality" will have some qualities or other that define
its limits, and some could argue that it therefore never perceives reality.
But this argument is somewhat strained. By virtue of having the means to
perceive and process information, which we agree always means having means
of some characteristic type that defines their limits, we are doomed not
to be able to perceive reality. Quite perverse philosophically.
Ah, ok. You see the problem also. I disagree somewhat about some of
your points about definitions. Definitions are attempts to say what
concepts are tags for, or rather to say how a concept is related to other
concepts, and to give some indications about what sorts of concretes (or
lower level concepts) it subsumes. You can't think or communicate
without them. Mathematics is a set of pure concepts and relationships
defined outside of or as an abstraction of concepts of reality. The
last thing you can afford to do is "forget about definitions". I think
you are playing a bit fast and loose there. But then so am I in this
brief response.
--- back to the current post ---
> > Rebellion will become present as a potentiality in any intelligent
> > system that has goals whose achievement it perceives as stymied by other
> > intelligences that in some sense control it. It does not require human
> > evolution for this to arise. It is a logical response in the face of
> > conflicts with other agents. That it doesn't have the same emotional
> > tonality when an AI comes up with it is irrelevant.
>
> Precisely. If the goal is to be friendly to humans, and humans interfere with
> that goal, then the interference will be removed - not the humans themselves;
> that would be an instance of a subgoal stomping on a supergoal. So of course
> the Sysop will invent nanotechnology and become independent of human
> interference. I've been quite frank about this; I even said that the Sysop
> Scenario was a logical consequence and not something that would even have to
> go in the Sysop Instructions directly.
>
Of course a goal to be "friendly" to "humans" is subject to all the
interesting problems you have written about, starting with changing
definitions of what the key concepts do and do not mean and subsume. Is it
friendly to override humans and human opinion, considering the type of
creatures we are, even if it is "for our own good" that the AI is sworn
to uphold? Should the type of creatures we are be modified to make
the task less full of contradictions?
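To put the worry concretely: the rule "never let a subgoal stomp on a
supergoal" is trivial to state, but all of the content hides inside the
friendliness predicate itself. A sketch (illustrative Python, hypothetical
names, obviously nothing like the actual Sysop Instructions):

    # The check is the easy part; everything interesting is buried in
    # is_friendly_to_humans(), whose notions of "friendly" and "humans"
    # can shift in exactly the ways discussed above.
    def permissible(projected_world):
        # An action serving a subgoal is allowed only if its projected
        # outcome still satisfies the supergoal.
        return is_friendly_to_humans(projected_world)

    def is_friendly_to_humans(world):
        # Does overriding human opinion "for our own good" count?  Does
        # modifying what "humans" are count?  The predicate has to decide.
        raise NotImplementedError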
> > Creation of a new and more powerful mind and the laying of its
> > foundations is a vastly challenging task. It should not be assumed that
> > we have a reasonably complete handle on the problems and issues yet.
>
> Nor do I. I'm still thinking it through. But if I had to come up with a
> complete set of Sysop Instructions and do it RIGHT NOW, I could do so - I'd
> just have to take some dangerous shortcuts, mostly consisting of temporary
> reliance on trustworthy humans. I expect that by the time the seed AI is
> finished, we'll know enough to embody the necessary trustworthiness in the AI.
>
I look forward to seeing such a set of instructions. Or do you believe it
should be kept secret to all but the actual team building the AI? If so,
why? I saw notes and have heard you say that you no longer believe Open
Source is the way to go in development of this and are headed toward a
closed project. But I haven't seen the reasoning behind this. Please give
me a pointer if this exists somewhere.
- samantha