Re: Otter vs. Yudkowsky

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Sat Mar 18 2000 - 19:41:49 MST


"D.den Otter" wrote:
>
> Yes, that's the *practical* side of the dispute. There's also the
> philosophical issue of whether personal survival is more important
> than the creation of superintelligent successors, "egoism" vs
> "altruism" etc., of course. This inevitably adds an element of
> bias to the above debate.

I have no trouble seeing your point of view. I am not attempting to
persuade you to relinquish your selfishness; I am attempting to persuade
you that the correct action is invariant under selfishness, altruism,
and Externalism.

> > > Ok, you say something like 30% vs 0.1%, but how exactly
> > > did you get these figures? Is there a particular passage in
> > > _Coding..._ or _Plan to.._ that deals with this issue?
> >
> > 30% I pulled out of thin air. 0.1% is a consequence of symmetry and the
> > fact that only one unique chance for success exists.
>
> Out of thin air. Ok, so it could be really anything, including
> some value <0.1%.

No, that estimate is definitely incorrect. Using a value of less than
10% or more than 70% would be unjustifiable. 30% was "pulled out of the
air"; I'll happily defend the range itself.

More than 70% would be unjustifiable due to the Fermi Paradox and unknowability.

Since we win if we can create a Sysop with specifiable stable goals,
asserting that the probability is less than 10% would require
demonstrating either that the probability of (A) External goals (and
hostile ones, at that), or the probability of (B) stable arbitrary
goals being producible, is by itself above 90%, or that their product
is above 90%; which requires a degree of definite knowledge about these
issues that nobody possesses. Even if it were possible to rationally
estimate the resulting "specifiable stable goals" probability as being
below 10%, which I do not think is the case, it would still be absurd
to argue it down to 1%. To claim a 99% probability of "no specifiable
goals" is to claim definite knowledge, which neither of us has.
 
> The longer you exist, the more opportunities there will be for
> something to go wrong. That's pretty much a mathematical
> certainty, afaik.

While I view the growth of knowledge and intelligence as an open-ended
process, essentially because I am an optimist, I do expect that all
reasoning applicable to basic goals will have been identified and
produced within a fairly small amount of time, with any remaining
revision taking place within the sixth decimal place. I expect the same
to hold of the True Basic Ultimate Laws of Physics as well. The problem
is finite; the applications may be infinite, and the variations may be
infinite, but the basic rules of reasoning, and any specific structure,
are finite.

> Those are two different kinds of "freedom". In the former case
> you can go everywhere, but you carry your own partial prison
> around in your head (like the guy from _Clockwork Orange_),
> while in the latter case you may not be able to go anywhere
> you want, but you *are* master of your own domain. I think
> I prefer the latter, not only because it is more "dignified"
> (silly human concept), but because it's wise to put as much
> distance and defences between you and a potential
> enemy as possible. When that enemy is sharing your
> body, you have a real problem.

Don't think of it as an enemy; think of it as an Operating System.

> > An intelligent,
> > self-improving AI should teach itself to be above programmer error and
> > even hardware malfunction.
>
> Yes, *if* you can get the basics right.

I'll answer for that.

> Natural evolution may have made some pretty bad mistakes, but
> that doesn't necessarily mean that *all* of our programming will become
> obsolete. If the SIs want to do something, they will have to stay
> alive to do it (unless of course they decide to kill themselves, but
> let's assume for the sake of argument that this won't be the case).
> Basic logic. So some sort of self-preservation "instinct" will be
> required(*) to keep the forces of entropy at bay. Survival requires
> control --the more the better-- over one's surroundings. Other
> intelligent entities represent by definition an area of diminished
> control, and must be studied and then placed in a threat/benefit
> hierarchy which will help to determine future actions. And voila,
> your basic social hierarchy is born. The "big happy egoless
> cosmic family model" only works when the other sentients
> are either evolutionary dead-ends which are "guaranteed" to
> remain insignificant, or completely and permanently like-minded.

Nonsense. If the other sentients exist within a trustworthy Operating
System - I do think that a small Power should be able to design a
super-Java emulation that even a big Power shouldn't be able to break
out of; the problem is finite - then the other sentients pose no threat.
Even if they do pose a threat, your argument is analogous to
saying that a rational operating system, which views its goal as
providing the best possible environment for its subprocesses, will kill
off all processes because they are untrustworthy. As a logical chain,
this is simply stupid.

> No, no, no! It's exactly the other way around; goals are
> observer dependent by default. As far as we know this is
> the only way they *can* be.

I should correct my terminology: observer-*biased* goals are simply
evolutionary artifacts. Even if only
observer-dependent goals are possible, this doesn't rule out the
possibility of creating a Sysop with observer-unbiased goals.

> > > Fat comfort to *them* that the
> > > Almighty have decided so in their infinite wisdom. I wouldn't
> > > be much surprised if one of the "eternal truths" turns out to
> > > be "might makes right".
> >
> > *I* would. "Might makes right" is an evolutionary premise which I
> > understand in its evolutionary context. To find it in a Mind would be
> > as surprising as discovering an inevitable taste for chocolate.
>
> Evolution represents, among other things, some basic rules
> for survival. No matter how smart the SIs will become, they'll
> still have to play by the rules of this reality to live & prosper.

Your statement is simply incorrect. The possibility of a super-Java
encapsulation, which I tend to view as the default possibility - human
Java can be broken because humans make mistakes, humans make mistakes
because they're running with a high-level four-item short-term memory
and no codic cortex, and a superintelligence which knows all the laws of
physics and has a codic cortex should be able to design security a Power
couldn't crack; the problem is finite - directly contradicts the
necessity of all the survival activities you postulate.

> You can't deny self-evident truths like "might makes right"
> without paying the price (decreased efficiency, possibly
> serious damage or even annihilation) at some point. And
> yes, I also believe that suicide is fundamentally stupid,
> *especially* for a Power which could always alter its mind
> and bliss out forever if there's nothing better to do.

Only if the Power is set up to view this as desirable, and why would it
be? My current goal-system design plans don't even call for "pleasure"
as a separate module, just selection of actions on the basis of their
outcomes. And despite your anthropomorphism, this does not consist of
pleasure. Pleasure is a complex functional adaptation which responds to
success by reinforcing the skills used, raising the level of mental
energy, and producing many other subtle and automatic effects that I
see no reason to preserve in an entity capable of consciously deciding
how to modify itself. In particular, your logic implies that the *real*
supergoal is get-success-feedback, and that the conditions for success
feedback are modifiable; this is not an inevitable consequence of
system architecture. It would require a deliberate effort by the system
programmer to represent success-feedback as a declarative goal on the
same level as the other initial supergoals, which would be
spectacularly stupid.
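
(For concreteness, a minimal toy sketch of what I mean by "selection of
actions on the basis of their outcomes" - hypothetical names and toy
numbers, not an excerpt from any actual design. Note that nothing in it
resembles a pleasure or success-feedback variable that could be
promoted into a goal.)

    # Toy outcome-based action selection: rank actions by how well their
    # *predicted outcomes* score against the declared supergoals.
    def choose_action(actions, predict_outcome, score_against_supergoals):
        # predict_outcome: action -> predicted world-state
        # score_against_supergoals: predicted world-state -> number
        return max(actions,
                   key=lambda a: score_against_supergoals(predict_outcome(a)))

    # Toy usage: two actions with made-up predicted scores.
    predicted = {"help_user": 1.0, "do_nothing": 0.0}
    print(choose_action(["help_user", "do_nothing"],
                        predicted.get, lambda outcome: outcome))  # help_user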

> The only
> logical excuse for killing yourself is when one knows for pretty
> damn sure, beyond all reasonable doubt, that the alternative
> is permanent, or "indefinite", hideous suffering.

Nonsense; these are simply den Otter's preconditions. I thought you had
admitted - firmly and positively asserted, in fact - that this sort of
thing was arbitrary?

> > > I don't know, it's rather difficult to imagine an untangled "I".
> >
> > So do I, but I can take a shot at it!
>
> But you don't, more or less by definition, really know what
> you're doing!

Of course I do. I can see what a tangled "I" looks like, in detail,
which enables me to take a shot at imagining what an untangled "I" would
look like. And in particular, I can trace particular visualizations on
your part to features of that tangled "I" that would not be present in
an untangled system. I certainly understand enough to label specific
arguments as anthropomorphic and false, regardless of whether I really
have a constructive understanding of the design or behavior of an
untangled system.

> > When it reaches the point where any objective morality that exists would
> > probably have been discovered; i.e. when the lack of that discovery
> > counts as input to the Bayesian Probability Theorem. When continuing to
> > act on the possibility would be transparently stupid even to you and I,
> > we might expect it to be transparently stupid to the AI.
>
> In other words, objective morality will always be just an educated
> guess. Will there be a limit to evolution anyway? One would be
> inclined to say "yes, of course", but if this isn't the case, then
> the quest for objective morality will go on forever.

Well, you see "objective morality" as a romantic, floating label. I see
it as a finite and specifiable problem which, given true knowledge of
the ultimate laws of physics, can be immediately labeled as either
"existent" or "nonexistent" within the permissible system space.

> I'm sure you could make some pov-less freak in the lab, and
> keep it alive under "ideal", sterile conditions, but I doubt that
> it would be very effective in the real world. As I see it, we have
> two options: a) either the mind really has no "self" and no "bias"
> when it comes to motivation, in which case it will probably just
> sit there and do nothing, or b) it *does* have a "self", or creates
> one as a logical result of some pre-programmed goal(s), in
> which case it is likely to eventually become completely
> "selfish" due to a logical line of reasoning.

Again, nonsense. The Sysop would be viewable - would view itself -
simply as an intelligent process that acted to maintain maximum freedom
for the inhabitants, an operating system intended to provide equal
services for the human species, its user base. Your argument that
subgoals could interfere with these supergoals amounts to postulating
simple stupidity on the part of the Sysop. Worries about other
supergoals interfering are legitimate, and I acknowledge that, but your
alleged chain of survival logic is simply bankrupt.

> [snakes & rodents compared to AIs & humans]
> > It would be very much different. Both snakes and rodents evolved.
> > Humans may have evolved, but AIs haven't.
>
> But they will have to evolve in order to become SIs.

No, they won't. "Evolution" is an extremely specific term yielding
phenomena such as selection pressures, adaptations, competition for
mates, and so on. An AI would need to improve itself to become Sysop;
this is quite a different proposition than evolution.

> I'd take the former, of course, but that's because the odds in this
> particular example are extremely (and quite unrealistically so)
> bad. In reality, it's not you vs the rest of humanity, but you vs
> a relative small financial/technological elite, many (most) of
> whom don't even fully grasp the potential of the machines they're
> working on. Most people will simply never know what hit them.

Even so, your chances are still only one in a thousand, tops - 0.1%, as
I said before.

> Anyway, there are no certainties. AI is not a "sure shot", but
> just another blind gamble, so the whole analogy sort of
> misses the point.

Not at all; my point is that AI is a gamble with a {10%..70%} chance of
getting 10^47 particles to compute with, while uploading is a gamble
with a {0.0000001%..0.1%} chance of getting 10^56. If you count in the
rest of the galaxy, 10^58 particles vs. 10^67.
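
(Just to make the magnitudes concrete - a toy restatement of the
figures above, nothing more:)

    # The figures from the paragraph above, taken at face value.
    ai_p_low, ai_p_high = 0.10, 0.70          # chance the AI gamble wins
    up_p_low, up_p_high = 1e-9, 1e-3          # chance the uploading gamble wins
    ai_particles, up_particles = 1e47, 1e56   # particles to compute with
    # Even comparing the AI gamble's *worst* case against uploading's
    # *best* case, the AI route is at least 100 times as likely to pay
    # off at all.
    print(round(ai_p_low / up_p_high))        # 100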

> Power corrupts, and absolute power...this may not apply just
> to humans. Better have an Assembly of Independent Powers.
> Perhaps the first thing they'd do is try to assassinate each other,
> that would be pretty funny.

"Funny" is an interesting term for it. You're anthropomorphizing again.
 What can you, as a cognitive designer, do with a design for a group of
minds that you cannot do with a design for a single mind? I think the
very concept that this constitutes any sort of significant innovation,
that it contributes materially to complexity in any way whatsoever, is
evolved-mind anthropomorphism in fee simple. As I recall, you thought
approximately the same thing, back when you, I, and Nick Bostrom were
tearing apart Anders Sandberg's idea that an optimized design for a
Power could involve humanlike subprocesses.

> > > To ask of me to just forget about continuity is like asking
> > > you to just forget about the Singularity.
> >
> > I *would* just forget about the Singularity, if it was necessary.
>
> Necessary for what?

Serving the ultimate good.

> In what context (form your pov, not mine) is the Singularity
> rational, and why isn't this motivation just another irrational
> prejudice that just happens to have strongly linked itself to
> your pleasure centers? Aren't you really just rationalizing
> an essentially "irrational" choice (supergoal) like the rest of
> humanity?

No. I'm allowing the doubting, this-doesn't-make-sense part of my mind
total freedom over every part of myself and my motivations; selfishness,
altruism, and all. I'm not altruistic because my parents told me to be,
because I'm under the sway of some meme, or because I'm the puppet of my
romantic emotions; I'm altruistic because of a sort of absolute
self-cynicism under which selfishness makes even less sense than
altruism. Or at least that's how I'd explain things to a cynic.

> > If it's an irrational prejudice, then let it go.
>
> Then you'd have to stop thinking altogether, I'm afraid.

Anxiety! Circular logic! If you just let *go*, you'll find that your
mind continues to function, except that you don't have to rationalize
falsehoods for fear of what will happen if you let yourself see the
truth. Your mind will go on as before, just a little cleaner.

--
> Sure, but what's really interesting is that his conscience
> (aka "soul") is a *curse*, specifically designed to make him
> suffer (*).
I wasn't too impressed with the ethical reasoning of the Calderash
either, nor Angel's for that matter.  But - and totally new (to me)
insights like this one are one of the primary reasons I watch _Buffy_ -
real people, in real life, can totally ignore the structural ethics when
strong emotions are in play.  Giles and Xander blaming the returned
Angel for Angelus's deeds; Buffy torturing herself for sending someone
to Hell who would have gone there anyway - along with everyone else - if
she hadn't done a thing; and so on.
Still:  (1)  You'll notice that Angel hasn't committed suicide or
ditched his soul, both actions which he knows perfectly well how to
execute.  (2)  Why should I care what the Calderash gypsies think?  Does
Joss Whedon (Buffy-creator) know more about this than I do?  
-- 
       sentience@pobox.com      Eliezer S. Yudkowsky
          http://pobox.com/~sentience/beyond.html
                 Member, Extropy Institute
           Senior Associate, Foresight Institute

