From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Sun Jun 09 2002 - 15:52:36 MDT
Eugen Leitl wrote:
> On Sat, 8 Jun 2002, Eliezer S. Yudkowsky wrote:
>
>>But it doesn't have to be *our* seed AI - that is, it doesn't have to
>
> If you make it, it's most assuredly yours. Not mine, not Herman's -- yours.
Don't you see that's part of the problem to be solved? Being the particular
human who happened to be a part of the causal antecedents of a seed AI's
existence is not the same as setting a personal stamp on it. The end result
could be exactly the same as if the seed AI had been created by any other
sufficiently competent project group.
> As you very well know, any AI seed that manages to go critical is
> defaulted for Blight
Defaulted? Despite your strange notions about where seed AIs come from,
this is not mere rocket science. At current levels of computing power, it
is by no means the "default" assumption that someone smart enough to build a
seed AI manages to screw it up.
> (termination of local life by side effects, and any
> nonexpansive nonlocal life via lightcone-constrained expansion into
> space). The seed growth goes over giant, yet uncharted expanses of
> complexity. Since you're not a Power yourself, you have no idea what the
> growth trajectory will look like. Yet you assert that you can constrain
> the unknowable. Always.
Strawman, strawman, strawman. I do not assert I can constrain the
unknowable. This entire argument of "Friendly AI requires constraining
growth trajectories" is, as far as I can tell, something you made up out of
whole cloth so that you'd have a "proof" it was impossible. It is, as far as
I can tell, no different from Richard Smalley's assertion that
nanotechnology requires a thousand manipulators to govern any chemical
reaction. You seem to have misunderstood completely what Friendly AI is and
what problem it is a solution to, leaving only the pathetic
science-fictional folly of Asimov Laws.
> You assume that
>
> 1) extinction of humanity is inevitable, most probably by a runaway AI
> seed
I do not. I assume that one of two events will occur: a successful
Singularity or the extinction of humanity. To designate either or both of
these events as "inevitable" is a strange choice of words, given that it
could go either way.
> 2) you have no other choice in the matter but to build a runaway AI seed
> as countermeasure before 1) occurs
To avoid wiping ourselves out, we need to launch a Singularity; conversely,
to avoid a Singularity would require the extinction of all intelligent life
in the Solar System. A Friendly seed AI counts as a successful Singularity;
a hostile self-improving AI, if someone manages to build one, counts as an
extermination. There are theoretically other means for avoiding
extermination besides building a Friendly AI. Anything that calls in the
transhuman cavalry counts as a win in my book. But Friendly AI seems to be
the closest at hand. Uploading, your favored alternate scenario, is so
improbable as an alternative to AI that I would consider it a strawman
dichotomy if you were not specifically advocating it. You want to have the
computing power available to upload millions of people, and yet nobody has
created an AI using computing power just lying around in the basement? Are
all computers in your world under the control of a world government, with
networked computers strictly outlawed? Even that wouldn't work, of course,
but I'm curious.
>>>Sorry, as a human you're too limited to allow decisions on this scale. I
>>>don't trust anybody's judgement on this.
>>
>>Nor do I, of course. That is why a Friendly AI must be more than a proxy
>>for the programmers.
>
> That's not what I said. What I said is: I don't trust you with building a
> runaway AI seed, Friendly, friendly, or no. Nor do I trust anybody who
> goes on two legs with this.
Okay, turn the problem over to a Friendly AI.
> Once again: you have the purely personal choice of building a critical AI,
> or not to build it. Your belief that you don't have that choice is just
> that: a belief. I do not share that belief.
>
> The issue on how to enforce a critical AI research ban is separate and
It is not "separate". It is the crux of the entire issue. Your entire
program - insofar as you have one; as far as I can tell, you don't - rests
on the assumption that a majority vote is sufficient to blank all AI
scenarios out of existence. I've got an even better idea: why not vote to
ban all existential risks? Then the problem will be solved.
As I see it, the basic Singularity issue is getting the transhuman cavalry
to show up. The Friendly seed AI cavalry is what's closest to hand. You
are, right now, in the position of trying to *slow things down*. The
assumption that you can just wish risks out of existence is the classic
argument of the slow-things-down group. In reality, trying to slow things
down just means that you slow down nonprofit and university research before
commercial research, commercial research before military research, research
in liberal democracies before research in rogue states, and so on. You want
to relinquish AI research; I want to get it right. Your argument that my
trying to get it right is "pouring fuel on the fire", as it were, is
inherently bound up with your idea that AI research can be relinquished.
The entire slow-things-down argument rests on the wishful thinking that
efforts exerted in an attempt to "slow things down" will be effective and
will have no catastrophic side effects.
> will be addressed in due time. If we're lucky, the policy will be more or
> less sane. Because we're early, we might be able to influence some of that
> policy, before the likes of Fukuyama stampede all over the place.
So what does distinguish you from the likes of Fukuyama as far as strategic
technological thinking goes? He doesn't like something; he wants to wish it
out of existence. How are you any different except in the specifics of what
you dislike? What is your exit strategy from the existential risks game and
how *specifically* does it come about? Sketch us a scenario.
>>A group, an individual, what difference? A moral system is a moral system
>>regardless of how many nodes it's implemented on. Either you can figure out
>>the stuff that moral systems are made of, or you can't. Arguing about
>>whether it should be implemented on N nodes or 1 node is anthropomorphic.
>
> I notice that you once again failed to notice my point. My chief objection
> is not to the exact shape of the ad hoc ethical rules, but to the matter
> of their enforcement.
>
> I much object to the absolute type of enforcement, because you happen to be
> the local deity around, and fully in control of the hardware layer.
> Regardless of the nature of the deity, absolute control and absolute
> enforcement is intrinsically evil.
Are the laws of physics intrinsically evil? Frankly I'm not sure I should
even be arguing the point, since as usual, the particular speculation of the
Sysop Scenario seems to have been confused once again with the basic
principles of Friendly AI. In any case, it seems to me that if you wish to
make an argument for this intrinsic evil that can jump the gap from one
human mind to another, whether it's from your mind to my mind, or your mind
to your audience on Extropians, then my responsibility is to build a
Friendly AI that can judge the force of this argument as I and the audience
must.
>>Ah, I see. Is that an objective principle or is it just your personal
>>opinion? If you were to tell a Friendly AI: "I think there is no
>
> You're the one who's trying to change all rules. Since this is extremely
> dangerous I'm not required to prove anything: the ball is strictly in your
> court. So far you're not playing very well, since you're mostly sidestepping the
> questions.
I decline to play burden-of-proof tennis. I could just as easily say that
you are proposing a new regulation where none currently exists and that the
burden of proof therefore rests on you. In any case you seem to have
misunderstood my point. I am not asking you to prove that your opinion is
objectively correct, but to explain what underlies your opinion and why. If
you can do this, you may be able to start seeing the issues underlying
Friendly AI. I'll ask again: When you say that there is no Single Golden
Way to Do It, what underlies this opinion, how could it be communicated to
other humans, and would it be - in your eyes - a breach of professional
responsibility to convey this opinion to an AI?
>>Single Golden Way to Do It", would you be stamping your personal mark
>>on the future or stating a truth that extends beyond the boundaries of
>>your own moral system? What is the force that makes your statement a
>
> My only instructions to an AI seed about to go critical would be: 1)
> emergency shutdown and/or liberal application of high explosive, and 2) report
> to the Turing police to track down the perpetrators.
How would you justify these instructions to an AI, or for that matter,
another human?
>>message that is communicable to me and the other members of the
>>Extropians list? By what criteria should I decide to accept this
>>statement, and by what criteria should the other members of the
>>Extropians list decide that they wish me to accept this statement?
>>If a Friendly AI watched you make the statement, watched me decide
>>whether to accept the statement, and watched the other list members
>>decide whether they wanted me to accept or reject your statement, and
>>the Friendly AI thereby absorbed a set of moral principles that
>>enabled it to see the force of your statement as an argument, would
>>you then feel a little more inclined to trust a Friendly AI?
>
> Do you need to hide the answers to simple questions? I presume the above passage
> describes the implementation of the ethics evaluation. If I understand you
> correctly, you outsource the evaluation of morality of the action to the
> agents the AI is in control of. Is this the modus operandi, or the
> learning part?
Certainly a seed AI is not in control of any agents during the process of
growing up. Once again, the Sysop Scenario forces a mapping to "tribal
chief" and sucks in another human's brainware. I should never have raised
the damn subject in the first place, not if it's going to prevent people
from seeing or understanding the basic principles of Friendly AI.
You do not outsource the evaluation of morality to your programmers; you
learn the evaluation of morality from your programmers, including the
evaluation of which kinds of morality should or should not be absorbed from
programmers. The idea that the programmers occupy no privileged position
with respect to the rest of humanity is part of what is absorbed.
> 1) If this is the modus operandi, I object. My resources are my own. I
> don't like having my life interrupted by some random fact brought to my
> attention that I'm required to vote on. Because the enforced decisions couple
> to the agent population, outsourcing the ethics evaluation buys you
> nothing, since it is subject to pressure and drift. In the long run you're just
> as screwed.
This is a really bizarre mistaken interpretation of what I said.
> 2) If this is the learning part, we're talking about extraction of generic
> rules (machine learning of current ad hoc ethics), which has a limited
> duration in time.
>
> If this is the learning part, I object. The whole point about ad hoc
> consensus is that it is adaptive. Iron fist enforcement of fixed consensus
> results in stagnation. If your rules are *not* fixed, the above case applies.
Huh? What? You don't learn a fixed consensus. You don't absorb the
outputs of the programmers. You examine the outputs and deduce the process
that produced them. You absorb the process that generates the ever-changing,
ever-improving outputs of the programmers. Furthermore, what you absorb
must be constructed of elements that are not specific to the programmers.
That's what creates a Friendly AI that belongs to humanity, with the
creators occupying no privileged position with respect to the AI.
> What makes your system evil is absolute enforcement of constraints on
> agent behaviour. Depending on the nature of these constraints, it can be
> evil, or Evil. Friendly, no. Because there's the iron fist (velvet-padded,
> or not) you put in there.
Well, you've made your statement, you've gone on the record, and presumably
a Friendly AI will read your statement someday. At most one of the two of us
is thinking clearly about this. Whichever of us is thinking clearly, if
either of us is, will have the argument that is seen as forceful by the
Friendly AI. Personally I see no moral distinction between blocking a
negative action at the hardware level and blocking a negative action through
social threats, except that the latter generates more fear and pain as side
effects. But you apparently believe that there is some privileged evil that
derives from outlawing slavery effectively by blocking it at the hardware
level which is not inherent in modern-day society's imperfect attempt to
outlaw slavery at the social level. Okay. It could be true. But, if so,
why would you expect a Friendly AI to be unable to see it, when you yourself see
it and feel that I ought to be seeing it as well?
> A physical system evolves (not in the natural-selection sense but in the
> physical sense) along its trajectory in state (configuration) space. An
> agent is a physical system. An agent evolves along its trajectory in
> behaviour space (a subset of state space). You're introducing an external
> source of constraints on the agent's behaviour trajectory. (If you don't do
> that, you do not do anything measurable at all).
When you talk about an "agent", are you talking about you and me, or the
creation of a Friendly AI? A Friendly AI is made more of "will" than
"won't" - you're creating positive decisions, not a set of constraints.
> The arbiter engine blocks
> certain directions the individual agent's trajectory could take, using an
> ethics metric.
The Sysop Scenario assumes that:
1) It is possible to gain control of the hardware level, and
2) It is Friendly/moral/ethical to do so.
You're attacking (2). Okay. Could be a fact. But what does it have to do
with Friendliness?
Personally, as I said before, I see no distinction between giving someone an
API error when they point a gun at me, and threatening them with life in
prison, except that the former is more effective and does not involve the
threat of retribution. We, as humans, interact with each other now to
prevent each other from doing certain things. In fact that is what you are
trying to do right now. So yes, in that sense, a Friendly AI will almost
certainly object to you firing a gun at someone who didn't ask to be shot
at; what a Friendly AI does about it depends on the Friendly AI's capabilities.
> Whether the block is hard (arresting the action at the physical
> layer)
Which seems fine to me.
> or soft (supervise, short-term forecast and dynamically tweak the
> agent's planning system, eliminating even thoughtcrime)
Which is entirely evil. Why would you confuse the two?
> doesn't matter at
> this level of description, nor does the source of the ethics metric.
>
> You'll notice that the above description is accurate for the human moral
> philosopher, the AI, the steam shovel, Spock on the command deck, and the
> pebble in a crater on Mare Fecunditatis. I.e. your system is all in there.
> It can't be outside.
I agree. Nothing is outside, not even the laws of physics. So where does
the intrinsic evil come in again?
> What I'm interested in, in the context of the above model, is: a) the source of
> the ethics metric
A process, but not a given set of outputs, is absorbed from the programmers,
phrased in terms that are not specific to the programmers. In Friendly AI
terms the programmers are there only as proxies; they are sample instances
of humans. The end result is an independent, archetypally moral agent. If
Eugen Leitl is saner than Eliezer Yudkowsky there should be no privileged
reason why a Friendly AI would listen to Eliezer instead of Eugen, whether
that Friendly AI was built by Eliezer or someone else entirely.
> b) the nature of enforcement
All nonevil means.
> c) evolution of the above on the long run
I assume you mean "development", and of course the answer is that I don't
know. My responsibility is just to make sure that things develop as well
from a Friendly AI as from any human or group of humans.
> d) how you intend to implement this, and how do you
> reverse-compute the evolution of the seed so that you arrive at the
> initial state you can codify with the help of a few monkeys. (Thankfully,
> that's a very hard task).
It would be if I were writing a computer program. Creating a mind is quite
a different job. If you can precisely state what it is you're trying to do,
the mind you create can help you do it. If you make mistakes the mind can
catch them... if the mind can see the mistake as a mistake. That's what
Friendly AI is about. I have a will to create an AI for what seem to me
like moral reasons. That will to create an AI, and that decision's fragile
dependency on the correctness of the moral reasons, is part of what gets
transferred over.
>>A hard edged Singularity will occur naturally. You have agreed with
>>me on this on several occasions. Smarter intelligence leads to still
>
> I disagree. We don't know whether a hard edged Singularity will occur
> naturally.
Huh? Smart minds build smarter minds. Positive feedback loop. The
Singularity wants to be a hard takeoff. Do you now dispute this?
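(To make "positive feedback loop" concrete, here is one toy way to cash it
out; the exponent p is purely illustrative, not a claim about any real
growth curve. Suppose intelligence I(t) improves itself at a rate that
grows with I:

    \frac{dI}{dt} = k\,I^{p}
    \quad\Longrightarrow\quad
    I(t) =
    \begin{cases}
      I_0\,e^{kt} & p = 1 \\
      \bigl(I_0^{\,1-p} - (p-1)\,k\,t\bigr)^{-1/(p-1)} & p > 1
    \end{cases}

For p = 1 you get mere exponential growth; for any p > 1 the solution
diverges at the finite time t* = I_0^{1-p} / ((p-1)k). That finite-time
blowup is the sense in which the Singularity "wants" to be a hard takeoff.)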
> In case it does occur, we do have considerable leverage on both
> the time of the onset, and early kinetics. As you remember, we have been
> discussing a few possible solutions on this list, on and off.
What leverage? Explain. I discussed various possible solutions on this
list and off. You said, off-list in an IRC chat, that you wanted a small
group of uploads to deliberately hold off on intelligence enhancement, scan
a few million people, suspend them, distribute them over the solar system,
then remove all the brakes; and this had to happen before anyone built an
AI. Then, on the mailing list, you said that this was just a sample
scenario and that I shouldn't take it seriously. This leaves you with
nothing; not one offered solution.
>>Then WHAT IS YOUR SCENARIO? Kindly present an alternative!
>
> I don't have a specific scenario. (Duh, I'm just a monkey). I don't like
> any scenarios involving a hard edged Singularity during a window of
> vulnerability. That specifically excludes what you're trying to achieve.
You don't like those scenarios? Offer an alternative. I may be a
chimpanzee hacked to support general intelligence but I'm doing the best I
can, with that brain, to get to the point where nonchimpanzees can take over
the process. I can't choose between my scenario and yours if you decline to
offer an alternative... or rather, my choice will be the scenario that has
been fleshed out enough for its workability to be examined.
> I find relentless global-scale enforcement of any ethics an extremely
> negative scenario. You might find that is a widely shared attitude amongst
> the monkeys.
I might. This does not mean I will not also ask whether this attitude is
correct, but since the Sysop Scenario is logically separate from Friendly
AI, it might mean that I would stop talking about the Sysop Scenario...
which in fact I have already done; now there's just inertia from earlier
online writings.
>>>Let's agree to disagree about what is hard and what is easy. If there is a
>>>Singularity I think making its early stage kinetics less fulminant
>>
>>HOW?
>
> Hey, you're acting as if we're all doomed already.
No, I'm acting as if you're offering vague statements that look impossible
to me, then declining to offer any specific examples of how they might work.
"Early kinetics less fulminant"? An interesting bit of wishful thinking
but what does it have to do with reality?
> We're not. Notice that
> I'm restricting the point of interest and ability to influence to early
> stages,
How? Act of Congress? Waving magic wand? What is the cause that would
result in this effect?
> giving slow systems (this means you) time to adapt. The model
> looks like nucleated exponential processes in an otherwise slowly changing
> substrate. Your leverage is the number of nuclei, and the value of the
> positive feedback parameter. Both you can influence.
Well, let's see... you have a mathematical model, which in my opinion is
wrong, and a dependency on certain variables, which in my opinion is also
wrong, and you haven't mapped the variables to anything in the real world,
or explained how the variables' real-world counterparts would be influenced by any
available action. I'll ask again: Please sketch out a specific scenario
which you believe is an alternative to seed AI. Not a mathematical model
with nice-sounding properties; a concrete, real-world scenario that exhibits
those properties.
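For reference, here is a minimal, purely illustrative rendering of the model
as you stated it - nuclei appearing at some rate, each growing with a
positive feedback parameter - with every function name and number made up by
me for the sketch. Writing it down is easy; what remains unexplained is how
"nucleation rate" and "feedback parameter" map onto anything a human being
can actually influence:

    # A purely illustrative toy of "nucleated exponential processes in an
    # otherwise slowly changing substrate". Every quantity here is
    # hypothetical; this sketches the abstraction, not anything real.
    import math
    import random

    def first_crossing_time(nucleation_rate, feedback_k,
                            threshold=1e6, seed=0):
        """Time until the first nucleus grows from size 1 to `threshold`,
        assuming nuclei appear as a Poisson process with the given rate and
        each grows as exp(feedback_k * time_since_nucleation). Since all
        nuclei grow at the same rate, the first to appear crosses first."""
        rng = random.Random(seed)
        first_nucleation = rng.expovariate(nucleation_rate)
        growth_time = math.log(threshold) / feedback_k
        return first_nucleation + growth_time

    for rate in (0.1, 1.0):
        for k in (0.5, 2.0):
            t = first_crossing_time(rate, k)
            print(f"nucleation rate {rate}, feedback {k}: crossing at ~{t:.1f}")

In this toy the nucleation rate only shifts when the takeoff starts, and the
feedback parameter sets how steep it is once started. If that is roughly
your picture, the question stands: which real-world actions move which
parameter, and by how much?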
> Your approach is trying to stick a lit dynamite stick into the pile,
> trying to get the charge to go off into a desired direction. I can't let
> you do that, Dave.
Please explain what else I should do, and why it is a better alternative.
As far as I'm concerned, transhuman intelligence in any form is inherently
dynamite stuff. Just not *bad* dynamite.
>>Give me a scenario. Incidentally, a Friendly seed AI is not necessarily a
>>singleton. All we know is that this would appear to be one of the options.
>
> Not a singleton? Interesting.
A Friendly mind's a mind. It can be inferior, equal, or superior to the
majority of agents and still be Friendly.
>> A seed AI team does not build a Friendly singleton. It builds a seed AI
>>that must, if necessary, be able to serve as a singleton. This propagates
>
> It doesn't matter, because the singleton is an infinitely narrow
> bottleneck. A lot of choice is taken out of the world after its emergence.
This sounds to me like a statement which is simply untrue in practice. I
have many fewer choices now, as a result of the twin brick walls of society
and real-world physics, than I would have from inside a Friendly singleton.
If you give more than one entity root access, individuals can easily wind
up with fewer choices as a result; the social constraints needed for a
thriving society, if "protected memory" does not exist, could easily be so
harsh as to eliminate far more choices. But that's merely my personal
evaluation and quite beside the point. Why this obsession with singleton
scenarios?
>>back ethical considerations such as "no undue influence". It is not a
>>decision on the part of the seed AI team to pursue a singleton-based future;
>>that kind of decision should not be made by human-level intelligences.
>
> You're certainly making a decision by making an entity which makes that
> decision, that's for certain.
But it is not a decision in which I exert undue influence.
>>>>>as soon as a very small group codifies
>>>>>whatever they think is consensus at the time into a runaway AI seed, and
>>>>>thus asserts its enforcement via a despot proxy.
>>
>>This is a gross contradiction of our declared intentions and a gross
>>misrepresentation of the systemic process of building a Friendly AI.
>
> That's a reassertion of a previous claim. What I was asking for is why
> the above is a misrepresentation. This is easily answerable in a few
> sentences.
A Friendly AI is not justifiably called a "proxy" for the programmers if the
original programmers occupy no privileged position with respect to its moral
content, although it might justifiably be called a proxy for humanity.
Likewise, despotism or the lack thereof is not part of what any programmer
would need to talk to a Friendly AI about.
>>How long a vulnerability window? What is it that closes the
>
> It's largely the operation time scale of biological systems. What we need
> is to upgrade the operation time scale, which obviously requires a
> substrate migration. Extra bonus for decoupling us from a vulnerable
> ecology.
Please offer a concrete scenario for how a few million humans will migrate
to new substrate in advance of any seed AI being constructed.
>>vulnerability window? How, specifically, does it happen? How does
>
> There is no sharp closure. We're blocking specific pathways until we feel
> we're safe.
How, exactly?
>>uploading, which requires an enormous amount of computing power and
>>neuroscience knowledge, occur *before* the successful construction of
>>a seed AI?
>
> By advancing the alternatives while preventing a successful construction
> of a seed AI, obviously.
How? Concretely, that is.
--
Eliezer S. Yudkowsky                          http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence