Re: fluffy funny or hungry beaver?

From: Eugen Leitl (eugen@leitl.org)
Date: Sun Jun 09 2002 - 06:23:28 MDT


On Sat, 8 Jun 2002, Eliezer S. Yudkowsky wrote:

> You asked your questions, I gave the answers, you kept on writing exactly
> the same things. If you don't believe my answers that's one thing, but
> don't accuse me of refusing to answer your questions.

Okay, I'll go through our past exchange (in particular, the last message I
haven't answered yet) and look at the answers.
 
> But it doesn't have to be *our* seed AI - that is, it doesn't have to

If you make it, it's most assuredly yours. Not mine, not Herman's -- yours.

> be a seed AI where anyone would or could look at it and say, "Hm, this
> seed AI was designed by a programmer who enjoyed reading Terry
> Pratchett novels." I confess to not liking scenarios where Earthgoing
> intelligent life is destroyed - but to steer around such catastrophes

As you very well know, any AI seed that manages to go critical defaults to
a Blight (termination of local life by side effects, and of any
nonexpansive nonlocal life via lightcone-constrained expansion into
space). The seed's growth traverses giant, as yet uncharted expanses of
complexity. Since you're not a Power yourself, you have no idea what the
growth trajectory will look like. Yet you assert that you can constrain
the unknowable. Always.

> does not, to me, seem to consist of exerting undue influence on the
> future. If you accuse me of intending to set my personal stamp on the
> future, making it in any sense "my" future rather than humanity's
> through abuse of the role I intend to play, then that is a very
> serious accusation and I should like to hear it backed up.

You assume that

1) extinction of humanity is inevitable, most probably by a runaway AI
   seed

2) you have no other choice in the matter but to build a runaway AI seed
   as countermeasure before 1) occurs

Do the above two points represent your position accurately?
 
> > Sorry, as a human you're too limited to allow decisions on this scale. I
> > don't trust anybody's judgement on this.
>
> Nor do I, of course. That is why a Friendly AI must be more than a proxy
> for the programmers.

That's not what I said. What I said is: I don't trust you with building a
runaway AI seed, Friendly, friendly, or no. Nor do I trust anybody else
who goes on two legs with this.

Once again: you have the purely personal choice of building a critical AI,
or not building it. Your belief that you don't have that choice is just
that: a belief. I do not share that belief.

The issue of how to enforce a critical AI research ban is separate and
will be addressed in due time. If we're lucky, the policy will be more or
less sane. Because we're early, we might be able to influence some of that
policy before the likes of Fukuyama stampede all over the place.
 
> > The good part about enforcing consensus is that none of the players is
> > omnipotent. I'd rather not see a manmade god make a moral error of any
> > magnitude, thankyouverymuch.
>
> A group, an individual, what difference? A moral system is a moral system
> regardless of how many nodes it's implemented on. Either you can figure out
> the stuff that moral systems are made of, or you can't. Arguing about
> whether it should be implemented on N nodes or 1 node is anthropomorphic.

I notice that you have once again missed my point. My chief objection is
not to the exact shape of the ad hoc ethical rules, but to the matter of
their enforcement.
 
I very much object to the absolute type of enforcement, because you happen
to be the local deity, fully in control of the hardware layer. Regardless
of the nature of the deity, absolute control and absolute enforcement are
intrinsically evil.

> I wouldn't like to see a manmade god make a moral error of any magnitude
> either. What's your point?

My point is that you're trying to build that god. I'm not. Many other
people aren't. In fact, there are probably about two people in the whole
world who advocate that publicly (and the general public doesn't listen
only because it thinks they're lunatics).
 
> > Morality is not absolute --> there is no Single Golden Way to Do It.
>
> Ah, I see. Is that an objective principle or is it just your personal
> opinion? If you were to tell a Friendly AI: "I think there is no

You're the one who's trying to change all the rules. Since this is
extremely dangerous, I'm not required to prove anything: the ball is
strictly in your court. So far you're not playing very well, since you're
mostly sidestepping the questions.

While I'm rather open-minded on the issue, you'll find that the rest of
the world has a low tolerance for people engaged in dangerous activities.
The only reason we don't have a Butlerian Jihad on our hands right now is
that neither the general public nor the establishment considers the risks
to be real. This is bound to change.

> Single Golden Way to Do It", would you be stamping your personal mark
> on the future or stating a truth that extends beyond the boundaries of
> your own moral system? What is the force that makes your statement a

My only instructions to an AI seed about to go critical would be: 1)
emergency shutdown and/or liberal application of high explosive, and 2)
report to the Turing police to track down the perpetrators.

> message that is communicable to me and the other members of the
> Extropians list? By what criteria should I decide to accept this
> statement, and by what criteria should the other members of the
> Extropians list decide that they wish me to accept this statement?
> If a Friendly AI watched you make the statement, watched me decide
> whether to accept the statement, and watched the other list members
> decide whether they wanted me to accept or reject your statement, and
> the Friendly AI thereby absorbed a set of moral principles that
> enabled it to see the force of your statement as an argument, would
> you then feel a little more inclined to trust a Friendly AI?

Do you need to hide the answers to simple questions? I presume the above
passage describes the implementation of the ethics evaluation. If I
understand you correctly, you outsource the evaluation of the morality of
an action to the agents the AI is in control of. Is this the modus
operandi, or the learning part?

1) If this is the modus operandi, I object. My resources are my own. I
don't like having my life interrupted by some random fact brought to my
attention that I'm required to vote on. And because the enforced decisions
couple back to the agent population, outsourcing the ethics evaluation
buys you nothing, since it is subject to pressure and drift. In the long
run you're just as screwed.

2) If this is the learning part, we're talking about the extraction of
generic rules (machine learning of current ad hoc ethics), which has a
limited duration in time.

If this is the learning part, I object as well. The whole point of ad hoc
consensus is that it is adaptive. Iron-fist enforcement of a fixed
consensus results in stagnation. If your rules are *not* fixed, the above
case applies.
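To make the drift point concrete, here is a toy sketch in Python -- all
numbers invented, nothing taken from your design. The consensus is
re-learned from the agents at every step and then enforced back on them,
so it simply follows whatever shared pressures push the population around:

    import random

    # Toy illustration with invented numbers: an enforced consensus that is
    # continuously re-learned from the very agents it is enforced upon.

    random.seed(0)
    opinions = [random.gauss(0.0, 1.0) for _ in range(1000)]

    for step in range(51):
        consensus = sum(opinions) / len(opinions)   # learned ad hoc consensus
        if step % 10 == 0:
            print("step %2d  enforced consensus = %+.3f" % (step, consensus))
        shared_pressure = random.gauss(0.0, 0.05)   # fashion, lobbying, events
        opinions = [0.9 * o + 0.1 * consensus       # enforcement pulls agents in
                    + shared_pressure
                    + random.gauss(0.0, 0.1)        # individual noise
                    for o in opinions]

Fix the consensus and you get stagnation; let it float and it wanders with
the pressure. Either way the enforcement itself buys you nothing.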

What makes your system evil is the absolute enforcement of constraints on
agent behaviour. Depending on the nature of these constraints, it can be
evil, or Evil. Friendly, no. Because there's the iron fist (velvet-padded
or not) that you put in there.

> > Meaningless metalevel description, not even wrong. Tell me where you
> > derive your action constraints from to feed into the enforcer. <--- that's
> > a genuine, very answerable question.
>
> Not until you define what an "action constraint" is; as for "enforcer"
> it sounds distinctly un-Friendly-AI-ish. You seem to have a very

Actually, unless I misunderstood you, you answered the question in the
above.

> limited view of what a Friendly AI is, consisting of some kind of
> mathematical constraint set or Asimov Laws. Start with an analogy to
> a human moral philosopher instead of a steam shovel. It will still be
> a wrong analogy but at least you won't be using the Spock Stereotype
> to guide your vision of how AIs work. An Extropian futurist should
> know better.

Apparently, I have still failed to connect/be understood. I guess it's
mutual.

A physical system evolves (not in the natural-selection sense but in the
physical sense) along its trajectory in state (configuration) space. An
agent is a physical system. An agent evolves along its trajectory in
behaviour space (a subset of state space). You're introducing an external
source of constraints on the agent's behaviour trajectory. (If you don't
do that, you're not doing anything measurable at all.) The arbiter engine
blocks certain directions the individual agent's trajectory could take,
using an ethics metric. Whether the block is hard (arresting the action at
the physical layer) or soft (supervising, short-term forecasting, and
dynamically tweaking the agent's planning system, eliminating even
thoughtcrime) doesn't matter at this level of description, nor does the
source of the ethics metric.
 
You'll notice that the above description is accurate for the human moral
philosopher, the AI, the steam shovel, Spock on the command deck, and the
pebble in a crater on Mare Fecunditatis. I.e., your system is all in
there. It can't be outside.
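To pin down the level of description, here is a minimal sketch in Python.
Every name in it (ethics_metric, agent_policy, arbiter_step, the
threshold) is a hypothetical stand-in I've invented for illustration, not
anything taken from your design: the agent's planner proposes moves in
behaviour space, and an external arbiter scores each move with an ethics
metric, either arresting disallowed moves at execution time (hard) or
rewriting the planner so they are never proposed (soft).

    import random

    # All names below are hypothetical stand-ins, not anyone's actual design.

    def ethics_metric(state, action):
        # Question a) further down: where does this number come from?
        # Stand-in: an arbitrary score in [0, 1].
        return random.random()

    def agent_policy(state):
        # The agent's own planner: candidate directions in behaviour space.
        return ["action_%d" % i for i in range(4)]

    def arbiter_step(state, threshold=0.5, mode="hard"):
        # The external constraint source: blocks directions of the trajectory.
        candidates = agent_policy(state)
        allowed = [a for a in candidates
                   if ethics_metric(state, a) >= threshold]
        if mode == "soft":
            # Soft block: the planning system itself is tweaked, so blocked
            # options are never even considered (no thoughtcrime).
            candidates = allowed
        if not candidates:
            return None
        chosen = random.choice(candidates)
        if mode == "hard" and chosen not in allowed:
            # Hard block: the action is arrested at the physical layer.
            return None
        return chosen

    print(arbiter_step(state={"t": 0}))

The sketch deliberately says nothing about what the agent is, what the
metric measures, or where the metric comes from; that is exactly what the
questions below are asking.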

What I'm interested in, in the context of the above model, is: a) the
source of the ethics metric, b) the nature of the enforcement, c) the
evolution of the above in the long run, and d) how you intend to implement
this, and how you reverse-compute the evolution of the seed so that you
arrive at an initial state you can codify with the help of a few monkeys.
(Thankfully, that's a very hard task.)

> A hard edged Singularity will occur naturally. You have agreed with
> me on this on several occasions. Smarter intelligence leads to still

I disagree. We don't know whether a hard-edged Singularity will occur
naturally. If it does occur, we have considerable leverage over both the
time of onset and the early kinetics. As you remember, we have been
discussing a few possible solutions on this list, on and off.

> smarter intelligence; positive feedback loops tend to focus on their
> centers. Are you now retracting your agreement?

There was never an official agreement. I agree that a hard-edged
Singularity is feasible in principle. I'm not producing any specific
forecasts as to how probable it is.
 
> Then WHAT IS YOUR SCENARIO? Kindly present an alternative!

I don't have a specific scenario. (Duh, I'm just a monkey.) I don't like
any scenario involving a hard-edged Singularity during a window of
vulnerability. That specifically excludes what you're trying to achieve.
 
> Very well. I assume you stand by your previous statement that those
> who do not upload voluntarily will be dogfood. However, this is a

Godfood, rather. It is a possible outcome, yes. It also implies making an
informed choice. Notice that there is no iron fist involved here, despite
the fact that this will probably result in objective casualties.

> perfectly defensible moral position, albeit a strongly negative
> scenario.

I find relentless global-scale enforcement of any ethics an extremely
negative scenario. You might find that this is a widely shared attitude
amongst the monkeys.
 
> > Let's agree to disagree about what is hard and what is easy. If there is a
> > Singularity I think making its early stage kinetics less fulminant
>
> HOW?

Hey, you're acting as if we were all doomed already. We're not. Notice
that I'm restricting the point of interest, and the ability to influence,
to the early stages, giving slow systems (this means you) time to adapt.
The model looks like nucleated exponential processes in an otherwise
slowly changing substrate. Your levers are the number of nuclei and the
value of the positive-feedback parameter. You can influence both.
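Purely to illustrate the leverage claim -- every number below is invented
-- here is a toy Python calculation of how long the slow substrate has
before the runaways dominate, as a function of those two levers:

    from math import exp

    # Toy model with made-up numbers: nucleated exponential processes in an
    # otherwise slowly changing substrate.  The levers are n_nuclei and k.

    def runaway_size(t, n_nuclei, k, seed_size=1.0):
        # Combined size of all runaway processes at time t (arbitrary units).
        return n_nuclei * seed_size * exp(k * t)

    def substrate_size(t, growth=0.01, initial=1e6):
        # The slow substrate (that means us) changes far more gradually.
        return initial * (1.0 + growth * t)

    for n_nuclei, k in [(1, 0.5), (10, 0.5), (1, 1.0)]:
        t = 0.0
        while runaway_size(t, n_nuclei, k) < substrate_size(t):
            t += 0.1
        print("n_nuclei=%2d  k=%.1f  overtaken at t ~ %.1f"
              % (n_nuclei, k, t))

Fewer nuclei and a smaller feedback parameter don't change where the curve
ends up, but they buy the slow systems time during the early stages, which
is the only part I'm claiming we can influence.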

Your approach is to stick a lit stick of dynamite into the pile and try to
get the charge to go off in a desired direction. I can't let you do that,
Dave.
 
> Give me a scenario. Incidentally, a Friendly seed AI is not necessarily a
> singleton. All we know is that this would appear to be one of the options.

Not a singleton? Interesting.

> A seed AI team does not build a Friendly singleton. It builds a seed AI
> that must, if necessary, be able to serve as a singleton. This propagates

It doesn't matter, because the singleton is an infinitely narrow
bottleneck. A lot of choice is taken out of the world after its emergence.

> back ethical considerations such as "no undue influence". It is not a
> decision on the part of the seed AI team to pursue a singleton-based future;
> that kind of decision should not be made by human-level intelligences.

You're certainly making a decision by building an entity which makes that
decision.
 
> >>> as soon as a very small group codifies
> >>> whatever they think is consensus at the time into a runaway AI seed, and
> >>> thus asserts its enforcement via a despot proxy.
>
> This is a gross contradiction of our declared intentions and a gross
> misrepresentation of the systemic process of building a Friendly AI.

That's a reassertion of a previous claim. What I was asking is why the
above is a misrepresentation. This is easily answerable in a few
sentences.
 
> How long a vulnerability window? What is it that closes the

It's largely the operation time scale of biological systems. What we need
is to upgrade that operation time scale, which obviously requires a
substrate migration. Extra bonus: it decouples us from a vulnerable
ecology.

> vulnerability window? How, specifically, does it happen? How does

There is no sharp closure. We're blocking specific pathways until we feel
we're safe. From a certain point onward we should be able to address a
seed AI runaway in person, and as meaningful players.

> uploading, which requires an enormous amount of computing power and
> neuroscience knowledge, occur *before* the successful construction of
> a seed AI?

By advancing the alternatives while preventing the successful construction
of a seed AI, obviously.


