FW: Understanding CFAI

From: Smigrodzki, Rafal (SmigrodzkiR@msx.upmc.edu)
Date: Wed Mar 20 2002 - 16:50:18 MST


> Subject: RE: Understanding CFAI part 2
>
>
> Me:
>
> > As a result the initial supergoals are overwritten by new content (at
> > least to some degree, dictated by the ability to deceive others). As
> > much as the imprint of my 4-year-old self in my present mind might
> > object, I am forced to accept the higher Kohlberg-stage rules. Do you
> > think that the Friendly AI will have some analogue of such (higher)
> > levels? Can you hypothesize about the supergoal content of such a
> > level? Could it be translated back for unenhanced humans, or would it
> > be accessible only to highly improved uploads?
>
> Eliezer:
>
> I'm not sure I believe in Kohlberg, but that aside: From the perspective
> of a human, an FAI would most closely resemble Kohlberg 6, and indeed
> could not be anything but Kohlberg 6, because an FAI cannot be
> influenced by threat of punishment, threat of disapproval, someone
> else's opinion, or society's opinion, except insofar as the FAI decides
> that these events represent valid signals about vis target goal content.
>
> Me:
>
> Good. I thought so too. But the question remains: will the FAI
> develop goals objectively following from the panhuman layer but not
> intellectually accessible to unenhanced humans? Would you foresee having
> to acquiesce to the FAI's moral guidance as an act of faith, in hope of
> eventually reaching its exalted plane (after gaining the equivalent of a
> couple hundred IQ points)?
>
> Eliezer:
>
> > > Humanity is diverse, and there's still some variance even in the
> > > panhuman layer, but it's still possible to conceive of a description
> > > for humanity and not just any one individual human, by superposing
> > > the sum of all the variances in the panhuman layer into one
> > > description of humanity. Suppose, for example, that any given human
> > > has a preference for X; this preference can be thought of as a cloud
> > > in configuration space. Certain events very strongly satisfy the
> > > metric for X; others satisfy it more weakly; other events satisfy it
> > > not at all. Thus, there's a cloud in configuration space, with a
> > > clearly defined center. If you take something in the panhuman layer
> > > (not the personal layer) and superimpose the clouds of all humanity,
> > > you should end up with a slightly larger cloud that still has a
> > > clearly defined center. Any point that is squarely in the center of
> > > the cloud is "grounded in the panhuman layer of humanity".
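>
> A minimal numerical sketch of this superposition picture (an
> illustration only, not anything from CFAI; one-dimensional Gaussian
> "clouds" stand in for preferences in configuration space):
>
>     import numpy as np
>
>     # Each person's preference for X: a cloud over a 1-D configuration
>     # space, modeled here as a Gaussian with a personal center and width.
>     xs = np.linspace(-10.0, 10.0, 2001)
>     rng = np.random.default_rng(0)
>     centers = rng.normal(0.0, 1.0, size=1000)    # panhuman-layer variance
>     widths = rng.uniform(0.5, 1.5, size=1000)
>
>     def cloud(center, width):
>         return np.exp(-0.5 * ((xs - center) / width) ** 2)
>
>     # Superimpose every individual cloud into one description of humanity.
>     humanity = sum(cloud(c, w) for c, w in zip(centers, widths))
>
>     # The superposed cloud is broader, but still has a clear center.
>     print("center of humanity's cloud:", xs[humanity.argmax()])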
>
> Me:
>
> > ### What if the shape of the superposition turns out to be more
> > complicated, with the center of mass falling outside the maximum
> > values of the superposition? In that case, implementing a Friendliness
> > focused on this center would have outcomes distasteful to all humans,
> > and finding alternative criteria for Friendliness would be highly
> > nontrivial.
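>
> The worry, in the same toy terms (again an illustration with invented
> numbers): if the individual clouds split into two well-separated camps,
> the center of mass of the superposition can land in a region that
> hardly anyone's preference actually covers.
>
>     import numpy as np
>
>     xs = np.linspace(-10.0, 10.0, 2001)
>
>     def cloud(center, width=1.0):
>         return np.exp(-0.5 * ((xs - center) / width) ** 2)
>
>     # Two well-separated camps instead of one roughly unimodal humanity.
>     humanity = sum(cloud(c) for c in (-5.2, -5.0, -4.8, 4.8, 5.0, 5.2))
>
>     center_of_mass = (xs * humanity).sum() / humanity.sum()
>     density_there = humanity[np.abs(xs - center_of_mass).argmin()]
>     print("center of mass:", round(center_of_mass, 2))              # ~0.0
>     print("density there / peak:", density_there / humanity.max())  # ~0
>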
> Eliezer:
>
> Well, what would *you* do in a situation like that?
>
> Me:
>
> Without batting an eyelid, I would use my own goal system. :-)
>
> Eliezer:
>
> What would you want a Friendly AI to do?
>
> Me:
>
> Find a solution with the least negative impact on the largest number of
> participants: e.g., allow humans to split according to their
> preferences, whether immortal uploading or a God-fearing existence as
> desert nomads, and disregard only those who are at odds with all others
> (as in dreaming of enslaving the world).
>
> Eliezer:
>
> It seems to me that problems like these are also subject to
> renormalization. You would use the other principles to decide what to
> do about the local problem with panhuman grounding.
>
> Me:
>
> I agree.
>
> Eliezer:
>
> If that's not the answer you had in mind, could you please give a more
> specific example of a problem? It's hard to answer questions when
> things get this abstract.
>
> Me:
>
> Let's say 80% of humans not only want to be simply left alone to do
> their God-fearing stuff but also insist on striking down the infidels. 15%
> want to upload and go to the stars, and they wouldn't lift a finger to
> prevent malignant nanos (not really dangerous to their highly evolved
> selves) from eating the unenlightened brutes down below. 5% want to be
> nice to everybody, don't want to kill the high-techs, but would prohibit
> self-replicating nano development if needed to assure the continued
> ability of low-techs to, you guessed it, do their stupid God-fearing
> stuff.
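>
> To make the tradeoff concrete, here is a toy scoring of the "least
> negative impact on the largest number" idea against this scenario;
> every number and policy label below is invented purely for illustration:
>
>     # Group shares from the scenario above; harm scores are made up.
>     groups = {"theocrats": 0.80, "uploaders": 0.15, "protectors": 0.05}
>
>     # harm[policy][group]: 0 = that group's way of life is untouched,
>     # 1 = that group is wiped out or wholly overridden.
>     harm = {
>         "theocrats_rule":   {"theocrats": 0.0, "uploaders": 1.0, "protectors": 1.0},
>         "high_tech_rule":   {"theocrats": 1.0, "uploaders": 0.0, "protectors": 0.6},
>         "split_and_shield": {"theocrats": 0.2, "uploaders": 0.2, "protectors": 0.1},
>     }
>
>     def weighted_harm(policy):
>         return sum(groups[g] * harm[policy][g] for g in groups)
>
>     for p in sorted(harm, key=weighted_harm):
>         print(p, round(weighted_harm(p), 3))
>     # Which policy comes out on top depends entirely on how the harm
>     # scores are assigned; that assignment is exactly the judgment a
>     # single human would find hard to audit.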
>
> I know you cannot foresee the results of an SAI's analysis of such a
> situation, but I would think it might be difficult for any single human
> to understand and accept its suggestions as objective truth (unless, by
> a lucky coincidence, he/she already happens to hold superhuman
> idea(l)s).
>
> Me:
> > ---
> > ### And a few more comments:
> > I wonder if you read Lem's "Golem XIV"?
> > Oops, Google says you did read it. Of course.
>
> Eliezer:
>
> I've read some Lem, but not that one.
>
> Me:
>
> Sorry, I skimmed too quickly over a post where Ben Goertzel discussed
> Golem XIV in a reply to yours, and misinterpreted it. I loved the
> story, especially gems like the machine explaining to vis pet human
> that the thought occupying vis mind at that moment could be expressed
> as a human-understandable lecture lasting 135 +/- 5 years. Or the
> ultra-AI neutralizing Luddites by manipulating the likelihood of silly
> mishaps befalling would-be attackers.
>
> Me:
>
> > In a post on Exilist you say that uploading is a post-Singularity
> > technology.
>
> Eliezer:
>
> Yes, and I still hold to this.
>
> Me:
>
> > While I intuitively feel that true AI will be built well before the
> > computing power becomes available for an upload, I would imagine it
> > should be possible to do uploading without AI. After all, you need
> > just some improved scanning methods, and with laser tissue machining,
> > quantum-dot antibody labeling and high-res confocal microscopy, as
> > well as the proteome project, this might be realistic in as little as
> > 10 years (a guess). With a huge computer but no AI, the scanned data
> > would give you a human mind in a box, amenable to some enhancement.
>
> Eliezer:
>
> The statement about post-Singularity technology reflects relative rates
> of development, not an absolute technological impossibility.
>
> Me:
>
> OK. I agree here, but along with Eugene Leitl I wish the situation were
> the opposite: if huge resources were immediately poured into uploading
> research, we could still welcome the Singularity from inside Intel's
> boxes. Alas, this won't happen.
>
> Eliezer:
>
> Hopefully it will be pretty clear that the AI has "gotten" Friendliness
> and is now inventing vis own, clearly excellent ideas about
> Friendliness.
>
> Me:
>
> How do you verify the excellence of an SAI's ideas, and differentiate
> them from a high-level FoF (failure of Friendliness)?
>
>
> And on a sycophantic note:
>
> Probably SingInst has the most important job in the world. Uploading
> won't happen before AI, and somebody had better make it Friendly, or
> else nobody will ever know what hit us.
>
> Rafal
>


