Re: [sl4] prove your source code

From: Matt Mahoney (matmahoney@yahoo.com)
Date: Mon Oct 06 2008 - 06:58:20 MDT


--- On Sun, 10/5/08, Rolf Nelson <rolf.h.d.nelson@gmail.com> wrote:

> A belated response to an interesting thread:
>
> On Mon, Jul 14, 2008 at 3:14 PM, Wei Dai <weidai@weidai.com> wrote:
>
> > Putting aside the issue of superrationality for now, I wonder if anyone
> > else finds it plausible that two dissimilar SIs can know each other's source
> > code. If we assume that they start out without such knowledge, but each
> > wants the other to gain it, what can they do? Can one SI prove to another what
> > its source code is? Or, is there some other argument for why SIs might know
> > each other's source code (beyond "we don't know what SIs might be capable
> > of, so we can't rule it out")?
>
>
> I don't have a certain answer. You have a prover agent P who's trying to
> prove to an observer agent O that P's source code has goal G. To be more
> precise, O and P each start controlling their own region of space. P wants
> to prove that P's region of space is controlled by an agent that wants to
> achieve G.
>
> Maybe O asks to send a set of probes into P's space that will "sample"
> random regions of P's spacetime and confirm that they're occupied by mini
> AGIs that are actively enforcing G. O would worry that P might rewrite the
> probes with "everything checks out great" data before allowing the probes to
> return to O, so O would have to somehow ensure that the probes' integrity is
> maintained while outside the space that O controls. Perhaps the probes could
> contain a secret key that self-destructs (is impossible to read) if the
> probe is tampered with; if a probe comes back without the same secret key
> then O would know P destroyed the probe and constructed a fake copy loaded
> with false observations. Not sure how you could create such a probe; maybe
> quantum no-cloning theorems could be of assistance in constructing it.
>
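A rough sketch of the tamper-evident probe idea in the paragraph above, in Python. It assumes an ordinary HMAC key stands in for the self-destructing secret; the `Probe` class, its methods, and `verify` are hypothetical names for illustration, and nothing here models the physical (or quantum) part that would make the key unreadable.

```python
import hashlib
import hmac
import secrets


class Probe:
    """Hypothetical probe: holds a secret key that is erased on tampering."""

    def __init__(self, key: bytes):
        self._key = key

    def tamper(self) -> None:
        # Assumption: any tampering irreversibly destroys the key.
        self._key = None

    def report(self, observations: bytes):
        # Tag the observations with the secret key, if the key still exists.
        if self._key is None:
            return observations, b""
        tag = hmac.new(self._key, observations, hashlib.sha256).digest()
        return observations, tag


def verify(key: bytes, observations: bytes, tag: bytes) -> bool:
    """O's check: the report must carry a tag made with O's original key."""
    expected = hmac.new(key, observations, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


key = secrets.token_bytes(32)          # known only to O and the probe
honest = Probe(key)
obs, tag = honest.report(b"region 12: enforcing G")
print(verify(key, obs, tag))           # report from an intact probe

rewritten = Probe(key)
rewritten.tamper()                     # P opens the probe, so the key is lost
obs, tag = rewritten.report(b"everything checks out great")
print(verify(key, obs, tag))           # report from a tampered probe
```

The whole scheme rests on the assumption inside `tamper`: if P could read the key without destroying it, P could forge any report.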
> If there is a benefit to smuggling weaponry into your rival's space, then P
> might want to audit that the probes are harmless before allowing them into
> his space. That part seems easier; even if there isn't a way to scan a probe
> for weaponry without threatening its secret, probably you could still use
> fancy cryptographic tricks so that P can scan and blow up probes #5, 6, and
> 17, and then prove afterwards to O that P's decision to blow up those
> particular probes was pre-determined, and therefore that those probes
> weren't blown up because those particular probes stumbled upon Something
> They Weren't Supposed To See.
>
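The "pre-determined" proof in the paragraph above resembles an ordinary commit-then-reveal scheme. A minimal sketch, assuming a salted SHA-256 hash commitment; the function names are hypothetical, and the probe numbers are the ones from the quoted text. P would send the digest to O before any probe enters P's space and reveal the nonce and the list afterwards.

```python
import hashlib
import secrets


def commit(probe_ids):
    """P commits to the probes it will destroy before seeing what they observe."""
    nonce = secrets.token_bytes(16)    # hides the choice until the reveal
    message = ",".join(map(str, sorted(probe_ids))).encode()
    digest = hashlib.sha256(nonce + message).hexdigest()
    return digest, nonce


def check_reveal(digest, nonce, probe_ids):
    """O recomputes the hash from the revealed nonce and probe list."""
    message = ",".join(map(str, sorted(probe_ids))).encode()
    return hashlib.sha256(nonce + message).hexdigest() == digest


digest, nonce = commit([5, 6, 17])     # digest goes to O up front
print(check_reveal(digest, nonce, [5, 6, 17]))   # the list P committed to
print(check_reveal(digest, nonce, [5, 6, 18]))   # a list changed after the fact
```

This only shows that the list was fixed in advance. It says nothing about how P chose the list, which O might also want to constrain.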
> Once O verifies that X percent of P's space is filled with G-friendly
> nano-AIs, then O probably won't worry that some anti-G time-bomb is hiding
> in some corner of P's space, since the rest of the G-friendly mini-AIs
> wouldn't *want* such a time bomb to continue existing and would search for
> and destroy such a time bomb if there was a chance of it existing.
>
> -Rolf

I think you need to define more precisely what it means for an agent to "know" another's source code. If you mean that agent A can simulate agent B's program such that for any input x, A can predict B(x), then it is not possible for A and B to know each other's code. If A knows B, then K(A) > K(B), where K is Kolmogorov complexity. Therefore it is not possible for B to also know A, which would imply K(B) > K(A). The best you can do is a probabilistic model that allows for some chance of error in predicting either A(x) by B or B(x) by A.
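
Written out, the argument is a short contradiction. It takes as its premise the claim above that exact simulation forces a strict inequality in Kolmogorov complexity; "knows" is shorthand for that exact simulation.

```latex
\begin{align*}
A \text{ knows } B &\implies K(A) > K(B), \\
B \text{ knows } A &\implies K(B) > K(A), \\
\text{both} &\implies K(A) > K(B) > K(A), \quad \text{a contradiction.}
\end{align*}
```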

By similar reasoning, an agent cannot predict its own actions with certainty, since doing so would imply K(A) > K(A).
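
A toy illustration of why certain self-prediction runs into trouble, in Python. This is a diagonalization sketch, a different technique from the Kolmogorov complexity argument, and it only covers an agent that can act on a prediction of itself; all function names are hypothetical.

```python
def contrarian(predictor):
    """Hypothetical agent: asks for a prediction of itself, then does the opposite."""
    return not predictor(contrarian)


def says_true(agent):
    return True


def says_false(agent):
    return False


def simulate(agent):
    """A predictor that tries to be exact by running the agent on itself."""
    return agent(simulate)


for predictor in (says_true, says_false):
    predicted = predictor(contrarian)
    actual = contrarian(predictor)
    print(predictor.__name__, predicted, actual)   # the two values always differ

try:
    simulate(contrarian)
except RecursionError:
    print("exact self-simulation never terminates")
```

Any fixed guess is wrong by construction, and the one predictor that tries to be exact recurses without end.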

-- Matt Mahoney, matmahoney@yahoo.com


