The Observer as a Classical Computer in a Quantum Universe

By Dr. Jacques Mallah

A new paper, “What Does a Physical System Compute?”, will soon be available. The implementation criterion proposed in that paper is a refinement and extension of those in this paper.

Abstract:

Consequences of taking seriously the belief that the computational functioning of the human brain is responsible for conscious observations are explored in the context of quantum mechanics. This approach has the potential to result in an explanation of the quantum probabilities, without collapse of the wavefunction, in terms of the number of implementations of different computations. The result would be a computationalist wavefunction interpretation (CWI) with great advantages over any other interpretation of quantum mechanics. Obstacles to this goal are identified, particularly problems with the definition of the implementation concept, and proposals to overcome them are presented. A well-defined implementation criterion is required for any computationalist approach even for classical systems, so this issue is also relevant to the debate over whether an artificial digital intelligence should ever be considered conscious.

"Only the theory decides what one can observe." - Albert Einstein [8]

Introduction:

Are the aspects of the human brain that could be described as computational responsible for consciousness? Most physicists would probably say yes. This proposition, which is often discussed by philosophers, is perhaps best justified by the implausibility of any alternatives. As David Chalmers has pointed out [1], even if one were to hold the belief (known as dualism) that special, non-mathematical laws of nature are needed to allow consciousness, computationalism would still be the simplest plausible way to identify physical systems with the property of consciousness. Clearly, the proposition is at least worthy of study, not least because this belief would help determine the role of an artificial intelligence in society if one is constructed in the future.

In this paper, some consequences of the assumption of computationalism will be explored, especially regarding the implications for interpretation of quantum mechanics. It will be assumed that a physical system that performs a computation of the proper sort acts as a conscious observer. The question will then be asked whether a system that obeys the Schrödinger equation, without collapse of the wavefunction, would give rise to an ensemble of observers such that the predictions of quantum mechanics are reproduced as far as a typical observer could tell.

This would be a particular kind of many-worlds interpretation (MWI), to be called the computationalist wavefunction interpretation (CWI). One major difference compared to the standard MWI, Everett’s relative state interpretation (RSI), is that the derivation of the effective probabilities would be properly justified.

Ironically, there is a serious difficulty, which must be addressed, but it is not due to the use of quantum mechanics. The problem lies with computationalism itself. Even if classical physics held true, the problem would arise of identifying a physical system as implementing, or not implementing, a particular computation.

The Implementation Problem:

- A physical system is described by some set of physical states, labeled in some way, and laws that describe how the state of the system changes with time.

- A computation is described by some set of formal states, labeled in some way, and a transition rule indicating the state that the machine must enter at the next clock step as a function of the current state.

If a physical system "implements" a particular computation, that computation should serve as a sort of partial characterization of the way in which the physical system functions. Computationalism holds that the aspects of the physical system responsible for consciousness can be characterized in such a way. Thus, the "implementation" concept is useful only if it captures the relevant ways in which a system functions.

Traditionally, a computation was thought to be implemented if there exists a mapping from the physical states to the formal computer states such that, as the physical system evolves in time, the formal states to which the physical states are mapped undergo the correct sequence. Furthermore, a causal requirement is imposed: the physical laws must be such that this relationship would hold for any physical initial condition that is mapped to any of the formal states. Finally, each of the formal states must have at least one physical state mapped to it.
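As a minimal sketch (the names and data structures here are assumptions for illustration, not from the paper), the traditional criterion can be phrased as a check on a candidate mapping: every formal state needs a physical preimage, and the physical dynamics must track the formal transition rule for every mapped physical state.

    def traditionally_implements(phys_step, formal_states, formal_step, mapping):
        """mapping: dict from physical states to formal states (may be partial).
        phys_step / formal_step: the physical law and the formal transition rule."""
        # Each formal state must have at least one physical state mapped to it.
        if set(mapping.values()) != set(formal_states):
            return False
        # Causal requirement: for ANY mapped physical initial condition, the
        # physical successor must map to the correct next formal state.
        for p, f in mapping.items():
            if mapping.get(phys_step(p)) != formal_step(f):
                return False
        return True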

The problem, which was first raised by Searle but was most precisely stated by Chalmers [2], is that even a very simple physical system can be mapped to any arbitrary sequence of formal states, satisfying the above requirements but not performing the computation in any meaningful sense. A simple example is a system containing only a clock and a dial:

Clock and Dial "Implementation":

For a fixed state of the dial, the states of the clock are mapped to the formal states in the desired sequence. Other states of the dial can be used to allow as many independent sequences of states as are needed to make sure that all formal states can be used.
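A schematic version of the construction (illustrative names only): a "clock" that merely counts and a "dial" that never moves can be mapped onto any desired runs of formal states, so the traditional criterion is satisfied without the system doing anything that deserves to be called computing.

    def clock_dial_mapping(desired_runs):
        """desired_runs: one list of formal states per dial position; between
        them the runs should visit every formal state.  Returns a mapping from
        physical states (dial, clock) to formal states."""
        mapping = {}
        for dial, run in enumerate(desired_runs):
            for t, formal in enumerate(run):
                mapping[(dial, t)] = formal
        return mapping

    def clock_dial_step(state):
        dial, t = state
        return (dial, t + 1)      # the clock ticks; the dial stays fixed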

This shows that under the "traditional definition of implementation", all computations would be implemented equally.

At this point there are four possibilities, of which more than one could be simultaneously considered:

  1. Computationalism could be abandoned, perhaps in favor of something that relies on the properties of specific physical substances.
  2. Digital computationalism could perhaps be abandoned in favor of something that relies on the properties of continuous systems.
  3. The idea of a physical world could be abandoned, and (to see if computationalism is still viable) one could ask whether a ‘typical’ computation of the conscious variety, within the set of all computations itself, would seem to its observer to be in a universe such as ours. [3]
  4. Finally, one could try to restrict the definition of an implementation so as to rule out mappings that do not properly characterize the system in question.

Here I assume only the latter, a "conservative" approach. Chalmers [2] attempted to solve this problem by defining a combinatorial state automaton (CSA), which has multi-component states, labeled by more than one index (e.g. a string of bits). He argues that such extra structure is needed for consciousness anyway, as a Finite State Automaton (FSA), defined as having single-component states, does not display branching behavior.

However, in order for the CSA states to be any more resistant to false implementations, an additional requirement must be imposed. He suggested that the value of each index must depend on the contents of a separate region of space. This defeats clock and dial implementations because, if each bit depends on a different clock or dial, the desired transition rules will not hold for many possible initial conditions.

This proposal is problematic, however. It still allows certain implementations that seem false. This is because the information about the other regions can be copied into each region for the next time step, thus voiding the safeguard against contrived mappings that use global information. This copying does imply that, at each time step, each region will contain all the relevant information about the states of all the other regions at the last time step - so the amount of information that must be stored must grow exponentially with time.

For example, consider as a toy model of a physical system an initial state [a,b,c] that transitions into a state [abc,abc,abc] (where a, b, and c may be strings of bits or the like, the commas separate regions, and abc signifies the concatenation of the strings). The physical states may then be mapped to formal states which satisfy any desired transition rules.
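A sketch of this loophole (the toy names are assumptions for illustration): if the physical rule simply hands every region a copy of the whole previous state, a per-region mapping can read off any desired formal transition rule from any single region.

    def copying_step(regions):
        """Physical rule: every region receives the concatenation of all regions."""
        everything = "".join(regions)
        return [everything for _ in regions]

    state = ["a", "b", "c"]
    state = copying_step(state)    # -> ["abc", "abc", "abc"]
    state = copying_step(state)    # -> ["abcabcabc", ...]; storage triples each step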

The proposal also does not allow other implementations that seem valid (such as those in which the properties of spatially interpenetrating waves serve as bits). In the context of quantum mechanics, it is not obvious how to use or generalize Chalmers’ condition. Also, it seems to place a special importance on position space above and beyond the mathematical properties of that space; in other words, it seems to explicitly propose a new law of nature that specifically relates computations to position space. Chalmers is aware that a more satisfactory criterion is needed. [4]

Shifting Mappings:


In order to accommodate certain difficulties, the mapping from physical states to initial formal states should not be required to be the same as that from physical states to final formal states. For example, suppose there are N bits along a circle, which "almost" implement some formal computation, except that the result is shifted along by one bit each time step. This would be a legitimate implementation if the initial and final mappings can be different. Rotation of an object composed of waves might be a case where such a condition is needed.
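A small sketch of the idea (illustrative only): if the physical register of N bits is rotated by one position each step, no single fixed mapping tracks the formal bits, but a time-indexed family of read-out mappings does.

    def rotate(bits):
        return bits[-1:] + bits[:-1]              # physical update: shift by one

    def read_out(bits, t):
        n = len(bits)
        return tuple(bits[(i + t) % n] for i in range(n))   # undo t rotations

    bits0 = (1, 0, 1, 1)
    bits1 = rotate(bits0)
    assert read_out(bits1, 1) == bits0    # the shifted mapping recovers the register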

Of course, restrictions such as Chalmers’ (generalized to allow the different initial and final mappings) must be imposed on the mappings along with the causal requirement of correct state transitions for all initial conditions. This means that computations with permutations or negation of states, which are not dependent on the initial condition, are considered equivalent. For example, the computation with two bits (a,b) → (b,a), or even (a,b) → (a, not b), is considered the same as (a,b) → (a,b), but (assuming for the moment that Chalmers’ restriction is used) one such as (a,b) → (a, a xor b) is not considered the same as those.

Physical Variables:

One way to improve upon the excessive focus on spatial position in Chalmers’ proposal, though clearly not a full solution of the implementation problem, could be to require that each index depends on a separate set of "physical variables" (rather than a separate region of space). In classical particle mechanics, that would mean depending on a separate set of particles. In nonrelativistic QM, the analogous condition would be that each index depends on the amplitudes of a separate set of orthogonal states (or on a separate set of components of such amplitudes). This does imply the use of some fixed "natural basis" of quantum states.

For example, for a system of two quantum spins, where each spin has two states, there are a total of four complex amplitudes of orthogonal states, or eight "physical variables". One possible mapping would be to have four bits that index the formal states, where a bit is set to 1 if some physical variable exceeds a certain value and is set to zero otherwise.

For this example, the amplitude of the state |+,+> might determine the value of the first bit, and so on. The four bits here would depend on the amplitudes of |+,+>, |+,->, |-,+>, and |-,-> respectively. This is different from the expectation that each spin is like a bit, so it needs careful study; this strange property may be reduced with the use of partial overlap (see below). Likewise, for a system of N spinless quantum particles, each formal state might depend on the values of the wavefunction Ψ(x1,...,xN) in a separate fixed region of "classical configuration space".
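As a toy illustration of this proposal (the threshold, the basis, and the particular amplitudes are assumptions made for the example, not part of the proposal itself), each formal bit is made to depend on the magnitude of one amplitude and on nothing else:

    def amplitude_bits(psi, threshold=0.5):
        """psi: length-4 list of complex amplitudes for |+,+>, |+,->, |-,+>, |-,->.
        Returns one formal bit per amplitude: 1 if its magnitude exceeds threshold."""
        return tuple(int(abs(a) > threshold) for a in psi)

    psi = [0.8 + 0j, 0.0, 0.6 + 0j, 0.0]    # normalized: 0.8**2 + 0.6**2 = 1
    print(amplitude_bits(psi))              # -> (1, 0, 1, 0)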

Partial Overlap:

A further improvement could be to allow some overlap of the physical variables that determine each formal index. First, if the conditions on the variables involved in the overlap are the same for all the states, there should be no problem. The reason to allow this is so that, for example (assuming classical mechanics here), a restriction could be imposed that no giant hammer is allowed to smash the computer between time steps, even though the particles in the hammer are not otherwise involved in the mapping. But allowing a global restriction on all physical variables would invite false mappings.

It seems that more overlap should be allowed. Two possibilities are:

I. Each index, i, should have some set of physical variables that does not determine the value of any other index, and which does affect the value of the index i. The index is allowed to depend, as well, on another set of physical variables that are also used by other indices.

II. Each pair of indices i,j should have some set of physical variables that does affect the value of one but not the other, and vice versa for a second set of variables. An index is allowed to depend, as well, on another set of physical variables that are also used by other indices.

For example, suppose the physical system is discrete and consists of four physical "bits": a, b, c, d. A mapping is to be made onto a set of two formal bits: A, B. One possible mapping is: A = (a OR b) AND NOT d, B = (a OR c) AND NOT d. This mapping is to be considered legitimate despite the overlap of dependence on "a" and "d". (With only two formal bits, proposal II reduces to I in this case.) This type of overlapping mapping may help with regard to finding a reasonable set of formal bits for a quantum system.
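A brute-force check (with an assumed definition of what it means for a variable to "affect" an index) that this example satisfies proposal I: the physical bit "b" influences A but never B, "c" influences B but never A, while "a" and "d" are shared.

    from itertools import product

    def A(a, b, c, d): return (a or b) and not d
    def B(a, b, c, d): return (a or c) and not d

    def affects(f, var_index):
        """Does flipping one physical bit ever change f's output?"""
        for bits in product([0, 1], repeat=4):
            flipped = list(bits)
            flipped[var_index] ^= 1
            if f(*bits) != f(*flipped):
                return True
        return False

    # indices: 0 = a, 1 = b, 2 = c, 3 = d
    assert affects(A, 1) and not affects(B, 1)   # "b" is private to A
    assert affects(B, 2) and not affects(A, 2)   # "c" is private to B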

Outlaw Mappings:

Of course, false mappings of the sort Chalmers’ proposal is vulnerable to are still possible. Now, only the information that not every bit is already allowed to depend upon needs to be copied into each region, and the amount of information stored need only grow linearly with time, since the previously copied information can be made globally available. Additional restrictions are needed.

One possibility is simply to disallow a mapping in which the physical variables that determine the value of a component of a formal state contain all the information needed to infer the values of the formal components which determined it at the previous time step. The exception of a component that is determined by only one component is allowed. For example, if the physical system evolves as (a,b) → (ab,ab), where ab is a concatenation of a with b, then using the entire regions separated by commas to determine the desired formal states (which would permit arbitrary transition rules) will not be allowed. If by contrast (a,b) → (b, a OR b) is what the physical system does, then the regions separated by commas can be used.
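A rough sketch of how such a test might look (the formulation here is an assumption, not a final criterion): a region is disqualified if its new contents alone suffice to reconstruct the entire previous state that fed into it.

    from itertools import product

    def region_recovers_everything(region_update, n_bits):
        """region_update: map from the full previous state (a tuple of bits) to
        the new contents of ONE region.  Returns True if distinct previous
        states never collide, i.e. the region retains all the information."""
        seen = {}
        for prev in product([0, 1], repeat=n_bits):
            out = region_update(prev)
            if out in seen and seen[out] != prev:
                return False
            seen[out] = prev
        return True

    copy_all = lambda s: s               # the (a,b) -> (ab,ab) case
    or_only  = lambda s: s[0] | s[1]     # second region of (a,b) -> (b, a OR b)

    print(region_recovers_everything(copy_all, 2))   # True  -> mapping disallowed
    print(region_recovers_everything(or_only, 2))    # False -> mapping allowed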

Properties Desired of Possible Solutions:

If this set of proposals is to be adopted, care must be taken that it rules out false implementations, but also that it does not rule out valid implementations that may be needed for human consciousness. Some additional considerations of a more general nature, for any proposed solution, are described below.

1) Intuitive characterization:

The mapping from physical to formal states for an implementation should be a characterization of the structure of the system. Intuitively, there seems to be a distinction between simple, true implementations and complicated, false ones. It is possible that the proper framework capable of addressing the problem is that of algorithmic (Kolmogorov) complexity theory.

2) Definiteness:

It is desirable, although perhaps not necessary, that it be possible to definitely say that a system does, or does not, implement a particular computation. The alternative to this is to allow the measure of each computation (in the sense of an effective probability) to be very small for ‘wrong’ computations as compared to that of ‘right’ ones. In this context, measure is defined as a quantity such that the effective probability that a computation would be "c" is proportional to the measure M(c).

"Effective probability" is a concept most readily defined for conscious computations, in which case it is a value that serves the role that a probability would play if one could "randomly select" such a computation. In other words, the effective probability distribution is the ideal Bayesian probability distribution that an observer should best use to generate retrodictions in the absence of any other information. This is a generalization of the idea that M(c) is equal to the number of ways in which "c" is implemented; one would prefer to avoid the need for such a generalization because the concept of effective probability is easier to understand if it represents an actual ratio. However, for all proposals in this paper, the measure M(c) will indeed be proportional to the number of something, or rather it will be before some limit is taken.

For example, if the physical system were a brain, an example of one of the ‘right’ computations would be one a cognitive scientist might associate with that brain state, while a ‘wrong’ computation would be something that should be associated with a completely different brain state. With no definiteness, the problem is that even a clock and dial system is certainly allowed to implement all computations – although the effect of those computations on the measure in a realistic system might be small. In fact this is very closely related to the idea of replacing physics with the set of all computations. [3] It makes the most sense in the latter context since the effective probabilities in a clock and dial system with ‘radical interpretations’ (all possible mappings) allowed are essentially the same as those for the set of all computations.
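As a minimal illustration of the definitions above (names assumed, not from the paper), the effective probability is simply a normalized measure, and 'wrong' computations are tolerable only insofar as their measure is negligible:

    def effective_probability(M, c):
        """M: dict from each conscious computation to its measure M(c)."""
        return M[c] / sum(M.values())

    # e.g. M = {"right": 1.0e6, "wrong": 1.0}: the 'wrong' computation is not
    # strictly ruled out, but its effective probability is negligible.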

3) Simulation principle:

Most computationalists believe the following:

If system A implements computation B, and computation B is a simulation of a system that implements computation C, then system A must implement computation C.

A discrete physical system, if it existed, would be exactly analogous to a computation. One can therefore say that one computation can implement another, based on the same implementation criterion as for a physical system. (The implicit assumption is that the set of computations that could be implemented by a discrete system is not empty.)

To make the simulation principle into a restriction on possible criteria for the implementation of a computation, it should then be stated as follows: If system A implements computation B, and computation B implements computation C, then system A must implement computation C.

This can also be used to extend a definition of implementation to include such ‘simulated implementations’. The second proposal for partial overlap (II) violates the simulation principle and may thus be a candidate for such an extension.
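Stated abstractly, the principle is a closure (transitivity) condition on whatever implementation relation is eventually adopted; a schematic check, with the relation itself left as an unspecified argument:

    def satisfies_simulation_principle(implements, systems, computations):
        """implements(x, y): a candidate criterion, True if x implements y.
        Checks that implementation composes through simulations."""
        return all(implements(a, c)
                   for a in systems
                   for b in computations
                   for c in computations
                   if implements(a, b) and implements(b, c))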

4) Additivity of measure:

If a physical system would give rise to M1(c) implementations for each computation "c", and a second physical system would give rise to M2(c) implementations for each computation "c", then a physical system that contains a set of variables that behave as in system 1 and also a second set that behave as in system 2, should give rise to K (M1(c)+ M2(c)) implementations for each computation "c", where K is some constant independent of c. This property means that independent parts of a system should not affect each other.

The Problem of Size:

A disturbing possibility is that the measure of a brain (the number of ways in which it could be said to implement its conscious computation) could depend on the size or structure of the brain. It is conceivable, for example, that a creature could be constructed which thinks like a human, but whose brain does not implement its computation, as mathematically defined, in nearly as many ways. If this were known, such a creature would be at great risk of enslavement by regular humans. The opposite scenario would also be possible. It is to be hoped that measure is not sensitive to the details of the construction of a brain, but answering this question before artificial intelligences are constructed would be advisable.

Deriving the Effective Probabilities:

Suppose, hypothetically, that the problems in defining a unique and proper implementation criterion can be overcome. The next step in applying computationalism to quantum mechanics would then be to see if the appearance of the standard rules for probability could be derived using the wavefunction without collapse.

Such probabilities would only be effective probabilities, because there is nothing non-deterministic in the model to be considered. An effective probability in this context is defined as the ratio of the number of observations of a given type to the total number of observations. In this case, this will be a ratio of the number of implementations of a computation of a given conscious type to the number of implementations summed over all conscious types.

For example, consider an idealized case in which a man uses a Stern-Gerlach device to measure the spin of a spin-½ atom along the Z direction. If the normalized initial state of the spin was (a |+> + b |->), and the initial state of the uncorrelated remainder of the system was |ready>, by linearity the state of the system will become (a |+>|’up’> + b |->|’down’>), where in the state |’up’> the man measured spin up, and in the state |’down’> he measured spin down (along Z). In the many-worlds interpretation, this is taken to mean that both spin up and spin down are measured. The standard probabilities of |a|^2 for spin up and |b|^2 for spin down must be derived as effective probabilities for each of these two types of computation implemented by the man’s brain.
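The target of the derivation (not the derivation itself) can be stated numerically: the counts of implementations of the 'up-seeing' and 'down-seeing' computations would have to come out proportional to |a|^2 and |b|^2. The amplitudes below are arbitrary illustrative values.

    a, b = 0.6, 0.8j                        # normalized: |a|^2 + |b|^2 = 1
    M_up, M_down = abs(a)**2, abs(b)**2     # what the implementation counts must
                                            # be proportional to for CWI to work
    p_up = M_up / (M_up + M_down)
    print(p_up, 1 - p_up)                   # -> approximately 0.36 and 0.64,
                                            #    the standard predictions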

First, an argument must be made to the effect that the system does in fact have at least some relevant implementations of the appropriate computations. It is not enough to make one mapping, taking projections onto the Hilbert space to the formal computer states, because for a fixed mapping from physical to formal states, it is by definition impossible for the physical state to correspond to two different formal states. Thus one cannot say that, in a superposition like (a |+>|’up’> + b |->|’down’>), for a given mapping, the formal computer is in both an "up-seeing" and a "down-seeing" formal state. Instead, it must be true that for some possible mappings that state is mapped to an "up-seeing" state, while for others it is mapped to a "down-seeing" state.

These mappings might have bits that depend on different physical variables: for example, the first mapping could have bits that depend on the physical states Ψ(+, etc.), while the second could be independent of those but depend on what the physical states Ψ(-, etc.) are doing. The details of how this would work remain to be worked out. A large degree of decoherence would probably increase the number of mappings with the correct formal state transitions compared to what it would be with only a small degree of decoherence.

Note that, despite the assumption here that the natural or "real" spin separates the mappings cleanly that way, it is unlikely that the S-G apparatus would have been pointing along the exact direction of the "real" spin basis. Thus in the real basis, the first mapping would have to depend both on physical states with a "real +" spin, and on those with a "real -" spin. Because the S-G device causes a spatial separation of the spin’s components in the measured basis, however, that is not much of a problem in this example.

Appendix A: Absolute Algorithmic (Kolmogorov) Complexity:

Some approaches use the concept of algorithmic complexity. For example, O’Rourke proposed, in effect, that the measure associated with an implementation be inversely proportional to the complexity of the mapping. [6] A better version might be exponential suppression of the measure as a function of complexity, much as, in the algorithmic-complexity-based theory of induction, the measure of a string is exponentially suppressed by its complexity. [5] That would allow all systems to have conscious computations, but some would have much more measure than others would.

Unfortunately, defining complexity unambiguously is another important problem. The Kolmogorov complexity of a symbol string is defined as the length of the shortest program, for a particular Turing machine (TM), which will output that string.

A related quantity, the measure m(s) of the string "s", defined as the fraction of randomly chosen programs for that TM which output the string, could be used instead. Randomly chosen programs are strings with an equal probability for each allowed symbol. The measure decreases exponentially as the length of the shortest desired program increases. This measure is not to be confused with the measure M(c) of a conscious computation "c", although one may entertain the idea of relating them [3] (see Appendix B below).
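A finite caricature of this measure (the "machine" below is made up purely for illustration; the real definition uses a universal Turing machine and is uncomputable): enumerate all binary programs up to some length and weight each by 2^(-length).

    from itertools import product
    from collections import defaultdict

    def toy_interpreter(program):
        """Stand-in machine: output the program with every bit doubled."""
        return "".join(bit * 2 for bit in program)

    def toy_measure(max_len=10):
        m = defaultdict(float)
        for length in range(1, max_len + 1):
            for bits in product("01", repeat=length):
                program = "".join(bits)
                m[toy_interpreter(program)] += 2.0 ** (-length)
        return m

    m = toy_measure()
    print(m["1100"])   # 0.25: only the length-2 program "10" produces "1100"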

The problem is the dependence on the choice of Turing machine. For any particular string, the complexity could be very small if a particular Turing machine is used, but large if another had been used. This ambiguity in the notion of complexity is not very important in most practical applications, because for two fixed Turing machines the difference in complexity does not grow to infinity as the complexity of the string goes to infinity, but remains bounded by a (usually small) constant.

An absolute algorithmic complexity must be defined if it is to play any role in solving the implementation problem. One possibility is simply to fix a particular TM as the one to be used, but if the resulting theory is to have any fundamental (as opposed to merely practical) significance whatsoever, there must exist some non-arbitrary reason to choose that particular TM. It is certainly true that some TMs are more "complicated" than others are; TMs used in practice tend to be not very complicated, which helps explain why the differences are not usually of practical importance.

But it is not obvious what the one best "least complicated TM" is. One factor to consider is the minimum state-symbol product for TMs. [5] Another approach is that of Tromp [7]. Instead of fixing a TM, another possibility is to try to average over all possible TMs. The problem with this is that, because there are infinitely many TMs, the order in which the average is taken could play a role; choosing an algorithm to define the order would then take the role that the choice of TM had played, and is equivalent to choosing a TM. This possibility should be carefully investigated; perhaps an iterative averaging scheme would give a unique solution.

Appendix B: Everything

It is conceivable that all possible mathematical systems have "physical reality", in the same sense that the universe that implements us has such reality; this is called the All-Universes Hypothesis (AUH). [3,9] Indeed, belief in Occam’s razor (as made precise in the context of complexity theory [5] and adjusted to take anthropic selection into account) makes such a belief inevitable. [10] In such a context, computationalism could be maintained in either of two ways: as usual (if the usual way is indeed well defined), or by not allowing implementations. In the latter case, our conscious computations would simply be among those computations that have existence as mathematical structures in their own right. In this sense, it is possible to have computationalism without materialism.

An argument has been made that a typical conscious computation in such an ensemble could see a world roughly as complex as our own, where the measure distribution is based on an equal distribution of all possible TM programs. Defining such a measure distribution involves similar problems to defining an absolute Kolmogorov complexity. [11, 12]

References:

1. Chalmers, David. The Conscious Mind, (c) 1996. Oxford University Press.

2. Chalmers, David. "Does a Rock Implement Every Finite State Automaton?" Synthese 108: 309-33, 1996

3. Schmidhuber, J. "A computer scientist's view of life, the universe and everything." In C. Freska, M. Jantzen, and R. Valk, editors, Foundations of Computer Science: Potential-Theory-Cognition, volume 1337 of Lecture Notes in Computer Science, pages 201-208. Springer, Berlin, 1997.

4. Chalmers, David. Personal communication.

5. Li and Vitanyi, An Introduction to Kolmogorov Complexity and Its Applications, 2nd. ed., (c) 1997. Springer-Verlag.

6. O’Rourke, Joseph. "Kolmogorov Complexity as a Tool in the Philosophy of Mind". Preprint.

7. Tromp, John. Kolmogorov Complexity in Combinatory Logic. http://www.cwi.nl/~tromp/cl/CL/CL.html

8. W. Heisenberg, Der Teil und das Ganze, R. Piper & Co., Munich (1969). Translation by G. Holton.

9. Tegmark, Max. "Is 'the theory of everything' merely the ultimate ensemble theory?" Annals of Physics 270, 1-51.

10. Mallah, Jacques. Is the AUH falsifiable? No. Everything-list archive: http://www.escribe.com/science/theory/m1951.html See also other posts in the archive.

11. Everything-list archive: http://www.escribe.com/science/theory/

12. Schmidhuber, J. Algorithmic Theories of Everything, quant-ph/0011122