Abstract

We describe the TinyRAM architecture: a RISC random-access machine with a Harvard architecture and word-addressable random-access memory.

The Succinct Computational Integrity and Privacy Research (SCIPR) project constructs mechanisms for proving correct execution of TinyRAM programs, and TinyRAM is designed for efficiency in this setting.

In particular, TinyRAM strikes a balance between two opposing requirements:

1. Sufficient expressibility to support short and efficient assembly code when compiling from high-level programming languages, and
2. Small instruction set, with instructions that are simple to verify by arithmetic circuits, resulting in efficient verification using SCIPR’s algorithmic and cryptographic mechanisms.

This specification greatly benefited from comments by the SCIPR Lab programming team: Ohad Barta, Lior Greenblatt, Shaul Kfir, Michael Riabzev, Gil Timnat, and Arnon Yogev.
1 Introduction

The need to efficiently express correctness of nondeterministic computations arises in various applications that utilize proof systems for achieving certain security properties. For instance, this need arises in zero-knowledge proofs, probabilistically-checkable proofs, and others.

We describe the TinyRAM architecture, which is a random-access machine designed to be a convenient tool to efficiently express correctness of nondeterministic computations. TinyRAM is a reduced instruction set computer (RISC) with a Harvard architecture and word-addressable random-access memory. It strikes a balance between two opposing goals:

- Having an architecture that is expressive enough to allow for short and efficient assembly code obtained by compiling programs written in high-level programming languages; and

- Having an architecture that is minimalistic enough to allow for efficient reductions from the correctness of program computations to arithmetic circuit satisfiability (and other algebraic constraint satisfaction problems).

TinyRAM was introduced by Ben-Sasson et al. [BCG+13] in order to express correctness of nondeterministic computations in the setting of verifiable delegation of computations. For a discussion on the engineering choices behind the design of TinyRAM, see [BCG+13]; in this document, we focus on a precise specification of TinyRAM.
Illustrative application: succinctly verifying nondeterministic computations. We describe a simple example that motivates the need for expressing the correctness of nondeterministic computations in security applications.

Consider two parties, Alice and Bob, who respectively own inputs $x$ and $w$. Alice wishes to learn the correct output of an algorithm $A$ on input $(x, w)$, but does not want to incur the cost of computing the algorithm’s output $z := A(x, w)$. Specifically, Alice is only willing to run in time that is proportional to the length of her own input (i.e., $|x|$) and the length of the output (i.e., $|z|$) but is not willing to run in time that is proportional to Bob’s input (i.e., $|w|$) nor to the time needed to compute the algorithm’s output $z$.

If Alice trusts Bob (and Bob is indeed honest), then Alice’s efficiency requirement can be easily met as follows: Alice sends $x$ to Bob, then Bob computes $z = A(x, w)$ and sends $z$ to Alice. Alice thus learns $z$ without incurring the cost of computing $z$.

But what if Alice does not trust Bob? Can the efficiency requirements still be met? In such a case, Bob needs to convince Alice that his claimed computation’s output $\tilde{z}$ does equal the correct output $z$. In other words, after learning $x$ from Alice, Bob wants to convince Alice of the statement “there exists $w$ such that $\tilde{z} = A(x, w)$”. Using terminology from Theoretical Computer Science, such a statement is nondeterministic in the sense that Bob’s input $w$ is not fixed by the statement but rather is existentially quantified: Alice only cares that there exists some choice for Bob’s input that correctly produces the claimed output $\tilde{z}$. (The input $w$ is sometimes called a “witness” because it witnesses the fact that the output $\tilde{z}$ is a legitimate output of the computation.)

Crucially, Bob must use a very efficient method to convince Alice that the aforementioned nondeterministic statement holds, because Alice is not willing to compute $z$ from scratch. (In particular, Bob cannot simply send $w$ to Alice as “proof” that $\tilde{z} = A(x, w)$.) So Bob needs to use a cryptographic tool that is known as a proof system with succinct verification, which is a proof system that enables one party (the prover) to convince another one (the verifier) of the truth of a nondeterministic statement, while requiring the other party to invest resources that are proportional only to the nondeterministic statement’s size.

Being able to formally express nondeterministic statements is crucial for using such a proof system in practice, and TinyRAM can be used to efficiently express nondeterministic statements.

---

1 An important special case of this setting is when $w$ is the empty string; in such a case Alice wishes to enlist Bob’s help in computing the output $z$ of the algorithm $A$ when given her input $x$.

2 Note that, in the example mentioned above, the statement size is indeed proportional only to $|x|$ and $|z|$, as well as the size of the description of $A$, but is not proportional to $|w|$, or the time to compute $z$. 
2 Architecture Overview

TinyRAM (version 0.991) is parametrized by two integers: the word size, denoted \( W \), and the number of registers, denoted \( K \). (At times, to avoid confusion, we explicitly denote this by using the notation TinyRAM\(_{W,K}\).) The state of the machine consists of the following.

- The program, denoted \( P \). Following the Harvard architecture paradigm, \( P \) is stored in a separate, read-only, address space (i.e., different from the read-write data address space).
- The program counter, denoted \( pc \); it consists of \( W \) bits.
- \( K \) general-purpose registers, denoted \( r_0, r_1, \ldots, r(K - 1) \); each register consists of \( W \) bits.
- The (condition) flag, denoted \( flag \); it consists of a single bit.
- Memory, which is a linear array of \( 2^W \) words of \( W \) bits each.
- Two tapes, each containing a string of \( W \)-bit words; each tape is read-only in one direction. One tape is for a primary input \( x \) and the other tape is for an auxiliary input \( w \).

We treat the primary input as given, and the auxiliary input as nondeterministic advice. (See discussion about TinyRAM accepting computations in Section 3.)

The initial state of the machine is as follows: the contents of \( pc \), all general-purpose registers, \( flag \), and memory are all 0; the content of one tape defines the primary input and that of the other tape defines the auxiliary input.

The program \( P \) is a sequence of instructions. We shall specify the available instructions in Section 4; briefly, the instruction set of TinyRAM includes simple load and store instructions for accessing random-access memory, as well as simple integer, shift, logical, compare, move, and jump instructions. In particular, TinyRAM programs can efficiently implement control flow, loops, subroutines, recursion, and so on. Complex instructions, such as floating-point arithmetic, are not directly supported and can be implemented “in software” by TinyRAM programs.

At every time step, TinyRAM fetches and executes the \( pc \)-th instruction of \( P \).\(^3\) The only input to \( P \) is via the two input tapes, and the only output is via an answer instruction (which also terminates execution) that has a single argument \( A \), representing the return value. The return value \( A = 0 \) by default means “accept”. (If \( pc \) is not an integer in \( \{0, \ldots, L - 1\} \), where \( L \) is the number of instructions in \( P \), then the instruction answer 1 is fetched as default.)

See Figure 1 below for a diagram of TinyRAM.

---

\(^3\)We defined \( pc \) as a \( W \)-bit string. We use \( pc \) to also denote the corresponding integer between 0 and \( 2^W - 1 \).
3 Accepting Computations

In Section 1 we stated that TinyRAM is designed to be a convenient tool for expressing the correctness of nondeterministic computations. We now formalize what are the “good” nondeterministic computations on a TinyRAM machine.

We first introduce the notion of acceptance for a TinyRAM computation: a computation on TinyRAM is accepting (i.e., “good”) if execution halts with the instruction $\text{answer} \ 0$. More precisely, fix a word size $W$ and number of registers $K$, let $P$ be a TinyRAM$_{W,K}$ program, and let $x$ and $w$ be strings of $W$-bit words. We say that $P(x, w)$ accepts in $T$ steps if $P$, with $x$ as primary input and $w$ as auxiliary input, executes the instruction $\text{answer} \ 0$ in step $T$.

The set of accepting computations is $L = \bigcup_{W,K} L_{W,K}$, where $L_{W,K}$ is the set of triples $(P, x, T)$ where $P$ is a TinyRAM$_{W,K}$ program, $x$ is a string of $W$-bit words, and $T$ is a time bound, such that there exists a string $w$ of $W$-bit words for which $P(x, w)$ accepts in $T$ steps.

Now, given a nondeterministic computation, it is straightforward to encode it into a triple $(P, x, T)$ such that $(P, x, T) \in L$ if and only if the nondeterministic computation is correct. For instance, consider the statement “there is some $w$ such that $\tilde{z} = A(x, w)$” that we mentioned in Section 1; then consider the TinyRAM program $P$ that works as follows: given as input $((A, x, \tilde{z}), w))$, compute $z = A(x, w)$, and halt with $\text{answer} \ 0$ if $\tilde{z} = z$ and otherwise halt with $\text{answer} \ 1$. Clearly, if one is convinced that $(P, (A, x, \tilde{z}), T) \in L$, for some $T$, then one is also convinced that $A(x, w)$ outputs $\tilde{z}$ within $T$ steps for some choice of $w$.

Thus, the above is the precise sense in which we mean that TinyRAM is a convenient tool for expressing the correctness of nondeterministic computations.

---

4 See more details about the $\text{answer}$ instruction in Section 4.

5 We thus model all nondeterministic choices via a witness that is given as auxiliary input to the machine. In particular, reading the next $W$-bit word from the auxiliary input is a “nondeterministic instruction”.

6 Using time bounds is necessary for avoiding running into problems with the undecidability of the halting problem.
4 Instructions

The instruction set of TinyRAM consists of 27 instructions. Each instruction is specified via an opcode and up to three operands. An operand can be a register name (i.e., an integer between 0 and $K - 1$) or an immediate value (i.e., a $W$-bit string). Unless stated otherwise, every instruction increments pc (the program counter) by 1 (modulo $2^W$) and does not modify flag. Generally, the first operand is the destination register of the computation performed by the instruction, and the other operands (if any) specify arguments to the instruction. Finally, all instructions take one cycle of the machine to execute.

We now proceed to describe the instruction set of TinyRAM. The instructions are summarized in Table 1 below; following the table we describe each instruction in more detail.

Notations. In the following, $i$ and $j$ are integers in $\{0, \ldots, K - 1\}$; $ri$ is the $i$-th register, and $[ri]$ the $W$-bit string currently stored in it; similarly for $rj$ and $[rj]$. Also, $A$ denotes either an immediate value or a register name; $[A]$ denotes its value (i.e., the immediate value itself or the $W$-bit string currently stored in the register). At times we also need to consider unsigned and signed integers represented by a $W$-bit string. With this in mind, we denote by $[ri]_u$ the unsigned integer encoded by the contents of $ri$ (i.e., $\sum_{k=0}^{W-1} a_i 2^k$ if $ri$ stores the $W$-bit string $a_{W-1} \cdots a_0$) and by $[ri]_s$ the signed integer encoded by the contents of $ri$ (i.e., using two’s complement, $-a_{W-1}2^{W-1} + \sum_{i=0}^{W-2} a_i 2^i$ if $ri$ stores the $W$-bit string $a_{W-1} \cdots a_0$). The notations $[rj]_u$, $[rj]_s$ and $[A]_u$, $[A]_s$ are similarly defined. When these notations are used, arithmetic is performed over the integers. Finally, we use MSB and LSB to respectively denote the most-significant (left-most) and least-significant (right-most) bit of a binary string.
<table>
<thead>
<tr>
<th>instruction mnemonic</th>
<th>operands</th>
<th>effects</th>
<th>flag</th>
<th>notes</th>
</tr>
</thead>
<tbody>
<tr>
<td>and</td>
<td>ri rj A</td>
<td>compute bitwise AND of ([r_j]) and ([A]) and store result in (ri)</td>
<td>result is (0^W)</td>
<td></td>
</tr>
<tr>
<td>or</td>
<td>ri rj A</td>
<td>compute bitwise OR of ([r_j]) and ([A]) and store result in (ri)</td>
<td>result is (0^W)</td>
<td></td>
</tr>
<tr>
<td>xor</td>
<td>ri rj A</td>
<td>compute bitwise XOR of ([r_j]) and ([A]) and store result in (ri)</td>
<td>result is (0^W)</td>
<td></td>
</tr>
<tr>
<td>not</td>
<td>ri A</td>
<td>compute bitwise NOT of ([A]) and store result in (ri)</td>
<td>overflow</td>
<td></td>
</tr>
<tr>
<td>add</td>
<td>ri rj A</td>
<td>compute ([r_j]_u + [A]_u) and store result in (ri)</td>
<td>result is (0^W)</td>
<td></td>
</tr>
<tr>
<td>sub</td>
<td>ri rj A</td>
<td>compute ([r_j]_u - [A]_u) and store result in (ri)</td>
<td>borrow</td>
<td></td>
</tr>
<tr>
<td>mul</td>
<td>ri rj A</td>
<td>compute ([r_j]_u \times [A]_u) and store least significant bits of result in (ri)</td>
<td>overflow</td>
<td></td>
</tr>
<tr>
<td>umulh</td>
<td>ri rj A</td>
<td>compute ([r_j]_u \times [A]_u) and store most significant bits of result in (ri)</td>
<td>overflow</td>
<td></td>
</tr>
<tr>
<td>udiv</td>
<td>ri rj A</td>
<td>compute quotient of ([r_j]_u/[A]_u) and store result in (ri)</td>
<td>([A]_u = 0)</td>
<td></td>
</tr>
<tr>
<td>umod</td>
<td>ri rj A</td>
<td>compute remainder of ([r_j]_u/[A]_u) and store result in (ri)</td>
<td>([A]_u = 0)</td>
<td></td>
</tr>
<tr>
<td>shl</td>
<td>ri rj A</td>
<td>shift ([r_j]) by ([A]_u) bits to the left and store result in (ri)</td>
<td>MSB of ([r_j])</td>
<td></td>
</tr>
<tr>
<td>shr</td>
<td>ri rj A</td>
<td>shift ([r_j]) by ([A]_u) bits to the right and store result in (ri)</td>
<td>LSB of ([r_j])</td>
<td></td>
</tr>
<tr>
<td>cmpe</td>
<td>ri A</td>
<td>none (“compare equal”)</td>
<td>([r_i] = [A])</td>
<td></td>
</tr>
<tr>
<td>cmpa</td>
<td>ri A</td>
<td>none (“compare above”, unsigned)</td>
<td>([r_i]_u &gt; [A]_u)</td>
<td></td>
</tr>
<tr>
<td>cmpae</td>
<td>ri A</td>
<td>none (“compare above or equal”, unsigned)</td>
<td>([r_i]_u \geq [A]_u)</td>
<td></td>
</tr>
<tr>
<td>cmpg</td>
<td>ri A</td>
<td>none (“compare greater”, signed)</td>
<td>([r_i]_s &gt; [A]_s)</td>
<td></td>
</tr>
<tr>
<td>cmpge</td>
<td>ri A</td>
<td>none (“compare greater or equal”, signed)</td>
<td>([r_i]_s \geq [A]_s)</td>
<td></td>
</tr>
<tr>
<td>mov</td>
<td>ri A</td>
<td>store ([A]) in (ri)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>cmov</td>
<td>ri A</td>
<td>if flag = 1, store ([A]) in (ri)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>jmp</td>
<td>A</td>
<td>set pc to ([A])</td>
<td></td>
<td></td>
</tr>
<tr>
<td>cjmp</td>
<td>A</td>
<td>if flag = 1, set pc to ([A]) (else increment pc as usual)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>cnjmp</td>
<td>A</td>
<td>if flag = 0, set pc to ([A]) (else increment pc as usual)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>store</td>
<td>A ri</td>
<td>store ([r_i]) at memory address ([A]_u)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>load</td>
<td>ri A</td>
<td>store the content of memory address ([A]_u) into (ri)</td>
<td>(\leftarrow)</td>
<td>(1)</td>
</tr>
<tr>
<td>read</td>
<td>ri A</td>
<td>if the ([A]_u)-th tape has remaining words then consume the next word, store it in (ri), and set flag = 0; otherwise store (0^W) in (ri) and set flag = 1</td>
<td>(\leftarrow)</td>
<td>(2)</td>
</tr>
<tr>
<td>answer</td>
<td>A</td>
<td>stall or halt (and the return value is ([A]_u))</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

(1) All but the first two tapes are empty: if \([A]_u \notin \{0, 1\}\) then store \(0^W\) in \(ri\) and set flag = 1.

(2) answer causes a stall (i.e., not increment pc) or a halt (i.e., the computation stops); the choice between the two is undefined.

Table 1: Summary of the TinyRAM instruction set. Where “flag” is specified, flag is set to 1 if the predicate holds and to 0 otherwise. Below, we describe each instruction in more detail.
Bit operations. These are standard bit operations on registers.

and: The instruction \textbf{and} \texttt{ri rj} \texttt{A} stores in \texttt{ri} the bitwise AND of \texttt{[rj]} and \texttt{[A]}.
Moreover, flag is set to 1 if the result is \(0^W\) and to 0 otherwise.

or: The instruction \textbf{or} \texttt{ri rj} \texttt{A} stores in \texttt{ri} the bitwise OR of \texttt{[rj]} and \texttt{[A]}.
Moreover, flag is set to 1 if the result is \(0^W\) and to 0 otherwise.

xor: The instruction \textbf{xor} \texttt{ri rj} \texttt{A} stores in \texttt{ri} the bitwise XOR of \texttt{[rj]} and \texttt{[A]}.
Moreover, flag is set to 1 if the result is \(0^W\) and to 0 otherwise.

not: The instruction \textbf{not} \texttt{ri} \texttt{A} stores in \texttt{ri} the bitwise NOT of \texttt{[A]}.
Moreover, flag is set to 1 if the result is \(0^W\) and to 0 otherwise.

Integer operations. These are various unsigned and signed integer operations. In each case, the condition flag is set to 1 if an arithmetic overflow or an error (such as divide by zero) occurs, and is set to 0 otherwise. (Below, we shall specify, for each operation, the predicate that sets the flag.) In the sequel, \(U_W\) is the set of integers \(\{0, \ldots, 2^W - 1\}\); these are the \(2^W\) integers that can be encoded by \(W\)-bit strings. Similarly, \(S_W\) is the set of integers \(-2^{W-1}, \ldots, 0, 1, \ldots, 2^{W-1} - 1\); these are the \(2^W\) integers that can be encoded, via two’s complement, by \(W\)-bit strings.

add: The instruction \textbf{add} \texttt{ri rj} \texttt{A} stores in \texttt{ri} the \(W\)-bit string \(a_{W-1} \cdots a_0\) obtained as follows:
\(a_{W-1} \cdots a_0\) are the \(W\) least significant bits of \(G = [rj]_u + [A]_u\).
Moreover, flag is set to \(G_W\), where \(G_W\) is the MSB of \(G\).

sub: The instruction \textbf{sub} \texttt{ri rj} \texttt{A} stores in \texttt{ri} the \(W\)-bit string \(a_{W-1} \cdots a_0\) obtained as follows:
\(a_{W-1} \cdots a_0\) are the \(W\) least significant bits of \(G = [rj]_u + 2^W - [A]_u\).
Moreover, flag is set to \(1 - G_W\), where \(G_W\) is the MSB of \(G\).

mull: The instruction \textbf{mull} \texttt{ri rj} \texttt{A} stores in \texttt{ri} the \(W\)-bit string \(a_{W-1} \cdots a_0\) obtained as follows:
\(a_{W-1} \cdots a_0\) are the \(W\) least significant bits of \([rj]_u \times [A]_u\).
Moreover, flag is set to 1 if \([rj]_u \times [A]_u \notin U_W\) and to 0 otherwise.\(^7\)

umulh: The instruction \textbf{umulh} \texttt{ri rj} \texttt{A} stores in \texttt{ri} the \(W\)-bit string \(a_{W-1} \cdots a_0\) obtained as follows:
\(a_{W-1} \cdots a_0\) are the \(W\) most significant bits of \([rj]_u \times [A]_u\).
Moreover, flag is set to 1 if \([rj]_u \times [A]_u \notin U_W\) and to 0 otherwise.

smulh: The instruction \textbf{smulh} \texttt{ri rj} \texttt{A} stores in \texttt{ri} the \(W\)-bit string \(a_{W-1} \cdots a_0\) obtained as follows:
\(a_{W-1}\) is the sign of \([rj]_s \times [A]_s\) and \(a_{W-2} \cdots a_0\) are the \(W - 1\) most significant bits of the absolute value of \([rj]_s \times [A]_s\).
Moreover, flag is set to 1 if \([rj]_s \times [A]_s \notin S_W\) and to 0 otherwise.

udiv: The instruction \textbf{udiv} \texttt{ri rj} \texttt{A} stores in \texttt{ri} the \(W\)-bit string \(a_{W-1} \cdots a_0\) obtained as follows.
If \([A]_u = 0\), then \(a_{W-1} \cdots a_0 = 0^W\).
If \([A]_u \neq 0\), then \(a_{W-1} \cdots a_0\) is the binary encoding of \(Q\) where \(Q\) is the unique integer such that \([rj]_u = [A]_u \times Q + R\) for some integer \(R \in \{0, \ldots, [A]_u - 1\}\).
Moreover, flag is set to 1 if and only if \([A]_u = 0\).

\(^7\)An equivalent definition of the \textbf{mull} instruction is the following: “The instruction \textbf{mull} \texttt{ri rj} \texttt{A} stores in \texttt{ri} the \(W\)-bit string \(a_{W-1} \cdots a_0\) obtained as follows: \(a_{W-1}\) is the sign of \([rj]_s \times [A]_s\) and \(a_{W-2} \cdots a_0\) are the \(W - 1\) least significant bits of the absolute value of \([rj]_s \times [A]_s\). Moreover, flag is set to 1 if \([rj]_u \times [A]_u \notin U_W\) and to 0 otherwise.”
**umod:** The instruction `umod r i r j A` stores in `r i` the W-bit string \(a_{W-1} \cdots a_0\) obtained as follows.

If \([A]_u = 0\), then \(a_{W-1} \cdots a_0 = 0^W\).

If \([A]_u \neq 0\), then \(a_{W-1} \cdots a_0\) is the binary encoding of \(R\) where \(R\) is the unique integer in \(\{0, \ldots, [A]_u - 1\}\) such that \([r j]_u = [A]_u \times Q + R\) for some integer \(Q\).

Moreover, flag is set to 1 if and only if \([A]_u = 0\).

**Shift operations.** These are left and right (logical) shift operations.

**shl:** The instruction `shl r i r j A` stores in `r i` the W-bit string obtained by shifting \([r j]\) by \([A]_u\) bits to the left. The vacant positions (obtained after the shift) are filled with 0’s.

Moreover, flag is set to the most significant bit of \([r j]\).

**shr:** The instruction `shr r i r j A` stores in `r i` the W-bit string obtained by shifting \([r j]\) by \([A]_u\) bits to the right. The vacant positions (obtained after the shift) are filled with 0’s.

Moreover, flag is set to the least significant bit of \([r j]\).

**Compare operations.** These are various compare operations. Each of these instructions do not modify any registers; instead, the result of the comparison is stored in the condition flag.

**cmpe:** The instruction `cmpe r i A` sets flag to 1 if \([r i] = [A]\) and to 0 otherwise.

**cmpa:** The instruction `cmpa r i A` sets flag to 1 if \([r i]_u > [A]_u\) and to 0 otherwise.

**cmpae:** The instruction `cmpae r i A` sets flag to 1 if \([r i]_u \geq [A]_u\) and to 0 otherwise.

**cmpg:** The instruction `cmpg r i A` sets flag to 1 if \([r i]_s > [A]_s\) and to 0 otherwise.

**cmpge:** The instruction `cmpge r i A` sets flag to 1 if \([r i]_s \geq [A]_s\) and to 0 otherwise.

**Move operations.** These are standard move and conditional move operations.

**mov:** The instruction `mov r i A` stores \([A]\) in `r i`.

**cmov:** The instruction `cmov r i A` stores \([A]\) in `r i` if flag = 1. (If flag = 0, `r i` is not changed.)

**Jump operations.** These are standard jump and conditional jump operations. Each of these instructions do not modify any registers or the condition flag, but only modify the program counter.

**jmp:** The instruction `jmp A` stores \([A]\) in `pc`.

**cjmp:** The instruction `cjmp A` stores \([A]\) in `pc` if flag = 1. (If flag = 0, `pc` is incremented as usual.)

**cnjmp:** The instruction `cnjmp A` stores \([A]\) in `pc` if flag = 0. (If flag = 1, `pc` is incremented as usual.)

**Memory operations.** These are simple load and store operations where the address in memory is determined either by an immediate value or the contents of a register. These are the only addressing modes in TinyRAM. (In particular, the common “base+offset” addressing mode is not supported.)

**store:** The instruction `store A r i` stores \([A]\) into address \([r i]_u\) of memory.

**load:** The instruction `load r i A` stores the W-bit string at address \([A]_u\) of memory into `r i`. 
**Input operation.** This is the instruction to access contents to one of the two input tapes; the 0-th tape is used for the primary input and the 1-th tape is used for an auxiliary input.

**read:** The instruction `read ri A` stores in `ri` the next `W`-bit word on the `[A]_u`-th tape, if any. More precisely, if the `[A]_u`-th tape has remaining words then consume the next word, store it in `ri`, and set `flag = 0`; otherwise (if there are no remaining input words on the `[A]_u`-th tape) store `0^W` in `ri` and set `flag = 1`.

Because TinyRAM only has two input tapes, all but the first two tapes are assumed to be empty. Specifically, if `[A]_u` is not 0 or 1, then we store `0^W` in `ri` and set `flag = 1`.

**Answer operation.** This instruction signifies that the program has finished the computation and thus no additional operations are allowed.

**answer:** The instruction `answer A` causes the machine to stall (i.e., not increment `pc`) or halt (i.e., the computation stops) with return value `[A]_u`. The choice between stall or halt is undefined. A return value of 0 is used to indicate that the program accepted (see Section 3).
5 Assembly Language

A TinyRAM program $P$ is written in the TinyRAM assembly language, which we now describe. (The syntax is inspired by the Intel x86 syntax.)

A TinyRAM program $P$ is a text file consisting of a sequence of lines (separated by CR, LF or CR/LF). The text file is structured as follows. The first line contains the string

```
; TinyRAM V=1.00 W=W K=K
```

where $W$ is the word size in decimal representation and $K$ is the number of registers in decimal representation. Each subsequent line contains the following, in sequence:

1. Optional whitespace.

2. An optional label followed by “:”. This defines the label as referring to the first instruction following it (if any).
   A label must match the regular expression “\[0-9a-zA-Z_]+”. In particular, a label must start with an underscore (to distinguish the label from an immediate value and registers).

3. An optional instruction, consisting of an instruction mnemonic followed by its operands (if any).
   The instruction mnemonic is separated from the first operand by whitespace, and subsequent operands are separated by a comma (“,”) surrounded by optional whitespaces. Registers are specified as “r” followed by the register number in decimal, e.g., “r0”, “r12”. An immediate operand may be written as an integer in decimal representation, or as a label; an integer $a$ (which may be negative) represents the $W$-bit word $x$ such that $[x]_u \equiv a \mod 2^W$.

4. Optional whitespace.

5. An optional comment starting with a semicolon (“;”) and lasting until the end of the line.

Instructions are (implicitly) numbered sequentially, starting with 0 and ignoring non-instruction lines. A label may be defined at most once. Labels given as operands must be defined, and are resolved to the number of the instruction following the label definition.
6 Preamble

In the context of succinctly verifying nondeterministic computations \cite{BCGT13, BCG13}, we require TinyRAM programs to start with a specific preamble given below. This is needed for technical reasons, to improve the efficiency of reducing accepting computations (see Section 3) to circuit satisfiability (and other related problems).

**Definition 6.1.** We say that \( P \) is a proper TinyRAM\(_{W,K} \) program if it starts with the following instructions (where, for readability, we have made explicit the implicit instruction numbers):

0. \text{store} 0, r0
1. \text{mov} r0, 1
2. \text{read} r1, 0
3. \text{cjmp} 7
4. \text{add} r0, r0, 1
5. \text{store} r0, r1
6. \text{jmp} 2
7. \text{store} 1, r0

In other words, we only consider TinyRAM programs working as follows. First, the program stores \( 0^W \) in address \( 0^W \), and after that the program reads all of the primary input into memory (reading one word at a time, each time storing the word into the next available address starting from address 2, and finally storing in address 1 the address of where the last input word was stored).\(^8\) Afterwards, since an \( n \)-word input is stored in addresses \( 2, 3, \ldots, n+1 \), when the program wants to access a word of the primary input, it can do so by reading the suitable address in memory. (The program can learn the length of the input because address 1 contains the value \( n+1 \).)\(^9\)

---

\(^8\)Let us explain the code in Definition 6.1 in somewhat more detail. First of all, Instruction 0 is only a technicality (needed even if we require, as we do, that memory is initialized to “all zeros”), and the “interesting” part of the code is Instructions 1 through 7, which are the instructions responsible for reading the primary input. Register \( r0 \) stores a pointer to the last available address, while register \( r1 \) holds the current word read from the primary input tape (i.e., tape 0). Instruction 2 reads the next primary input word from the tape; if there is one, then it is stored in register \( r1 \) and the condition flag \( \text{flag} \) is not set; otherwise \( \text{flag} \) is set. Instruction 3 checks if \( \text{flag} \) is set. If \( \text{flag} \) is not set, the program proceeds to increase the counter \( r0 \) and then store in memory the new word from the input; then the program jumps back to Instruction 2 in order to try to read a new word from tape 0. Otherwise, if \( \text{flag} \) is set, the program jumps to Instruction 7 in order to store in address 1 the current value of \( r0 \), which holds address \( n+1 \) (where \( n \) is the number of words in the primary input) which is the address at which the last word was stored.

\(^9\)A program may, later, reuse these memory addresses. However, the primary input tape is fully consumed and cannot be read again.
7 Binary Encoding Of Instructions

Recall that an instruction is specified via an opcode and up to three operands. (See Section 4.) We now describe the binary encoding of an instruction. The binary encoding assumes that $6 + 2 \cdot \lceil \log_2 K \rceil \leq W$; this is the case for natural choices of $K$ and $W$. (E.g., $K, W = 16$ or $K, W = 32$ both work.)

An instruction is encoded via six binary fields, all using the little-endian convention.

- **Field #1.** This field stores the instruction’s opcode, which consists of $5 = \lceil \log_2 27 \rceil$ bits. (See Table 2 for the opcodes.)

- **Field #2.** This field is a bit that is 0 if $A$ is a register name and 1 if $A$ is an immediate value.

- **Field #3.** This field stores a register name operand, which consists of $\lceil \log_2 K \rceil$ bits. It is all 0’s when not used. This is the name of the instruction’s destination register (i.e., the one to be modified) if any.

- **Field #4.** This field stores a register name operand, which consists of $\lceil \log_2 K \rceil$ bits. It is all 0’s when not used. This is the name of a register operand (if any) that will not be modified by the instruction.

- **Field #5.** This field consists of padding with $0^{W - 6 - 2 |K|}$ so that the first six fields fit exactly in a string of $W$ bits.

- **Field #6.** This is either the name of another register (which is not modified by the instruction) or an immediate value, as determined by field #2. The length of this field is $W$ bits (which is the maximum between the length of a register name and of an immediate value).

Overall the instruction is encoded using $2W$ bits. See Table 2 for details on opcodes and field assignments.
Table 2: Binary encoding of TinyRAM instructions. The opcode in field #1 is written MSB-first. Field #2 is 0 if $A$ is a register name, and 1 if $A$ is an immediate value. Also, $|K|$ denotes $\lceil \log_2 K \rceil$, and $\langle \cdot \rangle$ denotes the binary representation of the argument, which is $|K|$ bits long for fields #3 and #4, and $W$ bits long for field #6.
References
