# EE 224: RISC CPU

Mihir Kavishwar (17D070004) Rishabh Dahale (17D070008) Mithilesh Vaidya (17D070011) Anubhav Agarwal (17D070026)

April 2019

## 1 Overview

We have designed a simple CPU that implements 14 basic general-purpose instructions. These consist of addition, bit-wise NAND, control flow, and reading, modifying and navigating through memory. The only inputs to the CPU are the clock and a reset input, which initialises the required signals to their starting values. The design of the desired general-purpose computing system consists of the following components:

- 1. Read and Write Memory
- 2. Register File (RF)
- 3. Arithmetic Logic Unit (ALU)
- 4. Control Unit

We have implemented a finite state machine in the control unit, which is responsible for the state transitions, the output signals and the control of the various sub-components.

## 2 Arithmetic Logic Unit

The ALU has two 16-bit inputs, 'A' and 'B', and a 2-bit control input which determines which operation to perform. In addition to the computed 16-bit value, it also outputs three 1-bit signals:

- C flag set to 1 if addition results in overflow.
- Z flag set to 1 if the 16-bit output is zero.
- X flag set to 1 when the ALU output is 8. It is used for the Load All/Store All instruction, which has to be executed 8 times, once for each register in the register file.

The ALU performs the following functions:

- 1. ADD: Addition of two 16-bit signed numbers with a carry out.
- 2. XOR: Bit-wise XOR of two 16-bit numbers. It is used to check whether the two inputs to the ALU are equal (useful in the BEQ instruction, since the XOR of two equal numbers is zero and sets the Z flag).
- 3. NAND: Bit-wise NAND of two 16-bit numbers.
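The ALU behaviour described above can be sketched as a small behavioural model in Python. This is only an illustration, not the VHDL used in the project: the 2-bit control encodings (`OP_ADD`, `OP_XOR`, `OP_NAND`) are hypothetical, and the C flag is assumed here to be the carry out of bit 15.

```python
MASK16 = 0xFFFF

# Hypothetical 2-bit control encodings; the report does not state the real ones.
OP_ADD, OP_XOR, OP_NAND = 0b00, 0b01, 0b10


def alu(a: int, b: int, op: int):
    """Return (result, c, z, x) for two 16-bit operands."""
    a &= MASK16
    b &= MASK16
    c = 0
    if op == OP_ADD:
        total = a + b
        c = (total >> 16) & 1      # assumed: C is the carry out of bit 15
        result = total & MASK16
    elif op == OP_XOR:
        result = a ^ b             # zero exactly when a == b (used by BEQ)
    elif op == OP_NAND:
        result = ~(a & b) & MASK16
    else:
        raise ValueError("unsupported ALU operation")
    z = int(result == 0)           # Z: the 16-bit output is zero
    x = int(result == 8)           # X: the output equals 8 (Load All/Store All)
    return result, c, z, x
```

For example, `alu(5, 5, OP_XOR)` yields a zero result with Z set, which is how an equality check for BEQ could be read off the flags.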

## 3 Register File

The register file consists of eight 16-bit registers. The RF stores the data provided at port RF\_D3 into the location provided at port RF\_A3 at every rising edge of the clock.

The read operation has been implemented as combinational logic, and the write operation as behavioural logic using process statements.
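The split between combinational reads and clocked writes can be modelled roughly as follows. This is a Python sketch rather than the project's VHDL: the method names are hypothetical, and the `write_enable` argument is an assumption (the text only states that a write happens on every rising clock edge, which corresponds to the default `True`).

```python
class RegisterFile:
    """Behavioural model of eight 16-bit registers."""

    def __init__(self):
        self.regs = [0] * 8

    def read(self, addr: int) -> int:
        # Combinational read: the value is available immediately, no clock needed.
        return self.regs[addr & 0b111]

    def rising_edge(self, rf_a3: int, rf_d3: int, write_enable: bool = True) -> None:
        # Clocked write: RF_D3 is stored at location RF_A3 on a rising edge.
        # write_enable is a hypothetical control input, not taken from the report.
        if write_enable:
            self.regs[rf_a3 & 0b111] = rf_d3 & 0xFFFF
```

A caller would invoke `rising_edge` once per simulated clock cycle and call `read` freely in between.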

## 4 Control Unit (CPU)

This is the heart of our machine. It integrates the various sub-components (Memory, ALU and Register File). It also has temporary registers and MUXes that are used to combine the various sub-components into a single datapath.

We have implemented the finite state machine in this part using behavioural VHDL logic. The FSM has various control and enable signals for the sub-components as its outputs. Its output logic takes the OpCode and the values of the C and Z registers as inputs.

Our machine is of Mealy type with 30 different states. It is Mealy since the outputs (i.e. the control signals) depend on the inputs (i.e. the OpCode and the C and Z values) as well as on the current state. The operations that take place in each state are shown in the figures below. The specification of each instruction can be found here.
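The Mealy property can be illustrated with a toy output function in Python. Every name in it is hypothetical (the state `WRITE_BACK`, the opcode labels and the `rf_write` signal are not taken from the actual design); the sketch only shows that, for the same state, the control output changes with the inputs.

```python
def control_outputs(state: str, opcode: str, c: int, z: int) -> dict:
    """Mealy-style output logic: outputs depend on the state AND the inputs."""
    # All state names, opcode labels and signal names here are hypothetical.
    out = {"rf_write": 0}
    if state == "WRITE_BACK":
        if opcode == "ADD":
            out["rf_write"] = 1    # unconditional write-back
        elif opcode == "ADD_IF_CARRY":
            out["rf_write"] = c    # write back only when C is set
        elif opcode == "ADD_IF_ZERO":
            out["rf_write"] = z    # write back only when Z is set
    return out
```

In a Moore machine, by contrast, `rf_write` would be a function of `state` alone, so conditional behaviour of this kind would need additional states.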









[State-transition figures: the register transfers performed in each FSM state. The surviving labels include a JAL panel covering states $S_{26}$, $S_{27}$, $S_{28}$ and $S_{0}$ (transfers: IP → RF\_D3; IP → ALU\_A with a sign-extended immediate → ALU\_B, ADD, ALU output → T1; T1 → IP) and a JLR panel.]



Figure 1: The Datapath (Excuse the clumsiness!)

#### Note:

- The x2 operation in states $S_{10}$ and $S_{16}$ is unnecessary. (It is required when memory is byte-addressable, as opposed to the 16-bit word-addressable memory which has been implemented.)
- 'PAD7' refers to the operation of placing the given 9 bits as the MSBs and padding "0000000" as the LSBs to make a 16-bit vector.
- SE10 and SE7 are sign extenders. They are used to convert a 6-bit or a 9-bit vector, respectively, into a 16-bit vector without altering its value. The input is padded with 1s or 0s depending on its most significant (sign) bit.
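The PAD7 and sign-extension operations in the notes above can be sketched in Python as follows. The function names are illustrative, not the VHDL entity names; the sketch assumes that SE10 pads a 6-bit input with 10 bits and SE7 pads a 9-bit input with 7 bits.

```python
def sign_extend(value: int, in_bits: int) -> int:
    """Sign-extend an in_bits-wide two's-complement value to 16 bits."""
    value &= (1 << in_bits) - 1
    if value >> (in_bits - 1):                   # sign bit is 1: pad with 1s
        value |= (0xFFFF << in_bits) & 0xFFFF
    return value                                 # sign bit is 0: pad with 0s


def se10(imm6: int) -> int:
    return sign_extend(imm6, 6)                  # 6-bit input, 10 padded bits


def se7(imm9: int) -> int:
    return sign_extend(imm9, 9)                  # 9-bit input, 7 padded bits


def pad7(imm9: int) -> int:
    # The 9 bits become the MSBs; seven zeros fill the LSBs.
    return (imm9 & 0x1FF) << 7
```

For instance, `se10(0b111111)` gives `0xFFFF` (the 6-bit value -1 stays -1 in 16 bits), while `pad7(0b000000001)` gives `0x0080`.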

The CPU consists of the following components:

- 1. ALU
- 2. Register File
- 3. Memory Unit
- 4. 10-bit sign extender (SE10)
- 5. 7-bit sign extender (SE7)

The CPU has signals to store the contents of the instruction register, the instruction pointer and 5 temporary registers. Signals are also declared to map components such as the ALU to the CPU.

## 5 Testing

To test our machine, we stored the instructions in memory, initialised the Instruction Pointer to 0 and checked the memory and register file after each instruction to ensure that the machine was working as desired.