## Beginning Logic Design – Part 14

Hello and welcome to Part 14 of my Beginning Logic Design series! In the last episode, I added my ALU operations. For this round, I want to implement some operations for manipulating a stack and some handling for calling subroutines. Let’s jump to it!

# My Stack System

The `stack` pointer of my CPU will keep track of the “top” of the stack. Most CPUs have a stack that grows “down”, but my CPU already has a lot of inefficiencies and I’m feeling rebellious, so my stack will grow up! I currently set the stack pointer to `0` on reset, so at the start of a program it should be ready to go.

I’ll use the first few available opcodes from my `EXTRA` operation family for my stack related functions.

```
F0: push A
F1: push B
F2: push C
F3: pop A
F4: pop B
F5: pop C
```

As before I’ll start by roughly mocking out this organization in my `PERFORM` state

```
EXTRA: begin
    case (instruction[3:0])
        // Push A
        0: begin
        end
        // Push B
        1: begin
        end
        // Push C
        2: begin
        end
        // Pop A
        3: begin
        end
        // Pop B
        4: begin
        end
        // Pop C
        5: begin
        end
    endcase
end
```

Now I’ll start on the `PUSH A` operations. I’ll need to write `A` to the memory address my `stack` pointer is currently set to, then increment the stack pointer. Since this involves some bus interactions it’ll take two cycles.

On the first I’ll put the `A` register value in the `write_data` register, set the `address_bus` to my `stack` pointer and enable `write`.

For the second cycle, I’ll clear my `write` signal, increment my `stack` and return to `FETCH` to continue my program, easy as that!

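```
0: begin
    case (cycle)
        0: begin
            // Write A to the address the stack pointer holds
            address_bus <= stack;
            write_data <= a;
            write <= 1;
        end
        1: begin
            // Release the bus and move to the next free slot
            write <= 0;
            stack++;
            state <= FETCH;
            program_counter++;
        end
    endcase
end
```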

And by the magic of copy-pasta, I extend this to my other two registers.

```
// Push B
1: begin
    case (cycle)
        0: begin
            address_bus <= stack;
            write_data <= b;
            write <= 1;
        end
        1: begin
            write <= 0;
            stack++;
            state <= FETCH;
            program_counter++;
        end
    endcase
end
// Push C
2: begin
    case (cycle)
        0: begin
            address_bus <= stack;
            write_data <= c;
            write <= 1;
        end
        1: begin
            write <= 0;
            stack++;
            state <= FETCH;
            program_counter++;
        end
    endcase
end
```

Now for the inverse operation `POP`. This means performing a read with the decremented `stack` pointer and storing that into the desired register, which will also be two cycles. On the first I’ll predecrement `stack` as I set the `address_bus` to it. On the second I’ll clear my `read`, store the returned value and go back into `FETCH`.

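```
// Pop A
3: begin
    case (cycle)
        0: begin
            // Pre-decrement, then request the byte at the new top of stack
            stack--;
            address_bus <= stack;
            read <= 1;
        end
        1: begin
            read <= 0;
            a <= data_bus;
            state <= FETCH;
            program_counter++;
        end
    endcase
end
```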

I honestly didn’t think implementing push and pop would be quite so easy; everything was working well on the first attempt. As before, I’ll copy my way through to implement this for `B` and `C`.

```
// Pop B
4: begin
    case (cycle)
        0: begin
            stack--;
            address_bus <= stack;
            read <= 1;
        end
        1: begin
            read <= 0;
            b <= data_bus;
            state <= FETCH;
            program_counter++;
        end
    endcase
end
// Pop C
5: begin
    case (cycle)
        0: begin
            stack--;
            address_bus <= stack;
            read <= 1;
        end
        1: begin
            read <= 0;
            c <= data_bus;
            state <= FETCH;
            program_counter++;
        end
    endcase
end
```
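To make the push/pop ordering concrete, here is a minimal stand-alone sketch of an upward-growing stack. It is a behavioral model only, not the CPU’s state machine: the module name `stack_model`, the 16-byte `mem` array and the `push`/`pop` tasks are all assumptions made up for this example.

```systemverilog
// Sketch only: a behavioral model of a stack that grows up.
// stack_model, mem, push and pop are hypothetical names for this example.
module stack_model;
    logic [7:0] mem [0:15];  // small stand-in for RAM
    logic [3:0] stack = 0;   // always points at the next free slot
    logic [7:0] value;

    // Push: write at the pointer, then increment (post-increment)
    task automatic push(input logic [7:0] data);
        mem[stack] = data;
        stack++;
    endtask

    // Pop: decrement first, then read (pre-decrement)
    task automatic pop(output logic [7:0] data);
        stack--;
        data = mem[stack];
    endtask

    initial begin
        push(8'hDE);
        push(8'h20);
        pop(value);
        assert (value == 8'h20) else $error("expected 20, got %h", value);
        pop(value);
        assert (value == 8'hDE) else $error("expected DE, got %h", value);
        assert (stack == 0) else $error("stack pointer should be back at 0");
        $display("stack model ok");
        $finish;
    end
endmodule
```

Because the pointer always rests on the next free slot, a pop has to step back before it reads, which is why the pop side decrements first.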

# Subroutines

The next two instructions I want to implement are an operation that jumps into a subroutine and a paired operator that returns from that subroutine. I’ll try to keep these operations pretty simple. I’ll first stub out my opcodes.

```
// Jump subroutine
6: begin
    case (cycle)

    endcase
end
// Return from subroutine
7: begin
    case (cycle)

    endcase
end
```

For my JSR operation (jump to subroutine), I’ll first push the address of the next instruction onto the top of my stack, then jump the program to the subroutine’s address. This will take 4 bus interactions plus a final cycle to perform the jump, five cycles in all, which is more than my current 2-bit `cycle` variable can count. I’ll widen `cycle` to 3 bits so it can count to 8 and start implementing.
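As a side note, the counter width can be derived instead of hand-picked. This is a small sketch using the standard `$clog2` system function; the parameter names `LONGEST_INSTRUCTION` and `CYCLE_BITS` are hypothetical and not part of my CPU source.

```systemverilog
// Sketch only: LONGEST_INSTRUCTION and CYCLE_BITS are hypothetical names.
module cycle_width_sketch;
    // JSR is the longest instruction here: cycles 0 through 4
    localparam int LONGEST_INSTRUCTION = 5;
    // $clog2 gives the number of bits needed to count that many cycles
    localparam int CYCLE_BITS = $clog2(LONGEST_INSTRUCTION);

    logic [CYCLE_BITS-1:0] cycle;

    initial begin
        assert (CYCLE_BITS == 3) else $error("unexpected width %0d", CYCLE_BITS);
        $display("cycle is %0d bits wide", $bits(cycle));
        $finish;
    end
endmodule
```

If a longer instruction shows up later, only the parameter would need to change.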

Pretty quickly into drafting my implementation of this, and right after gloating about how easy push/pop was to implement, I noticed this one was going to be a bit trickier! The first thing I need to do is calculate the address of the next instruction and push the most significant byte to the stack.

```
0: begin
    address_bus <= stack;
    write <= 1;
    // JSR is 3 bytes long, so the return address is 3 bytes ahead
    program_counter += 3;
    write_data <= program_counter[15:8];
end
```

On the next cycle, I complete the return address by writing the least significant byte to the next stack byte.

```
1: begin
    address_bus <= stack + 1;
    write_data <= program_counter[7:0];
end
```

With the return address written to the stack, I’ll increment my stack by the length of the pointer (2 bytes) and begin reading the address to jump to. Since my program counter is now ahead of that address, I need to look back 2 bytes for the most significant byte of the subroutine’s address.

```
2: begin
    write <= 0;
    stack += 2;
    address_bus <= program_counter - 2;
    read <= 1;
end
```

I’ll store the returned most significant byte of the subroutine’s address in my `x` register and request the next byte.

```
3: begin
    x <= data_bus;
    address_bus <= program_counter - 1;
end
```

Then finally I’ll be done with the bus and can jump into the subroutine.

```
4: begin
    read <= 0;
    program_counter <= {x, data_bus};
    state <= FETCH;
end
```

Phew! I had a few issues with implementing this at first, primarily from not managing my pointers properly. With time, patience and debugging in the simulator it did eventually work out.

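The ReTurn from Subroutine (`RTS`) thankfully is a bit easier, and will only take three cycles. First I’ll begin the read for the least significant byte of where to jump back to, which sits just below the stack pointer.

```
0: begin
    address_bus <= stack - 1;
    read <= 1;
end
```

On the second cycle, I’ll store that byte in `x` and read the most significant byte of the return pointer.

```
1: begin
    x <= data_bus;
    address_bus <= stack - 2;
end
```

On the third cycle, I’ll stop reading, drop the two-byte pointer off the stack and jump back.

```
2: begin
    read <= 0;
    stack -= 2;
    program_counter <= {data_bus, x};
    state <= FETCH;
end
```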
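The byte ordering is the easiest part to get wrong, so here is a rough behavioral sketch of the round trip, assuming the return address is pushed most significant byte first and the stack grows up. The module name `subroutine_model` and its small `mem` array are hypothetical; this models only the pointer arithmetic, not the bus cycles.

```systemverilog
// Sketch only: models the JSR/RTS pointer arithmetic, not the bus timing.
// subroutine_model and mem are hypothetical names for this example.
module subroutine_model;
    logic [7:0]  mem [0:15];                   // small stand-in for RAM
    logic [3:0]  stack = 0;                    // next free slot
    logic [15:0] program_counter = 16'h8003;   // address of the JSR opcode
    logic [7:0]  x;

    initial begin
        // JSR: the return address is the instruction after the 3-byte JSR
        program_counter += 3;
        mem[stack]     = program_counter[15:8];  // most significant byte first
        mem[stack + 1] = program_counter[7:0];   // then least significant
        stack += 2;
        program_counter = 16'h8007;              // jump into the subroutine

        // RTS: least significant byte is on top, so it comes off first
        x = mem[stack - 1];
        program_counter = {mem[stack - 2], x};
        stack -= 2;

        assert (program_counter == 16'h8006)
            else $error("returned to %h", program_counter);
        assert (stack == 0) else $error("stack not balanced");
        $display("returned to %h", program_counter);
        $finish;
    end
endmodule
```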

That’ll do it! I’ll use this program to test it, annotated with addresses and comments for clarity:

```
8000: c0 de     ; Set A = 0xDE
8002: f0        ; Push A to stack
8003: f6 80 07  ; Jump into subroutine at 0x8007
8006: e0        ; Halt machine
8007: c1 20     ; Set B = 0x20
8009: c2 17     ; Set C = 0x17
800b: f7        ; Return
```

In simulation it works like a charm!
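Hand-assembled addresses are easy to get off by one, so a quick sanity check on the program bytes can help. The sketch below is an assumption about how one might hold the program in a testbench array; `test_program_sketch`, `rom` and `BASE` are hypothetical names and not how my memory module is actually loaded.

```systemverilog
// Sketch only: checks that the hand-assembled addresses line up.
// test_program_sketch, rom and BASE are hypothetical names.
module test_program_sketch;
    localparam logic [15:0] BASE = 16'h8000;

    logic [7:0] rom [0:11] = '{
        8'hc0, 8'hde,          // 8000: set A
        8'hf0,                 // 8002: push A
        8'hf6, 8'h80, 8'h07,   // 8003: jump to subroutine
        8'he0,                 // 8006: halt
        8'hc1, 8'h20,          // 8007: set B
        8'hc2, 8'h17,          // 8009: set C
        8'hf7                  // 800b: return
    };

    logic [15:0] target;

    initial begin
        // The JSR operand should land on the first opcode of the subroutine
        target = {rom[4], rom[5]};
        assert (rom[target - BASE] == 8'hc1)
            else $error("JSR target %h is not the subroutine start", target);
        // The return address (JSR address + 3) should be the halt
        assert (rom[16'h8003 + 3 - BASE] == 8'he0)
            else $error("return address does not point at halt");
        $display("program layout ok");
        $finish;
    end
endmodule
```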

With that working I am done with the initial set of goals I had for this CPU, and this series along with that! I hope some folks have found this series interesting and/or useful. If you have any improvements to suggest or would like me to cover the implementation of any of this in further detail please leave a note in the comments. Keep tinkering!!

## Beginning Logic Design – Part 13

Hello and welcome to Part 13 of my Beginning Logic Design series! In the last post I implemented my branch instructions. For this round, I want to implement my ALU operations.

# ALU Instructions and Arguments

For my ALU, I want to follow a slightly different pattern for my arguments. In the instructions implemented so far the lower 4 bits of the instruction represented a certain operation within the instruction family. For the ALU operations I’d like to use these 4 bits to instead represent the operands of the instruction.

With the 4 bits available, I’ll use 2 bits to encode each operand with the following representations:

```
00 - A register
01 - B register
10 - C register
11 - Unused
```

So the overall format (in binary) of these ALU instructions will be `iiiiaabb`, where `i` represents the instruction, `a` the first encoded operand and `b` the second.

For all of the ALU instructions, I will use the second operand to indicate where the result will be stored. The instructions `ADD`, `SUBTRACT`, `BIT_AND`, `BIT_OR` and `BIT_XOR` all use two operands, so the second operand is used in the instruction and is where the result is stored. For the remaining operations `INCREMENT`, `DECREMENT`, `BIT_NOT`, `SHIFT_LEFT`, `SHIFT_RIGHT`, `ROTATE_LEFT` and `ROTATE_RIGHT` the first operand is used in the operation and the second is where the result is to be stored.
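The encoding can be sketched as a small lookup function. This is only an illustration of the `iiiiaabb` layout: `select_operand` is a hypothetical helper, and returning `0` for the unused `11` code is an arbitrary choice made for this example.

```systemverilog
// Sketch only: select_operand is a hypothetical helper, not part of cpu.sv.
module operand_decode_sketch;
    // Returns the register value selected by a 2-bit operand code
    function automatic logic [7:0] select_operand(
        input logic [1:0] code,
        input logic [7:0] a, b, c
    );
        case (code)
            2'b00:   return a;
            2'b01:   return b;
            2'b10:   return c;
            default: return 8'h00;  // 2'b11 is unused; 0 is an arbitrary choice
        endcase
    endfunction

    // Family 0, first operand B (01), second operand C (10)
    logic [7:0] instruction = 8'b0000_01_10;

    initial begin
        assert (select_operand(instruction[3:2], 8'h11, 8'h22, 8'h33) == 8'h22)
            else $error("first operand should select B");
        assert (select_operand(instruction[1:0], 8'h11, 8'h22, 8'h33) == 8'h33)
            else $error("second operand should select C");
        $display("operand decode ok");
        $finish;
    end
endmodule
```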

# Wiring up the ALU

The first thing I’ll need in order to build instructions for the ALU is to actually include it in the processor!

First, near the top of my `cpu.sv` file I’ll include my ALU package.

`import ALU::*;`

Next, inside my `cpu` module, just under the other internal declarations, I’ll add signals to interface with my ALU and the ALU instance itself.

```
// ALU signals and module
logic alu_clock;
opcode alu_operation;
logic [7:0] alu_a;
logic [7:0] alu_b;
logic alu_carry_in;
logic [7:0] alu_y;
logic alu_zero;
logic alu_sign;
logic alu_carry_out;
logic alu_overflow;
assign alu_clock = !clock;
alu cpu_alu (
    alu_clock,
    alu_operation,
    alu_a,
    alu_b,
    alu_carry_in,
    alu_y,
    alu_zero,
    alu_sign,
    alu_carry_out,
    alu_overflow
);
```

I’ve set my `alu_clock` to follow an inverted clock similar to how the system bus operates.

Next, within my `FETCH` CPU state, I’ll add another `$cast()` call to set my `alu_operation` to the upper four bits of my current instruction, just like I have for my `op_type`, since I mapped the same values for CPU and ALU operations. There are some possible edge cases where the CPU operation will map to a number that has no meaning to the ALU, so we’ll add a sanity check to make sure it’s within the supported range.

```
if (data_bus[7:4] < 15)
    $cast(alu_operation, data_bus[7:4]);
```

That’ll get the basics in place for the ALU.
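As an aside, `$cast` can also be called as a function, in which case it returns `1` when the value is legal for the destination type and `0` when it is not, leaving the destination untouched. A minimal simulation sketch of that behavior, using a made-up three-entry enum (`demo_opcode` is an assumption and not my real ALU `opcode` type):

```systemverilog
// Sketch only: demo_opcode is a hypothetical enum, not the real ALU opcode type.
module cast_check_sketch;
    typedef enum logic [3:0] {
        DEMO_ADD      = 4'd0,
        DEMO_SUBTRACT = 4'd1,
        DEMO_AND      = 4'd2
    } demo_opcode;

    demo_opcode operation = DEMO_ADD;
    logic [3:0] nibble;

    initial begin
        // A value that names an enum member casts successfully
        nibble = 4'd1;
        if (!$cast(operation, nibble))
            $error("1 should map to DEMO_SUBTRACT");
        assert (operation == DEMO_SUBTRACT);

        // A value outside the enum is rejected and operation is left alone
        nibble = 4'd9;
        if (!$cast(operation, nibble))
            $display("%0d is not a demo_opcode, operation stays %s",
                     nibble, operation.name());
        assert (operation == DEMO_SUBTRACT);
        $finish;
    end
endmodule
```

Whether a given synthesis tool accepts `$cast` this way is something I’d check before relying on it outside of simulation.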

# Implementing ALU operations

The first instruction I want to get set up is the `ADD` instruction.

In the first cycle of `ADD`, I’ll set my ALU variables to match the registers specified in the instruction as well as passing in our current `carry` flag:

```
CPU_ADD: begin
    case(cycle)
        0: begin
            case(instruction[3:2])
                0: alu_a <= a;
                1: alu_a <= b;
                2: alu_a <= c;
            endcase
            case(instruction[1:0])
                0: alu_b <= a;
                1: alu_b <= b;
                2: alu_b <= c;
            endcase
            alu_carry_in <= carry;
        end
    endcase
end
```

On the next cycle our ALU will have presented its results so we can, in a similar fashion, store the result and set the modified flags.

```
1: begin
    case(instruction[1:0])
        0: a <= alu_y;
        1: b <= alu_y;
        2: c <= alu_y;
    endcase
    carry <= alu_carry_out;
    zero <= alu_zero;
    sign <= alu_sign;
    overflow <= alu_overflow;
    program_counter++;
    state <= FETCH;
end
```

Getting the `ADD` to work was just that easy, but better yet this pattern also works for `SUBTRACT`! We can just let both operations follow this same `case` statement.

```
CPU_ADD, CPU_SUBTRACT: begin
    case(cycle)
        0: begin
            case(instruction[3:2])
                0: alu_a <= a;
                1: alu_a <= b;
                2: alu_a <= c;
            endcase
            case(instruction[1:0])
                0: alu_b <= a;
                1: alu_b <= b;
                2: alu_b <= c;
            endcase
            alu_carry_in <= carry;
        end
        1: begin
            case(instruction[1:0])
                0: a <= alu_y;
                1: b <= alu_y;
                2: c <= alu_y;
            endcase
            carry <= alu_carry_out;
            zero <= alu_zero;
            sign <= alu_sign;
            overflow <= alu_overflow;
            program_counter++;
            state <= FETCH;
        end
    endcase
end
```

It almost supports `SHIFT_RIGHT`, `ROTATE_LEFT` and `ROTATE_RIGHT` too, as these operations should also set most of these same flags. The issue is that `ADD` and `SUBTRACT` affect the `overflow` flag, so I’ll use my powers of copy-pasta to separate those into a case that doesn’t set `overflow`, but is otherwise identical.

```
CPU_SHIFT_RIGHT, CPU_ROTATE_LEFT, CPU_ROTATE_RIGHT: begin
    case(cycle)
        0: begin
            case(instruction[3:2])
                0: alu_a <= a;
                1: alu_a <= b;
                2: alu_a <= c;
            endcase
            case(instruction[1:0])
                0: alu_b <= a;
                1: alu_b <= b;
                2: alu_b <= c;
            endcase
            alu_carry_in <= carry;
        end
        1: begin
            case(instruction[1:0])
                0: a <= alu_y;
                1: b <= alu_y;
                2: c <= alu_y;
            endcase
            carry <= alu_carry_out;
            zero <= alu_zero;
            sign <= alu_sign;
            program_counter++;
            state <= FETCH;
        end
    endcase
end
```

That’s 5 of the 12 operations already. The last 7 can also be bundled into the same case statement. The only difference for them is that they don’t care what the `carry` flag is set to, and they only affect the `sign` and `zero` flags.

```
CPU_INCREMENT, CPU_DECREMENT, CPU_AND, CPU_OR, CPU_XOR, CPU_NOR, CPU_SHIFT_LEFT: begin
    case(cycle)
        0: begin
            case(instruction[3:2])
                0: alu_a <= a;
                1: alu_a <= b;
                2: alu_a <= c;
            endcase
            case(instruction[1:0])
                0: alu_b <= a;
                1: alu_b <= b;
                2: alu_b <= c;
            endcase
        end
        1: begin
            case(instruction[1:0])
                0: a <= alu_y;
                1: b <= alu_y;
                2: c <= alu_y;
            endcase
            zero <= alu_zero;
            sign <= alu_sign;
            program_counter++;
            state <= FETCH;
        end
    endcase
end
```

Huzzah! We can now utilize our ALU operations via our CPU program code. In the next post I will add some operations to my CPU to include stack functionality and operations that can be used to call subroutines. As always, I welcome your feedback and questions in the comments. Keep tinkering!

## Beginning Logic Design – Part 12

Hello and welcome to Part 12 of my Beginning Logic Design series! In the last post I implemented the `LOAD` and `STORE` sets of operations. In this round I will start to implement branching operations that allow the code to take different paths through a program.

# Branching Out

One of the most important things a CPU needs is the ability to branch, or jump, to different program code based on some conditions. Imagine an if/else statement in most common programming languages. You have some condition that you evaluate, and based on that outcome you perform one set of operations or another.

The first step I want to take towards implementing this is adding my CPU flags that will represent the conditions that can be considered.

```
// CPU flags
logic zero;
logic sign;
logic overflow;
logic carry;
```

I’ll also modify my CPU’s `RESET` state to set these all to `0` on reset.

```
RESET: begin
    state <= FETCH;
    program_counter <= 'h8000;
    stack <= 0;
    write <= 0;
    write_data <= 0;
    zero <= 0;
    sign <= 0;
    overflow <= 0;
    carry <= 0;
end
```

Next, I want to check to make sure the instructions I have defined so far are setting these flags as I’d like them to. Right now the only commands that should be modifying these flags are the `LOAD` commands. If the number loaded is `0`, the `zero` flag should be set. If the number loaded could be interpreted as a negative number (its highest bit is `1`), the `sign` flag should be set.

This is easily implemented by adding this to the final cycle of each load command, right near where the loaded register is being set.

```
if (data_bus == 0)
    zero <= 1;
else
    zero <= 0;
// The sign flag mirrors the highest bit of the loaded byte
sign <= data_bus[7];
```
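Here is a small stand-alone sketch of the flag rule for loads, with the sign taken from bit 7 explicitly. Assigning a whole 8-bit value to a 1-bit flag would keep only bit 0, which is why the bit select matters. The task name `set_load_flags` is hypothetical.

```systemverilog
// Sketch only: set_load_flags is a hypothetical helper for this example.
module load_flags_sketch;
    logic zero;
    logic sign;

    task automatic set_load_flags(input logic [7:0] value);
        zero = (value == 8'h00);  // set when every bit is clear
        sign = value[7];          // highest bit marks a negative number
    endtask

    initial begin
        set_load_flags(8'h00);
        assert (zero && !sign) else $error("00 should set only zero");
        set_load_flags(8'hff);
        assert (!zero && sign) else $error("ff should set only sign");
        // Bit 0 is set here but bit 7 is not, so sign must stay clear
        set_load_flags(8'h01);
        assert (!zero && !sign) else $error("01 should set neither flag");
        $display("load flags ok");
        $finish;
    end
endmodule
```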

I’ll write a test program to load `A` with `0`, then load `B` with `ff` (-1).

```
c0 00
c1 ff
```

In simulation, it looks good! Now for the `BRANCH` operations themselves! For now I have 10 Branch operations I’d like to define:

```
0 - Halt
1 - Jump
2 - Branch if zero set
3 - Branch if zero unset
4 - Branch if sign set
5 - Branch if sign unset
6 - Branch if overflow set
7 - Branch if overflow unset
8 - Branch if carry set
9 - Branch if carry unset
```

These will be fairly quick to implement, as all of them are quite similar. First I will set up my overall `case` statement structure.

```
BRANCH: begin
    case (instruction[3:0])
        // Halt
        0: begin
        end
        // Jump
        1: begin
        end
        // Branch zero set
        2: begin
        end
        // Branch zero clear
        3: begin
        end
        // Branch sign set
        4: begin
        end
        // Branch sign clear
        5: begin
        end
        // Branch overflow set
        6: begin
        end
        // Branch overflow clear
        7: begin
        end
        // Branch carry set
        8: begin
        end
        // Branch carry clear
        9: begin
        end
    endcase
end
```

The `HALT` operation is dead simple: just change the CPU state to `HALT`.

```
// Halt
0: begin
    state <= HALT;
end
```

I’ll give this operation a test shortly, first I want to implement my `JUMP` operation. The implementation of that begins pretty similarly to the other operations that look for a memory address, there will be 3 cycles. The first address byte will be requested; on the second cycle the first byte read and the second byte requested; on the last cycle the reading will stop and the `program_counter` will be set to its new value.

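```
// Jump
1: begin
    case (cycle)
        0: begin
            // Request the first (most significant) address byte
            address_bus <= program_counter + 1;
            read <= 1;
        end
        1: begin
            x <= data_bus;
            // Request the second (least significant) address byte
            address_bus <= program_counter + 2;
        end
        2: begin
            read <= 0;
            program_counter <= {x,data_bus};
            state <= FETCH;
        end
    endcase
end
```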

Now, as a test, I’ll extend my last program that set the flags to include a `JUMP` call, after that instruction I’ll pad a few bytes with `FF` and at `8010` I’ll have my `HALT` instruction.

```
c0 00
c1 ff
e1 80 10
ff ff ff ff ff ff ff ff ff
e0
```

Testing it in the simulator it works like a charm!

## Conditional Branches

The conditional branches are fairly simple to implement. I just took my `JUMP` implementation and added a condition on the flag during the first cycle. If the condition is not met, we can modify the program counter to start the fetch of the next instruction.

```
// Branch zero set
2: begin
    case (cycle)
        0: begin
            if (zero) begin
                // Branch taken: request the first address byte
                address_bus <= program_counter + 1;
                read <= 1;
            end else begin
                // Not taken: skip the opcode and its two address bytes
                program_counter += 3;
                state <= FETCH;
            end
        end
        1: begin
            x <= data_bus;
            address_bus <= program_counter + 2;
        end
        2: begin
            read <= 0;
            program_counter <= {x,data_bus};
            state <= FETCH;
        end
    endcase
end
```

This is basically the same for the `Branch zero clear` operation, the only change is a flipping of the `if`/`else` statements.

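```
// Branch zero clear
3: begin
    case (cycle)
        0: begin
            if (zero) begin
                // Not taken: skip the opcode and its two address bytes
                program_counter += 3;
                state <= FETCH;
            end else begin
                // Branch taken: request the first address byte
                address_bus <= program_counter + 1;
                read <= 1;
            end
        end
        1: begin
            x <= data_bus;
            address_bus <= program_counter + 2;
        end
        2: begin
            read <= 0;
            program_counter <= {x,data_bus};
            state <= FETCH;
        end
    endcase
end
```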
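Since the set and clear variants differ only in which flag they test and whether the test is inverted, the whole condition can also be sketched as one function over the opcode numbering listed earlier (even opcodes test for set, odd for unset). This is an alternative illustration, not what my CPU does; `branch_taken` is a hypothetical name.

```systemverilog
// Sketch only: branch_taken is a hypothetical helper, not part of cpu.sv.
module branch_condition_sketch;
    function automatic logic branch_taken(
        input logic [3:0] op,
        input logic zero, sign, overflow, carry
    );
        logic flag;
        // Opcodes come in pairs, so the upper three bits pick the flag
        case (op[3:1])
            3'd1:    flag = zero;      // opcodes 2 and 3
            3'd2:    flag = sign;      // opcodes 4 and 5
            3'd3:    flag = overflow;  // opcodes 6 and 7
            3'd4:    flag = carry;     // opcodes 8 and 9
            default: return 1'b0;      // Halt, Jump and undefined opcodes
        endcase
        // Bit 0 flips the test: even opcode = flag set, odd = flag unset
        return flag ^ op[0];
    endfunction

    initial begin
        assert (branch_taken(4'd2, 1, 0, 0, 0) == 1);  // zero set, zero is 1
        assert (branch_taken(4'd3, 1, 0, 0, 0) == 0);  // zero unset, zero is 1
        assert (branch_taken(4'd8, 0, 0, 0, 1) == 1);  // carry set, carry is 1
        assert (branch_taken(4'd9, 0, 0, 0, 0) == 1);  // carry unset, carry is 0
        assert (branch_taken(4'd1, 1, 1, 1, 1) == 0);  // Jump is not conditional
        $display("branch conditions ok");
        $finish;
    end
endmodule
```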

The remaining operations are a very slight variation on these two. Some copy-pasta will do the trick, and the only change is which flag is being looked at in the `if` condition.

```
// Branch sign set
4: begin
    case (cycle)
        0: begin
            if (sign) begin
                address_bus <= program_counter + 1;
                read <= 1;
            end else begin
                program_counter += 3;
                state <= FETCH;
            end
        end
        1: begin
            x <= data_bus;
            address_bus <= program_counter + 2;
        end
        2: begin
            read <= 0;
            program_counter <= {x,data_bus};
            state <= FETCH;
        end
    endcase
end
// Branch sign clear
5: begin
    case (cycle)
        0: begin
            if (sign) begin
                program_counter += 3;
                state <= FETCH;
            end else begin
                address_bus <= program_counter + 1;
                read <= 1;
            end
        end
        1: begin
            x <= data_bus;
            address_bus <= program_counter + 2;
        end
        2: begin
            read <= 0;
            program_counter <= {x,data_bus};
            state <= FETCH;
        end
    endcase
end
// Branch overflow set
6: begin
    case (cycle)
        0: begin
            if (overflow) begin
                address_bus <= program_counter + 1;
                read <= 1;
            end else begin
                program_counter += 3;
                state <= FETCH;
            end
        end
        1: begin
            x <= data_bus;
            address_bus <= program_counter + 2;
        end
        2: begin
            read <= 0;
            program_counter <= {x,data_bus};
            state <= FETCH;
        end
    endcase
end
// Branch overflow clear
7: begin
    case (cycle)
        0: begin
            if (overflow) begin
                program_counter += 3;
                state <= FETCH;
            end else begin
                address_bus <= program_counter + 1;
                read <= 1;
            end
        end
        1: begin
            x <= data_bus;
            address_bus <= program_counter + 2;
        end
        2: begin
            read <= 0;
            program_counter <= {x,data_bus};
            state <= FETCH;
        end
    endcase
end
// Branch carry set
8: begin
    case (cycle)
        0: begin
            if (carry) begin
                address_bus <= program_counter + 1;
                read <= 1;
            end else begin
                program_counter += 3;
                state <= FETCH;
            end
        end
        1: begin
            x <= data_bus;
            address_bus <= program_counter + 2;
        end
        2: begin
            read <= 0;
            program_counter <= {x,data_bus};
            state <= FETCH;
        end
    endcase
end
// Branch carry clear
9: begin
    case (cycle)
        0: begin
            if (carry) begin
                program_counter += 3;
                state <= FETCH;
            end else begin
                address_bus <= program_counter + 1;
                read <= 1;
            end
        end
        1: begin