For this lab I designed an explicitly parallel instruction set, wrote a CPU design in VHDL to implement it, and then wrote a simple compiler to convert serial code to run efficiently on this processor.
An explicitly parallel instruction set makes sense because, while all modern CPUs execute multiple instructions at once, standard superscalar ones still have to present a serial interface to the outside world. This requires the processor to spend a lot of circuitry on dependency checking and instruction reordering. With an explicitly parallel system we present a parallel interface and shift the burden of converting serial code to parallel code onto the compiler or the programmer. This makes a lot of sense for several reasons:
The first register is always a source, the second always a destination. The second may also be a source, for instructions such as "and" which take two operands. Because instructions are of fixed size, with components that are all of fixed size, decoding can be very fast.
The full instruction set is:

Opcode | Mnemonic | Operation |
---|---|---|
0000 | nop | No operation performed |
0001 | mov | reg 2 gets reg 1 if pred |
0010 | add | reg 2 gets reg 1 + reg 2 if pred |
0011 | and | reg 2 gets reg 1 & reg 2 if pred |
0100 | ldd | register A gets val if pred |
0101 | jmp | jump to line if pred |
1000 | inv | reg 2 gets the bitwise inverse of reg 1 if pred |
1001 | sio | Output reg 1 to the display if pred |
1010 | lio | Input reg 2 from the DIP switches if pred |
1011 | rsh | reg 2 gets a right-shifted reg 1 if pred |
1100 | lsh | reg 2 gets a left-shifted reg 1 if pred |
1101 | rsa | reg 2 gets an arithmetically right-shifted reg 1 if pred |
1111 | isz | pred gets (reg 1 == 0) if pred |
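As a rough illustration of the register-to-register semantics in the instruction set, here is a small Python sketch. The opcode values and mnemonics come from the instruction list; the 8-bit register width and the one-bit shift distance are assumptions made for this example (the text only says the DIP input is 8 bits and does not give a shift amount), and the function name `alu` is hypothetical.

```python
# Opcode values and mnemonics as listed in the instruction set.
OPCODES = {
    0b0000: "nop", 0b0001: "mov", 0b0010: "add", 0b0011: "and",
    0b0100: "ldd", 0b0101: "jmp", 0b1000: "inv", 0b1001: "sio",
    0b1010: "lio", 0b1011: "rsh", 0b1100: "lsh", 0b1101: "rsa",
    0b1111: "isz",
}

WIDTH = 8                    # ASSUMPTION: 8-bit registers
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)


def alu(mnemonic, r1, r2):
    """Return the value reg 2 would receive (predicate handling omitted)."""
    if mnemonic == "mov":
        return r1
    if mnemonic == "add":
        return (r1 + r2) & MASK          # wrap around on overflow
    if mnemonic == "and":
        return r1 & r2
    if mnemonic == "inv":
        return ~r1 & MASK
    if mnemonic == "rsh":
        return r1 >> 1                   # ASSUMPTION: shift by one bit
    if mnemonic == "lsh":
        return (r1 << 1) & MASK          # ASSUMPTION: shift by one bit
    if mnemonic == "rsa":
        return (r1 >> 1) | (r1 & SIGN)   # keep the sign bit
    raise ValueError(f"not a register-to-register op: {mnemonic}")
```

Only the instructions that compute a new value for reg 2 are modelled; ldd, jmp, sio, lio and isz touch other state and are left out of this sketch.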
Dependencies between instructions are not handled by the CPU; they must be handled by the programmer or the assembler. The minimum spacing needed between dependent instructions is as follows:
Instruction Type | RAW | WAR | WAW |
---|---|---|---|
Register | 3 | -2 | 1 |
IO | 1 | 1 | 1 |
Predicate | 2 | -1 | 1 |
Jump | 4 | -3 | 1 |
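One way to read the spacing table is as input to a simple in-order scheduler. The Python sketch below is an illustration under stated assumptions, not the lab's actual compiler: it assumes spacing is counted in instruction-memory lines, that a later instruction must sit at least that many lines after the earlier one, and that each line has three slots. The tuple layout `(name, reads, writes)` and the function name `schedule` are hypothetical.

```python
# Minimum spacing per hazard, copied from the table.
# ASSUMPTION: units are instruction-memory lines.
SPACING = {
    "register":  {"RAW": 3, "WAR": -2, "WAW": 1},
    "io":        {"RAW": 1, "WAR": 1,  "WAW": 1},
    "predicate": {"RAW": 2, "WAR": -1, "WAW": 1},
    "jump":      {"RAW": 4, "WAR": -3, "WAW": 1},
}


def schedule(instrs, slots_per_line=3):
    """Assign a line number to each instruction without reordering.

    instrs is a list of (name, reads, writes), where reads and writes
    are sets of (kind, resource) pairs, e.g. ("register", "A").
    """
    lines = []   # line chosen for each instruction, in program order
    used = {}    # line number -> slots already filled
    for i, (_, reads, writes) in enumerate(instrs):
        # In-order: never place an instruction before its predecessor.
        line = lines[-1] if lines else 0
        for j in range(i):
            _, prev_reads, prev_writes = instrs[j]
            hazards = (
                [("RAW", kind) for kind, _ in reads & prev_writes]
                + [("WAR", kind) for kind, _ in writes & prev_reads]
                + [("WAW", kind) for kind, _ in writes & prev_writes]
            )
            for hazard, kind in hazards:
                line = max(line, lines[j] + SPACING[kind][hazard])
        # Move on if the chosen line already holds three instructions.
        while used.get(line, 0) >= slots_per_line:
            line += 1
        lines.append(line)
        used[line] = used.get(line, 0) + 1
    return lines
```

Under these assumptions a negative WAR entry never delays anything in an in-order schedule, so a write can share a line with an earlier read of the same register. Empty slots left by the spacing would be filled with nop.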
The processor takes instructions three at a time, so each line of instruction memory should contain three 15-bit instructions.
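The line layout can be sketched as a 45-bit word built from three 15-bit instructions. In the Python sketch below each instruction is treated as an opaque 15-bit integer. Placing the first instruction in the most significant bits, and encoding a padding nop as all zeros, are assumptions for illustration; the function names are hypothetical.

```python
INSTR_BITS = 15
SLOTS = 3
NOP = 0  # opcode 0000 is nop; ASSUMPTION: remaining fields are zero


def pack_line(instrs):
    """Pack up to three 15-bit instructions into one 45-bit word."""
    if len(instrs) > SLOTS:
        raise ValueError("a line holds at most three instructions")
    padded = list(instrs) + [NOP] * (SLOTS - len(instrs))
    word = 0
    for ins in padded:
        if not 0 <= ins < (1 << INSTR_BITS):
            raise ValueError(f"instruction does not fit in 15 bits: {ins}")
        # ASSUMPTION: first instruction ends up in the top 15 bits.
        word = (word << INSTR_BITS) | ins
    return word


def unpack_line(word):
    """Split a 45-bit word back into its three 15-bit instructions."""
    mask = (1 << INSTR_BITS) - 1
    return [
        (word >> (INSTR_BITS * (SLOTS - 1 - i))) & mask
        for i in range(SLOTS)
    ]
```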
The structure of the CPU is:

Fetch: The next line of instructions is brought down from memory into the instruction register (IR).

Evaluate Operands: For each of the three instructions in the IR, the operands are fetched. The processor always looks up the two possible register operands, even when only one or none is needed.

Execute: The processor now looks at the opcode for the first time. Using the opcode and the register values read in the previous step, it works out which register the instruction is supposed to set and what to set it to. The predicate also gets evaluated here.

Commit: For instructions whose predicate evaluated to 1, the processor sets the given register to the given value. For those with a 0 predicate it does nothing.
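The stages after fetch can be sketched functionally in Python. This is a simplified model, not a cycle-accurate one and not the VHDL design itself: the tuple layout `(op, r1, r2, pred)`, the indexed predicate list, and the 8-bit register width are all assumptions for illustration, and only three opcodes are modelled.

```python
MASK = 0xFF  # ASSUMPTION: 8-bit registers


def operand_fetch(instr, regs):
    """Read both register operands, whether or not the opcode needs them."""
    op, r1, r2, pred = instr
    return op, regs[r1], regs[r2], r2, pred


def execute(fetched, preds):
    """Decode the opcode, compute the result, and evaluate the predicate."""
    op, v1, v2, dest, pred = fetched
    if op == "mov":
        value = v1
    elif op == "add":
        value = (v1 + v2) & MASK
    elif op == "and":
        value = v1 & v2
    else:
        raise NotImplementedError(op)
    return dest, value, preds[pred]


def commit(result, regs):
    """Write the result only if the predicate evaluated to 1."""
    dest, value, enabled = result
    if enabled:
        regs[dest] = value


def run_line(line, regs, preds):
    """Push one line of three instructions through the three stages."""
    fetched = [operand_fetch(instr, regs) for instr in line]
    results = [execute(f, preds) for f in fetched]
    for result in results:
        commit(result, regs)
```

Because every operand is read before any result is committed, an instruction in a line sees the register values from before that line ran. For example, with `regs = [1, 2, 3, 4]`, a `mov` from register 0 to register 1 and an `add` of register 1 into register 2 in the same line leave register 2 holding 5, not 4.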
Abbreviation | Meaning |
---|---|
OCU | Operation Commit Unit |
EU | Execution Unit |
OFU | Operand Fetch Unit |
IR | Instruction Register |
PC | Program Counter |
DISPLAY | A 2-digit signed hexadecimal LED |
DIP | An 8-bit input |
The compiler converts the serial instructions into a mif file in the format expected by the VHDL compiler. It also resolves dependencies between these instructions by spacing them so they will not conflict. It does not reorder instructions, though that is a feature I would add if I had more time, as it looks like it could yield major speedups.
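For illustration, here is a minimal Python sketch of writing packed instruction lines to a mif file. It assumes that "mif" here means an Altera-style Memory Initialization File and that each memory word is one 45-bit line of three instructions. The function name `to_mif` and the choice of hexadecimal radix are hypothetical; the actual compiler's output may differ.

```python
def to_mif(words, width=45, depth=None):
    """Render a list of integer memory words as Altera-style MIF text."""
    depth = len(words) if depth is None else depth
    if len(words) > depth:
        raise ValueError("more words than memory depth")
    digits = (width + 3) // 4            # hex digits needed per word
    out = [
        f"WIDTH={width};",
        f"DEPTH={depth};",
        "",
        "ADDRESS_RADIX=HEX;",
        "DATA_RADIX=HEX;",
        "",
        "CONTENT BEGIN",
    ]
    for addr, word in enumerate(words):
        if not 0 <= word < (1 << width):
            raise ValueError(f"word at address {addr} does not fit")
        out.append(f"    {addr:X} : {word:0{digits}X};")
    if len(words) < depth:
        # Fill the rest of memory with zeros (all-nop lines, assuming
        # nop encodes as zero).
        out.append(f"    [{len(words):X}..{depth - 1:X}] : {0:0{digits}X};")
    out.append("END;")
    return "\n".join(out) + "\n"
```

A 45-bit word needs 12 hexadecimal digits, so the top three bits of the first digit are always zero.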
I had initially planned to take the Ultimate RISC instruction set and make it an EPIC one, but Ultimate RISC doesn't really make for Ultimate EPIC. I figured that I would need to warp it so much to make it even remotely efficient that I had better just start over with an EPIC set.