The processor uses a nine-stage pipeline. The execution part of the pipeline is somewhat unusual. Because a single x86 instruction can combine a calculation with multiple memory accesses, executing such instructions on a classic RISC pipeline would take multiple cycles. Adding a dedicated unit for effective-address calculation improves the efficiency of a RISC pipeline. The current ‘RMXW’ pipeline (Read, Memory read, Execute, Write back) was chosen to use this extra unit even more efficiently and to execute many more instructions in a single pipeline cycle. The trade-off is an extra class of hazards: an instruction may try to read a memory address before a previous instruction has written to it. However, these hazards can be resolved quite easily. A description of the pipeline is shown below.

Stage |Name |Description
------|-----|-----------
F |Fetch |Fetch 8 words from memory
L |Length decode |Decode instruction length
D |Decode |Buffer fetched bytes and decode instructions
S |Sequencer |Get the microcode address from ROM
I |Issue |Buffer decoded instructions and get the microcode from ROM
R |Read |Read from register file
M |Memory read |Read from memory
X |Execute |Execute ALU, mult, div, shift, ...
W |Write back |Write to register file or memory
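
The memory hazard mentioned above (an instruction reading an address in the M stage before an older instruction has written it in the W stage) can be illustrated with a small sketch. This is not the processor's actual logic: the `Instr` record, the `must_stall_memory_read` helper, and the choice to resolve the hazard by stalling are assumptions made for illustration. The check also compares exact addresses only, whereas real hardware would have to compare overlapping byte ranges.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Instr:
    """Hypothetical record of the memory accesses of one instruction."""
    text: str
    load_addr: Optional[int] = None   # address read in the M stage, if any
    store_addr: Optional[int] = None  # address written in the W stage, if any


def must_stall_memory_read(in_m: Instr, older: Sequence[Optional[Instr]]) -> bool:
    """Return True if the instruction in M reads an address that an
    older instruction (still in X or W) has not yet written."""
    if in_m.load_addr is None:
        return False
    return any(
        instr is not None and instr.store_addr == in_m.load_addr
        for instr in older
    )


# Read-modify-write in one pass: read [0x100] in M, add in X, write in W.
add_mem = Instr("add [0x100], eax", load_addr=0x100, store_addr=0x100)
# A younger load of the same address must wait for that write.
load_same = Instr("mov ebx, [0x100]", load_addr=0x100)
# A load from a different address can proceed.
load_other = Instr("mov ecx, [0x200]", load_addr=0x200)

print(must_stall_memory_read(load_same, [add_mem, None]))   # True
print(must_stall_memory_read(load_other, [add_mem, None]))  # False
```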

The decode and issue stages contain buffers that store fetched bytes and decoded instructions, respectively. For now these buffers have no bypassing, which results in a longer latency after calls and jumps. This should be fixed in a future release.
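
The effect of a missing bypass can be shown with a toy model of a single stage buffer. The sketch below is only an illustration under stated assumptions: the `StageBuffer` class and `cycles_until_first_output` helper are hypothetical, the buffer is assumed to be empty after a call or jump, and an item stored in one cycle is assumed to become readable in the next cycle.

```python
from collections import deque


class StageBuffer:
    """Toy FIFO between two pipeline stages (hypothetical model)."""

    def __init__(self, bypass: bool):
        self.bypass = bypass
        self.queue = deque()

    def cycle(self, incoming=None):
        """Advance one clock and return the item handed to the next stage."""
        out = None
        if self.queue:
            out = self.queue.popleft()
        elif self.bypass and incoming is not None:
            # Empty buffer with bypass: forward the item in the same cycle.
            out, incoming = incoming, None
        if incoming is not None:
            self.queue.append(incoming)
        return out


def cycles_until_first_output(bypass: bool) -> int:
    """Count cycles until an item entering an empty buffer comes out."""
    buf = StageBuffer(bypass)
    for cycle in range(1, 10):
        incoming = "insn" if cycle == 1 else None
        if buf.cycle(incoming) is not None:
            return cycle
    raise RuntimeError("item never left the buffer")


print(cycles_until_first_output(bypass=False))  # 2
print(cycles_until_first_output(bypass=True))   # 1
```

In this model each buffer without a bypass adds one cycle whenever it starts out empty, which is the situation after a call or jump.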