HW: Single-cycle, Multi-cycle, and Pipelined Datapaths

In a single-cycle datapath, instructions are executed one at a time, and each instruction completes in a single clock "tick". In other words, the clock cycle time must be long enough for the slowest instruction to fully execute.

In both a multi-cycle datapath and a pipelined datapath, a clock "tick" is the time an instruction spends in any given stage of the datapath. Each stage has a latency associated with it, which is the amount of time the stage requires to do its job. An example set of stages, with latencies, might be:

In a multi-cycle datapath, instructions are executed one at a time, but each instruction goes through only the stages that it needs. (E.g., R-format instructions do not have to go through MEM, because they don't read from or write to memory.) In a pipelined datapath, several instructions are in progress at once: as each instruction moves to the next stage, the following instruction moves into the stage behind it.
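
As a rough illustration of how the three clocking schemes described above relate, here is a minimal Python sketch. The latency values, the stage names, and the stage list for each instruction type are assumptions made up for this example (a conventional five-stage MIPS-style layout); they are not the values from this assignment's table, so use the assignment's own numbers for the exercises. The function names are likewise hypothetical.

```python
# Placeholder stage latencies in picoseconds. These are ASSUMED values
# for illustration only, not the latencies given in the assignment.
STAGE_LATENCY_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# ASSUMED stage usage per instruction type (conventional MIPS-style split).
STAGES_USED = {
    "R-format": ["IF", "ID", "EX", "WB"],
    "lw": ["IF", "ID", "EX", "MEM", "WB"],
    "sw": ["IF", "ID", "EX", "MEM"],
    "beq": ["IF", "ID", "EX"],
}


def single_cycle_clock(latency=STAGE_LATENCY_PS, stages_used=STAGES_USED):
    """One tick per instruction: the clock must cover the slowest instruction."""
    return max(sum(latency[s] for s in stages) for stages in stages_used.values())


def staged_clock(latency=STAGE_LATENCY_PS):
    """Multi-cycle and pipelined: one tick per stage, so the slowest stage sets it."""
    return max(latency.values())


def instruction_time(kind, datapath, latency=STAGE_LATENCY_PS, stages_used=STAGES_USED):
    """Time from start to finish for one instruction of the given type."""
    if datapath == "single-cycle":
        return single_cycle_clock(latency, stages_used)
    tick = staged_clock(latency)
    if datapath == "multi-cycle":
        # Only the stages this instruction needs, one tick each.
        return tick * len(stages_used[kind])
    if datapath == "pipelined":
        # Every instruction passes through every stage, one tick each.
        return tick * len(latency)
    raise ValueError(f"unknown datapath: {datapath}")


if __name__ == "__main__":
    for datapath in ("single-cycle", "multi-cycle", "pipelined"):
        print(datapath, instruction_time("R-format", datapath), "ps")
```

The sketch ignores pipeline-register overhead and hazards, so it reflects only the idealized model used in this assignment.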

Based on the information above, complete the following exercises:

  1. A stage's latency describes the minimum amount of time an instruction would have to spend in that stage, but, depending on the datapath type, an instruction might spend longer in a stage than its latency requires. In which datapath(s) does each instruction spend the minimum amount of time in each stage? In which datapath(s) does each instruction spend the same amount of time in each stage, regardless of the stage's latency?
    3 pts
  2. Given the latencies described above, what would be the clock cycle time for a single-cycle datapath? For a multi-cycle datapath? For a pipelined datapath?
    3 pts
  3. Which two datapaths would have the same clock cycle?
    1 pt
  4. How long would a sw instruction take in each of these three cases?
    3 pts
  5. How long would a beq instruction take in each of these three cases?
    3 pts
  6. In which datapath(s) do all instructions take the same amount of time?
    1 pt
  7. Does pipelining improve the execution time of individual instructions, improve throughput, or both? (What is throughput?)
    1 pt
  8. In a pipelined datapath, why does it make sense to require all instructions to go through all the stages of the datapath, whether they need them or not? (E.g., R-format instructions have to go through MEM, even though they don't read from or write to memory.)
    3 pts
  9. Which datapath(s) would become faster if the latency of one of the shorter stages were reduced? Would that speed up all instructions, or only some instructions?
    3 pts
  10. If we could split one stage of the pipelined datapath into two new stages, each with half the latency of the original stage, which stage would you split and what would be the new clock cycle time of the processor?
    2 pts
  11. In calculating the throughput for programs, for which datapath(s) would it be useful to know what percentage of instructions are typical R-format instructions, what percentage are branches, and what percentage are lw or sw instructions?
    3 pts
  12. Assume that the execution of a large program involves the following percentages of different types of instructions:
    Assuming there are no stalls or hazards, how long should a single-cycle datapath processor with the latencies specified above take to execute 10,000 instructions? What about a multi-cycle datapath processor? What about a pipelined processor?
    3 pts
  13. Would you expect actual empirical test results to match your calculations? Would you expect the results for all three architecture types (single-cycle, multi-cycle, pipelined) to differ from your calculations by about the same amount, or would you expect bigger differences with some architectures than others? Why or why not?
    3 pts
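
For the program-level timing questions above, the general shape of the calculation can be sketched as follows. Everything numeric here is an assumption chosen for illustration: the clock values, the instruction mix, and the cycles-per-instruction counts are placeholders, not the figures from this assignment, and the function names are hypothetical. The pipelined formula assumes an ideal pipeline with no stalls or hazards, where the first instruction takes one tick per stage and each later instruction finishes one tick after the previous one.

```python
# ASSUMED instruction mix in whole percentages (placeholder values).
MIX_PERCENT = {"R-format": 50, "lw": 15, "sw": 10, "beq": 25}

# ASSUMED number of stages (ticks) each type needs on a multi-cycle datapath.
CYCLES = {"R-format": 4, "lw": 5, "sw": 4, "beq": 3}


def total_single_cycle(n_instructions, clock_ps):
    """Every instruction takes exactly one (long) clock cycle."""
    return n_instructions * clock_ps


def total_multi_cycle(n_instructions, tick_ps, mix_percent, cycles):
    """Weight each type's cycle count by its share of the instruction mix."""
    weighted_cycles = sum(mix_percent[k] * cycles[k] for k in mix_percent)
    # Divide by 100 last to stay in integer arithmetic.
    return n_instructions * tick_ps * weighted_cycles // 100


def total_pipelined(n_instructions, tick_ps, n_stages):
    """Ideal pipeline: fill it once, then retire one instruction per tick."""
    return (n_stages + n_instructions - 1) * tick_ps


if __name__ == "__main__":
    n = 10_000
    print("single-cycle:", total_single_cycle(n, 800), "ps")
    print("multi-cycle: ", total_multi_cycle(n, 200, MIX_PERCENT, CYCLES), "ps")
    print("pipelined:   ", total_pipelined(n, 200, 5), "ps")
```

Note that only the multi-cycle calculation depends on the instruction mix in this idealized model.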