HW: Caches

(If you want to write up your answers using Markdown, here is a Markdown template you can use.)

    No Cache

  1. Consider the following program fragment. Assume that each instruction fetch takes 50 ns. How long would it take for this program fragment to run? You will need to read the code to know how many times the loop will be executed. (Assume there is no cache.)

    
            112:         addi $t0, $zero, 0
            116:         addi $t1, $zero, 100
            120:  loop:  add $s0, $s0, $s1
            124:         addi $t0, $t0, 1
            128:         bne $t0, $t1, loop
        
  2. Direct-Mapped Caches

  3. Consider a (very small) direct-mapped cache that has 4 lines, each containing 1 4-byte word. Assume that the cache has a hit access time of 1 ns and a miss penalty of 50 ns (i.e., the full access time is 51 ns when there is a miss: the miss penalty + the usual access time). How long would it take for the program fragment above to run, assuming that none of the instructions are in the cache to begin with? What is the average memory access time (overall time / number of instructions)? (The program fragment is repeated below for convenience.) (You can fill in the table to the right if that would be helpful. I've indicated what it would look like after the first instruction missed and then was fetched. Address 112 maps to line 00. The v column indicates whether the line contains valid (meaningful) data for this program, or just left-over garbage.)
    Filling in this table is optional
     v TagInstruction in Cache (4 bytes)
    00 1 (Yes) 0000 ... 0111 addi $t0, $zero, 0 (but in binary)
    01 0 (No)   
    100  
    110  

    
    
            112:         addi $t0, $zero, 0
            116:         addi $t1, $zero, 100
            120:  loop:  add $s0, $s0, $s1
            124:         addi $t0, $t0, 1
            128:         bne $t0, $t1, loop
        
  4. Consider a direct-mapped cache that has 8 lines, each containing 1 word. For simplicity, assume that a word is 2 bytes long, which means that addresses are also 2 bytes. Consider the following set of memory address requests. How many of the requests would be hits and how many misses, assuming that the cache is empty beforehand? (I.e., v is 0 for every line.) What would be the final contents of the cache?

     v TagData in Cache (2 bytes)
    000   
    001   
    010   
    011   
    100   
    101   
    110   
    111   

    Requested AddressData at Address (in hex) Hit/Miss?
    00000000 00000000 00 00  
    00000000 00101110 00 0F  
    00000000 00101010 C0 03  
    00000000 01011010 F0 00  
    00000000 01011110 FF FF  

  5. Assume that these 5 data requests are in a loop, and so will be repeated many times in a row. What will the hit rate converge to for this set of requests?
  6. Multi-Word Lines in Direct-Mapped Cache

  7. Now assume that we have a direct-mapped cache with 4 lines, each of which contains 4 two-byte words. In one iteration of the 5 address requests above, which requests would be hits and which would be misses, assuming that the cache is empty beforehand? What would be the final contents of the cache after that 1 iteration? (Put a "v" in any word in the cache that is valid but the information you've been given isn't enough to know what the value should be.)


     v Tag Data in Cache
    00          
    01          
    10          
    11          
    Requested AddressData at Address Hit/Miss?
    00000000 00000000 0000  
    00000000 00101110 000F  
    00000000 00101010 C003  
    00000000 01011010 F000  
    00000000 01011110 FFFF  

  8. Assuming that the 5 data requests are in a loop, what does the hit rate converge to for this set of requests?

  9. Set-Associative Caches

  10. The tables below represent an 8-line, 2-way set associative cache, a 4-line, 2-way set associative cache, and a 4-line, fully associative cache. (Fully associative means that all the lines in the cache are part of a single, large set.) Show the final contents of each cache after 1 iteration of the same 5 address requests used in the previous exercises, assuming an LRU replacement policy. What would the hit ratio converge to for each cache, assuming that the 5 address requests are in a loop, and so will be repeated many times in a row?

     v TagData in Cache
    00    
       
    01    
       
    10    
       
    11    
       

     v TagData in Cache
         
       
       
       

     v TagData in Cache
    0    
       
    1    
       

  11. Reflection

  12. Which of the exercises above illustrate the effects of spatial locality? Which exhibit the effects of temporal locality?