In the hierarchical model, the memory consists of separate instruction and data caches along with main memory. The ARM processor runs at 1 GHz, so one clock cycle takes 1 ns.
a) Given:
The instruction cache always hits, whereas the data cache has a 5% miss rate.
On a miss, the processor takes 60 ns to access main memory.
The average memory access time is:
T = 1 ns + 0.05 * 60 ns
T = 4 ns
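The calculation in part (a) can be checked with a minimal Python sketch. The function and variable names are illustrative assumptions, not part of the exercise; the hit time is taken to be one clock cycle at 1 GHz, as in the formula above.

```python
def average_memory_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Hit time plus the miss penalty weighted by how often a miss occurs."""
    return hit_time_ns + miss_rate * miss_penalty_ns


clock_hz = 1e9                # 1 GHz processor clock
cycle_ns = 1e9 / clock_hz     # one cycle lasts 1 ns (assumed hit time)

amat_ns = average_memory_access_time(
    hit_time_ns=cycle_ns,
    miss_rate=0.05,           # data cache miss rate from part (a)
    miss_penalty_ns=60.0,     # main memory access time
)
print(f"Average memory access time: {amat_ns:.2f} ns")
```

At a 1 ns cycle time, the result in nanoseconds is also the average number of cycles per data memory access.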
b) For the non-ideal memory system, the CPI for the load and store word instructions, weighted by their share of the instruction mix (25% loads, 10% stores), is:
CPI = (5 * 25 + 4 * 10) / 100
CPI = 1.65
(Store instructions take 4 cycles and load instructions take 5 cycles.)
c) The average CPI for the benchmark is:
ACPI = (5*25 + 4*10 + 3*11 + 3*2 + 4*52)/100
ACPI = 4.12
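The weighted sums in parts (b) and (c) can be reproduced with a short Python sketch. It only repeats the arithmetic as written, using the per-instruction cycle counts and percentages from the formulas above; it does not add any cache-miss stall cycles. The helper name `weighted_cpi` is an illustrative assumption.

```python
def weighted_cpi(classes):
    """Sum cycles * percentage over the instruction classes, divided by 100."""
    return sum(cycles * percent for cycles, percent in classes) / 100


# (cycles, percent of instruction mix), taken from the formulas above
load_store = [(5, 25), (4, 10)]                       # loads, stores
benchmark = load_store + [(3, 11), (3, 2), (4, 52)]   # remaining classes

cpi_load_store = weighted_cpi(load_store)
cpi_benchmark = weighted_cpi(benchmark)

print(f"Load/store contribution: {cpi_load_store:.2f}")
print(f"Benchmark average CPI:   {cpi_benchmark:.2f}")
```

Keeping the mix as a list of pairs makes it easy to check that the percentages sum to 100 before trusting the average.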
d) Now, for a 7% miss rate, the average CPI is:
ACPI = 4.12 + 2.45
ACPI = 6.57
(25 + 10 = 35 load and store instructions; 35 * 0.07 = 2.45)
Exercise 8.16 You are building a computer with a hierarchical memory system that consists of separate...
Consider a processor with a CPI of 1.5, excluding memory stalls. The instruction cache has a miss rate of 1.5%, whereas the miss rate of the data cache is 3.5%. The miss penalty of the data cache is 80 cycles. The percentage of load/store instructions within the running programs is 25%. If the CPI of the whole system, including memory stalls, is 2.5, calculate the miss penalty of the instruction cache. Miss penalty of the instruction cache- Cycles.
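One way to set up this problem is sketched below in Python. It assumes the usual stall model, in which every instruction makes one instruction-cache access and only load/store instructions access the data cache, so that total CPI = base CPI + I-miss rate * I-penalty + load/store fraction * D-miss rate * D-penalty. The function and parameter names are illustrative.

```python
def icache_miss_penalty(total_cpi, base_cpi, i_miss_rate,
                        d_miss_rate, d_penalty, load_store_fraction):
    """Solve the stall model for the instruction-cache miss penalty (cycles)."""
    # Stall cycles per instruction caused by data-cache misses
    data_stall = load_store_fraction * d_miss_rate * d_penalty
    # Whatever stall remains must come from instruction-cache misses
    instr_stall = total_cpi - base_cpi - data_stall
    return instr_stall / i_miss_rate


penalty = icache_miss_penalty(
    total_cpi=2.5,
    base_cpi=1.5,
    i_miss_rate=0.015,
    d_miss_rate=0.035,
    d_penalty=80,
    load_store_fraction=0.25,
)
print(f"Instruction cache miss penalty: {penalty:.1f} cycles")
```

Under these assumptions the data cache accounts for 0.7 stall cycles per instruction, leaving 0.3 for the instruction cache, which corresponds to a penalty of about 20 cycles.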
4B, 20%) Compare performance of a processor with cache vs. without cache. Assume an ideal processor with 1-cycle memory access, CPI = 1. Assume main memory access time of 8 cycles. Assume 40% of instructions require memory data access. Assume cache access time of 1 cycle. Assume hit rate 0.90 for instructions, 0.80 for data. Assume miss penalty (time to read memory into cache and from cache to processor) is 10 cycles. Compare execution times of 100-thousand instructions:
6. Memory Access Time [15 points] Consider a MIPS processor that includes a cache, a main memory, and a hard drive. Access times of cache memory, main memory, and hard drive are 5 ns, 200 ns, and 1000 ns, respectively. Assume that cache memory is divided into instruction cache and data cache. Assume that data cache has a 90% hit rate. Assume that main memory has a 98% hit rate and hard drive is perfect (it has a 100% hit...
Suppose you have a machine with separate I- and D-caches. The miss rate on the I-cache is 2.6%, and on the D-cache 3.8%. On an I-cache hit, the value can be read in the same cycle the data is requested. On a D-cache hit, one additional cycle is required to read the value. The miss penalty is 100 cycles for the D-cache and 150 cycles for the I-cache. 40% of the instructions on this RISC machine are LW or SW instructions,...
Consider a memory hierarchy using one of the three organizations for main memory shown in a figure below. Assume that the cache block size is 32 words, that the width of organization b is 4 words, and that the number of banks in organization c is 2. If the main memory latency for a new access is 10 cycles, sending address time is 1 cycle and the transfer time is 1 cycle, what are the miss penalties for each of...
Q4. CISC/RISC and Cache Memory (24pts) Q4-1. Assume that UltraSpark-like processor has an L1 cache with the following specifications: 40-bit wide address and 64-bit wide data busses; on-chip instruction cache; cache is 16K bytes, organized as a 2-way set associative; cache line (block) size = 64 bytes; 200 MHz clock frequency; average cache hit rate = 90%; instructions located in cache execute in 1 clock cycle; instructions that are not found in the on-chip cache will cause the processor to...
virtual memory support into our baseline 5-stage MIPS pipeline using the TLB miss handler. Assume that accessing the TLB does not incur an extra cycle in memory access in case of hits. Without virtual memory support (i.e. she had only a single address space for the entire system, or a physical address is same as a logical address), the average cycles per instruction (CPI) was 2 to run Program X. If the TLB misses 10 times for instructions and 20...
2. Cache hierarchy You are building a computer system with in-order execution that runs at 1 GHz and has a CPI of 1, with no memory accesses. The memory system is a split L1 cache. Both the I-cache and the D-cache are direct mapped and hold 32 KB each, with a block...
Table 1:
Load 26%; Compare 14%; Shift left and shift right 4%
Store 9%; Load immediate 4%; AND 3%
Add 14%; Conditional branch 17%; OR 5%
Sub 0%; Jump 1%; Other register-register instructions (XOR, NOT, etc.) 1%
Multiply 0%; Call 1%
Divide 0%; Return 1%
Using the data in Table 1, which of the following two enhancements will result in faster execution of the five benchmark programs that are described by the instruction frequency data? Assume that the computer used...
Computer Architecture
The format of this document is as follows: First, I give
a practice problem for which the solution is also provided. In bold
italic font, I slightly modify the problem for your
homework.
3) The 4-Stage Pipeline below suffers from the memory access
resource conflict as shown below (instruction i and i+2 want to
access memory at the same time and i+2 needs to be denied, so it
waits for the next cycle; in the next cycle it...