Consider a 64-bit computer with a simplified memory hierarchy.
This hierarchy contains a single cache and an unbounded backing
memory. The cache has the following characteristics:
• Direct-mapped, write-through, write-allocate.
• Cache blocks are 4 words each.
• The cache has 256 sets.
(a) Calculate the cache’s size in bytes.
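One way to approach (a), offered as a sketch rather than an official solution: the data capacity is the product of the word size, the words per block, the number of sets, and the associativity. The question does not state the word size; the function below takes it as a parameter, and the usage note assumes that a word on a 64-bit machine is 8 bytes. The name cache_size_bytes is made up for illustration.

```c
#include <stddef.h>

/*
 * Data capacity of a cache in bytes. Tag and valid bits are not counted.
 * word_bytes is an assumption supplied by the caller: the question only
 * says "64-bit computer", which is read here as 8-byte words.
 */
size_t cache_size_bytes(size_t word_bytes, size_t words_per_block,
                        size_t sets, size_t ways)
{
    return word_bytes * words_per_block * sets * ways;
}
```

Under that assumption, cache_size_bytes(8, 4, 256, 1) gives 8192 bytes (8 KiB): each block is 32 bytes and a direct-mapped cache holds one block per set.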
(b) Consider the following code fragment in the C programming language to be run on the described computer. Assume that: program instructions are not stored in cache, arrays are cache-aligned (the beginning of the array aligns with the beginning of a cache line), ints are 32 bits, and all other variables are stored only in registers.
int N = 32768;
int A[N];
for (int i = 0; i < N; i += 2) {
    A[i] = A[i+1];
}
Determine the following:
(i) The number of cache misses.
(ii) The cache miss rate.
(iii) The type of cache misses which occur.
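A minimal sketch of how the counts in (b) could be checked by replaying the loop against a model of the cache. It assumes 32-byte blocks (4 words of 8 bytes) and places A at address 0, which is a hypothetical but cache-aligned base. Because the cache is write-allocate, a store that misses brings the block in just as a load would, so loads and stores are treated alike for counting; write-through only affects traffic to memory, not hits and misses. All names here are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_BYTES 32u   /* assumed: 4 words x 8 bytes per word */
#define NUM_SETS    256u

struct cache {
    bool          valid[NUM_SETS];
    uint64_t      tag[NUM_SETS];
    unsigned long accesses;
    unsigned long misses;
};

/* Look up one byte address in a direct-mapped, write-allocate cache. */
static void cache_access(struct cache *c, uint64_t addr)
{
    uint64_t block = addr / BLOCK_BYTES;
    unsigned set   = (unsigned)(block % NUM_SETS);
    uint64_t tag   = block / NUM_SETS;

    c->accesses++;
    if (!c->valid[set] || c->tag[set] != tag) {
        c->misses++;              /* miss: fetch the block into this set */
        c->valid[set] = true;
        c->tag[set]   = tag;
    }
}

/* Replays the memory accesses of the loop in part (b). */
struct cache simulate_part_b(void)
{
    struct cache c = {0};
    const uint64_t base_a = 0;    /* hypothetical, cache-aligned address of A */
    const int n = 32768;

    for (int i = 0; i < n; i += 2) {
        cache_access(&c, base_a + 4u * (uint64_t)(i + 1)); /* load  A[i+1] */
        cache_access(&c, base_a + 4u * (uint64_t)i);       /* store A[i]   */
    }
    return c;
}
```

Under these assumptions simulate_part_b() reports 32768 accesses and 4096 misses, a miss rate of 12.5%. Each 32-byte block holds 8 ints, so the first of the 8 accesses to a block misses and the other 7 hit. No block is revisited after eviction, so every miss is a compulsory (cold) miss.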
(c) Consider the following code fragment in the C programming language to be run on the described computer. Assume that: program instructions are not stored in cache, arrays are cache-aligned (the beginning of the array aligns with the beginning of a cache line), ints are 32 bits, and all other variables are stored only in registers.
int N = 32768;
int A[N];
int B[N];
for (int i = 0; i < N; ++i) {
    B[i] = A[i];
}
Determine the following:
(i) The number of cache misses.
(ii) The cache miss rate.
(iii) The type of cache misses which occur.
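For (c) the result depends on where B sits relative to A, which the question does not specify. The sketch below assumes B is placed immediately after A, 131072 bytes later. With an assumed 8 KiB cache (32-byte blocks, 256 sets) that offset is a multiple of the cache size, so A[i] and B[i] always map to the same set. The code classifies a miss as compulsory when the block has never been loaded before and as a conflict miss otherwise; the names and base addresses are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    BLOCK_BYTES  = 32,      /* assumed: 4 words x 8 bytes per word */
    NUM_SETS     = 256,
    INT_BYTES    = 4,
    N            = 32768,
    TOTAL_BLOCKS = 2 * N * INT_BYTES / BLOCK_BYTES  /* blocks spanned by A and B */
};

struct miss_stats {
    unsigned long accesses;
    unsigned long compulsory;
    unsigned long conflict;
};

/* Replays the loop of part (c) on a direct-mapped, write-allocate cache. */
struct miss_stats simulate_part_c(void)
{
    int64_t resident[NUM_SETS];          /* block held by each set, -1 if empty */
    bool seen[TOTAL_BLOCKS] = { false }; /* has this block ever been loaded?    */
    struct miss_stats s = { 0, 0, 0 };
    const uint64_t base_a = 0;                                /* hypothetical */
    const uint64_t base_b = base_a + (uint64_t)N * INT_BYTES; /* assumed: B follows A */

    for (int k = 0; k < NUM_SETS; k++)
        resident[k] = -1;

    for (int i = 0; i < N; i++) {
        /* Each iteration loads A[i], then stores B[i]. */
        uint64_t addr[2] = { base_a + (uint64_t)INT_BYTES * (uint64_t)i,
                             base_b + (uint64_t)INT_BYTES * (uint64_t)i };
        for (int k = 0; k < 2; k++) {
            int64_t block = (int64_t)(addr[k] / BLOCK_BYTES);
            int set = (int)(block % NUM_SETS);

            s.accesses++;
            if (resident[set] == block)
                continue;                /* hit */
            if (seen[block])
                s.conflict++;            /* block was evicted by the other array */
            else
                s.compulsory++;          /* first touch of this block */
            seen[block] = true;
            resident[set] = block;
        }
    }
    return s;
}
```

With this layout simulate_part_c() reports 65536 accesses, all of them misses (a 100% miss rate): 8192 compulsory misses, one per block of A and B, and 57344 conflict misses caused by the two arrays evicting each other from the same set. If B were offset from A by something other than a multiple of the cache size, the conflict misses would disappear and only the 8192 compulsory misses would remain.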



