Question

1. (10 points) Suppose you have a load-store computer with the following instruction mix:

   Operation   Frequency   Number of clock cycles
   ALU ops     40%         2
   Loads       20%         4
   Stores      18%         4
   Branches    22%         6

The ALU ops (arithmetic logic unit ops) typically use operands in CPU registers, and hence they take fewer clock cycles to execute. However, if you want to add a memory operand to a CPU register, you have to explicitly load it into a CPU register first. Such ALU operations are said to be paired with a load instruction, since the value moved from memory is used only for that particular ALU operation and nowhere else.

We observe that 30% of the ALU ops are paired with a load (i.e., they occur together), and we propose to replace these ALU ops and their loads with a new instruction. Assume that this new instruction takes 4 clock cycles. However, with the new instruction added, branches will take 8 clock cycles rather than 6. Assuming that the clock rate is unchanged, would this change improve performance? Justify your answer quantitatively by comparing expressions for CPU execution time. Show all your work.

Answer #1

First, let's find the CPU execution time for the given machine; call it CPU1.

average number of cycles per instruction (CPI) = total number of cycles / total number of instructions

= 0.4 * 2 + 0.2 * 4 + 0.18 * 4 + 0.22 * 6

= 0.8 + 0.8 + 0.72 + 1.32

= 3.64 cycles/instruction

CPU1 execution time = number of instructions * average number of cycles/instruction * cycle time

= 1 * 3.64 * t    (instruction count normalized to 1, cycle time = t)

= 3.64t
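
The weighted sum for CPU1 can be checked with a few lines of Python. This is only a minimal sketch: the names `base_mix` and `weighted_cycles` are illustrative, and the cycle counts are the ones used in the calculation above (2 for ALU ops, 4 for loads and stores, 6 for branches).

```python
# Baseline instruction mix: name -> (fraction of instructions, cycles each).
# The names and the dictionary layout are illustrative assumptions.
base_mix = {
    "alu": (0.40, 2),
    "load": (0.20, 4),
    "store": (0.18, 4),
    "branch": (0.22, 6),
}


def weighted_cycles(mix):
    """Return the sum of fraction * cycles over all instruction classes."""
    return sum(frac * cycles for frac, cycles in mix.values())


cpi1 = weighted_cycles(base_mix)
print(f"CPU1 CPI = {cpi1:.2f}")  # prints "CPU1 CPI = 3.64"
```

With the instruction count normalized to 1, this CPI is also the CPU1 execution time in units of t.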

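Now consider CPU2, the machine with the new instruction.

Here, 30% of the ALU ops are paired with a load, and each such pair is replaced by one new instruction that takes 4 clock cycles. Let's call these new instructions ALU2 ops.

fraction of the original instructions that become ALU2 ops = 30% of 40%

= 0.3 * 0.4 = 0.12, i.e. 12% overall

So the remaining 28% are ordinary ALU ops, which still take 2 cycles. The 12% of loads that were paired are absorbed into the ALU2 ops, so only 20% - 12% = 8% remain as loads. Stores stay at 18%, and branches stay at 22% but now take 8 cycles.

number of instructions = 0.28 + 0.12 + 0.08 + 0.18 + 0.22 = 0.88    (relative to CPU1's count of 1)

total number of cycles = 0.28 * 2 + 0.12 * 4 + 0.08 * 4 + 0.18 * 4 + 0.22 * 8

= 0.56 + 0.48 + 0.32 + 0.72 + 1.76

= 3.84

average number of cycles per instruction = 3.84 / 0.88 = 4.36 cycles/instruction (approximately)

CPU2 execution time = number of instructions * average number of cycles/instruction * cycle time

= 0.88 * (3.84 / 0.88) * t

= 3.84t    (cycle time doesn't change)

Since 3.84t > 3.64t, CPU2 takes about 5.5% longer than CPU1, so the change does not improve performance. The merged instruction saves 0.12 * (2 + 4 - 4) = 0.24 cycles, but the slower branches cost 0.22 * (8 - 6) = 0.44 cycles, a net loss of 0.20 cycles per original instruction.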
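
The comparison can be cross-checked in Python. The sketch below assumes that the 12% of loads paired with an ALU op disappear together with that ALU op, so the load share falls to 8% and the instruction count to 0.88 of the original. All names (`before`, `after`, `alu_load`, `total_cycles`, `instruction_count`) are illustrative, and the ALU cost of 2 cycles and baseline branch cost of 6 cycles are the values used in the CPU1 calculation.

```python
# Each entry: name -> (count relative to the original instruction count, cycles).
before = {
    "alu": (0.40, 2),
    "load": (0.20, 4),
    "store": (0.18, 4),
    "branch": (0.22, 6),
}

# ALU ops that are paired with a load: 30% of 40% = 12% of all instructions.
paired = 0.30 * 0.40

after = {
    "alu": (0.40 - paired, 2),    # ALU ops that stay as they are
    "alu_load": (paired, 4),      # new combined instruction (hypothetical name)
    "load": (0.20 - paired, 4),   # paired loads are absorbed (assumption)
    "store": (0.18, 4),
    "branch": (0.22, 8),          # branches slow down from 6 to 8 cycles
}


def total_cycles(mix):
    """Total cycles per original instruction, i.e. time in units of N * t."""
    return sum(count * cycles for count, cycles in mix.values())


def instruction_count(mix):
    """Instruction count relative to the original program."""
    return sum(count for count, _ in mix.values())


time_before = total_cycles(before)
time_after = total_cycles(after)
cpi_after = time_after / instruction_count(after)

print(f"before: {time_before:.2f} N*t")
print(f"after:  {time_after:.2f} N*t, CPI {cpi_after:.2f}")
print("improves performance:", time_after < time_before)
```

Under these assumptions the script reports 3.64 N*t before and 3.84 N*t after the change, so the modified machine is slower even though it executes fewer instructions.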
