
- •V. Ya. Krakovsky, M. B. Fesenko
- •In Computer Systems and Networks
Questions for Self-Testing
1. What registers are used to interact between the CPU and main memory?
2. What is the difference between synchronous and asynchronous transfer mechanisms?
3. How are addition and subtraction performed in a computer with one bus?
4. Why do multiplication and division take longer than other instructions?
5. How is data interchange between the CPU and main memory organized?
1.3. Register Gating and Timing of Data Transfers
Let us consider the case where each bit of the registers in Figs. 1.3 and 1.4 consists of a simple latch, as shown in Fig. 1.5. The storage element shown is assumed to be one of the bits of register Z. While the control input Zin is equal to 1, the state of the latch changes to correspond to the data on the bus. Following a 1-to-0 transition at the Zin input, the data stored in the latch immediately before this transition is locked in until Zin is again set to 1. Thus the two input gates of the latch implement the function of the input control switch in Fig. 1.4.
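The latch behaviour described above can be sketched as a small behavioural model. This is only an illustration, not the circuit of Fig. 1.5: the class name `GatedLatch` and the method `step` are hypothetical, and gate-level delays are ignored.

```python
class GatedLatch:
    """One bit of register Z: follows the bus while z_in is 1, holds while z_in is 0."""

    def __init__(self, initial=0):
        self.q = initial  # stored bit

    def step(self, z_in, bus_bit):
        # While the control input is 1, the latch state tracks the bus.
        if z_in == 1:
            self.q = bus_bit
        # While it is 0, the value captured before the 1-to-0 transition is kept.
        return self.q


latch = GatedLatch()
latch.step(z_in=1, bus_bit=1)  # transparent: the latch stores 1
latch.step(z_in=0, bus_bit=0)  # locked: the change on the bus is ignored
print(latch.q)                 # prints 1
```

Setting `z_in` back to 1 makes the latch follow the bus again, which is the role the text assigns to the input control switch.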
When a given switch is in the ON state, it transfers the contents of its corresponding register to the bus. When it is in the OFF state, it is electrically disconnected from the bus: it does not put the bus in any specific state, which allows another register to place data on the bus. Hence the output of the register-switch combination can be in one of three states: 1, 0, or open circuit. Because it supports these three possibilities, such a gate is said to have a three-state output. A separate control input is used either to enable the gate output or to put it in a high-impedance (electrically disconnected) state. The latter corresponds to the open-circuit state of a mechanical switch.
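As a rough illustration of the three-state idea, the bus value can be modelled as a function of the enabled outputs. The function name `resolve_tristate_bus` is hypothetical, `None` stands in for the high-impedance (floating) bus, and treating two simultaneously enabled outputs as an error is a modelling assumption rather than something stated in the text.

```python
def resolve_tristate_bus(drivers):
    """drivers: list of (enable, bit) pairs, one per register output gate.

    Returns 0 or 1 if exactly one gate is enabled, or None if every gate
    is in the high-impedance state and the bus is left floating.
    """
    active = [bit for enable, bit in drivers if enable]
    if len(active) > 1:
        # Assumption: enabling two outputs at once is treated as a design error.
        raise ValueError("bus contention: more than one output enabled")
    return active[0] if active else None


# Only the second register is gated onto the bus; the others are disconnected.
print(resolve_tristate_bus([(0, 1), (1, 0), (0, 1)]))  # prints 0
```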
An alternative design for the common bus of Fig. 1.4 that does not require the output switches shown makes use of open-collector (for bipolar) or open-drain (for MOS) gates. The output of such a gate is equivalent to a switch to ground. The switch is open when the gate output is in the 1 state and closed when it is in the 0 state. The structure of an open-collector bus is represented symbolically in Fig. 1.6.
When idle, the bus is maintained in the 1 state by the "pull-up" resistor shown. Thus, as long as all gate output switches are open, that is, all outputs are in the 1 state, the bus remains in the 1 state. If any gate output changes to the 0 state, the corresponding output switch is closed, and the bus is "pulled down" to the 0 state. In other words, the bus performs an AND function on all gate outputs connected to it. This is sometimes referred to as a "wired-AND" connection. If this gating arrangement is used, an open-collector NAND gate, as shown, may replace the three-state output gate of Fig. 1.5. When Zout is high (1), the bit stored in the latch is fed to the bus. When Zout is low (0), the bus is left in the 1, or idle, state, allowing data from another register to be transferred to the bus. In general, the three-state design enables faster data transfers than the open-collector, or open-drain, approach. For this reason, it is much more commonly used in bus design. The main distinguishing feature of an open-collector bus is its wired-AND capability. Hence, the open-collector arrangement is used primarily for bus lines where this capability is needed.
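The wired-AND behaviour can be sketched in a few lines. The helper names `open_collector_output` and `wired_and_bus` are hypothetical, and the model captures only the logical behaviour stated in the text, not the internals of the NAND gate or any electrical detail.

```python
def open_collector_output(z_out, stored_bit):
    """Gate output as described in the text: the stored bit when z_out is 1,
    otherwise 1 (output switch open, bus left idle)."""
    return stored_bit if z_out == 1 else 1


def wired_and_bus(gate_outputs):
    """Bus level with a pull-up resistor: 1 unless some gate pulls the line to 0."""
    return int(all(g == 1 for g in gate_outputs))


# Three registers share one bus line; only the second has its output enabled.
outputs = [
    open_collector_output(0, 0),  # disabled: leaves the bus at 1
    open_collector_output(1, 0),  # enabled: pulls the bus down to 0
    open_collector_output(0, 1),  # disabled: leaves the bus at 1
]
print(wired_and_bus(outputs))  # prints 0
```

With no gate enabled, `wired_and_bus` returns 1, which corresponds to the idle state maintained by the pull-up resistor.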
Let us now discuss some aspects of the timing of data transfers inside the CPU. From the time the signal R2out is set to 1, a finite delay is encountered for the gate to open and then for the data to travel along the bus to the input of the ALU. The ALU adder circuits introduce further delay. For the result to be properly stored in register Z, data should be maintained on the bus for an additional period of time equal to the setup and hold times for this register. This situation is depicted in the timing diagram given in Fig. 1.7. The sum of the five delay times shown defines the minimum duration of the signal R2out.
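The timing constraint can be written compactly as follows. The symbols are labels introduced here for illustration only, and the identification of the five delays (gate enable, bus propagation, ALU, setup, hold) is inferred from the description in the text rather than read from Fig. 1.7.

```latex
t_{\mathrm{R2out}} \;\ge\; t_{\mathrm{gate}} + t_{\mathrm{bus}} + t_{\mathrm{ALU}} + t_{\mathrm{setup}} + t_{\mathrm{hold}}
```

Here $t_{\mathrm{gate}}$ is the time for the output gate to open, $t_{\mathrm{bus}}$ the propagation delay along the bus to the ALU input, $t_{\mathrm{ALU}}$ the delay of the adder circuits, and $t_{\mathrm{setup}}$ and $t_{\mathrm{hold}}$ the setup and hold times of register Z.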