
- •V. Ya. Krakovsky, m. B. Fesenko
- •In Computer Systems and Networks
- •Contents
- •Preface
- •Introduction
- •Module I. Basic Components of Digital Computers
- •1. The Structure of a Digital Computer
- •1.1. Introduction to Digital Computers
- •Questions for Self-Testing
- •1.2. The Computer Work Stages Implementation Sequence
- •Questions for Self-Testing
- •1.3. Register Gating and Timing of Data Transfers
- •Questions for Self-Testing
- •1.4. Computer Interface Organization
- •Questions for Self-Testing
- •1.5. Computer Control Organization
- •Questions for Self-Testing
- •1.6. Function and Construction of Computer Memory
- •Questions for Self-Testing
- •1.7. Architecturally-Structural Memory Organization Features
- •Questions for Self-Testing
- •2. Data processing fundamentals in digital computers
- •2.1. Element Base Development Influence on Data Processing
- •Questions for Self-Testing
- •2.2. Computer Arithmetic
- •Questions for Self-Testing
- •2.3. Operands Multiplication Operation
- •Questions for Self-Testing
- •2.4. Integer Division
- •Questions for Self-Testing
- •2.5. Floating-Point Numbers and Operations
- •Questions for Self-Testing
- •Questions for Self-Testing on Module I
- •Problems for Self-Testing on Module I
- •Module II. Digital computer organization
- •3. Processors, Memory, and the Evolution System of Instructions
- •3.1. Cisc and risc Microprocessors
- •Questions for Self-Testing
- •3.2. Pipelining
- •Questions for Self-Testing
- •3.3. Interrupts
- •Questions for Self-Testing
- •3.4. Superscalar Processing
- •Questions for Self-Testing
- •3.5. Designing Instruction Formats
- •Questions for Self-Testing
- •3.6. Building a Stack Frame
- •Questions for Self-Testing
- •4. The Structures of Digital Computers
- •4.1. Microprocessors, Microcontrollers, and Systems
- •Questions for Self-Testing
- •4.2. Stack Computers
- •Questions for Self-Testing
- •Questions for Self-Testing
- •4.4. Features of Organization Structure of the Pentium Processors
- •Questions for Self-Testing
- •4.5. Computers Systems on a Chip
- •Multicore Microprocessors.
- •Questions for Self-Testing
- •4.6. Principles of Constructing Reconfigurable Computing Systems
- •Questions for Self-Testing
- •4.7. Types of Digital Computers
- •Questions for Self-Testing
- •Questions for Self-Testing on Module II
- •Problems for Self-Testing on Module II
- •Module III. Parallelism and Scalability
- •5. Super Scalar Processors
- •5.1. The sparc Architecture
- •Questions for Self-Testing
- •5.2. Sparc Addressing Modes and Instruction Set
- •Questions for Self-Testing
- •5.3. Floating-Point on the sparc
- •Questions for Self-Testing
- •5.4. The sparc Computers Family
- •Questions for Self-Testing
- •6. Cluster Superscalar Processors
- •6.1. The Power Architecture
- •Questions for Self-Testing
- •6.2. Multithreading
- •Questions for Self-Testing
- •6.3. Power Microprocessors
- •Questions for Self-Testing
- •6.4. Microarchitecture Level Power-Performance Fundamentals
- •Questions for Self-Testing
- •6.5. The Design Space of Register Renaming Techniques
- •Questions for Self-Testing
- •Questions for Self-Testing on Module III
- •Problems for Self-Testing on Module III
- •Module IV. Explicitly Parallel Instruction Computing
- •7. The itanium processors
- •7.1. Parallel Instruction Computing and Instruction Level Parallelism
- •Questions for Self-Testing
- •7.2. Predication
- •Questions for Self-Testing
- •Questions for Self-Testing
- •7.4. The Itanium Processor Microarchitecture
- •Questions for Self-Testing
- •7.5. Deep Pipelining (10 stages)
- •Questions for Self-Testing
- •7.6. Efficient Instruction and Operand Delivery
- •Instruction bundles capable of full-bandwidth dispersal
- •Questions for Self-Testing
- •7.7. High ilp Execution Core
- •Questions for Self-Testing
- •7.8. The Itanium Organization
- •Implementation of cache hints
- •Questions for Self-Testing
- •7.9. Instruction-Level Parallelism
- •Questions for Self-Testing
- •7.10. Global Code Scheduler and Register Allocation
- •Questions for Self-Testing
- •8. Digital computers on the basic of vliw
- •Questions for Self-Testing
- •8.2. Synthesis of Parallelism and Scalability
- •Questions for Self-Testing
- •8.3. The majc Architecture
- •Questions for Self-Testing
- •8.4. Scit – Ukrainian Supercomputer Project
- •Questions for Self-Testing
- •8.5. Components of Cluster Supercomputer Architecture
- •Questions for Self-Testing
- •Questions for Self-Testing on Module IV
- •Problems for Self-Testing on Module IV
- •Conclusion
- •List of literature
- •Index and Used Abbreviations
- •03680. Київ-680, проспект Космонавта Комарова, 1.
Questions for Self-Testing
1. How is multiplication performed in a positional number system?
2. What are the peculiarities of fast multiplication algorithm realization?
3. What does the multiplication speed depend on?
4. What is the idea of the Booth algorithm?
5. Why, with appearing systems-on-chip, has the return to old methods of multiplication operation performing without using the method of increasing been become?
2.4. Integer Division
We discussed positive-number multiplication by relating the way in which the multiplication operation can be done on paper to the way a logic circuit can do it. It is instructive to follow the same strategy here. We will only discuss positive-number division in detail. Some general comments on the signed-operand case will be made later.
F
ig.
2.21 shows an example of decimal division along with the binary-coded
division of the same values. Consider the decimal version. The 2 in
the quotient is determined by the following reasoning. We first "try"
13 divided into 2 and it "doesn't go". Next we "try"
13 divided into 27. We go through the trial exercise of multiplying
13 by 2 to get 26, and knowing that 27-26 = 1 is less than 13, we
finally enter 2 as the quotient and perform the required subtraction.
The next digit of the dividend, 4, is "brought down" and we
finish by deciding that 13 goes into 14 once, with the remainder
being determined as 1. A similar discussion can be given for the
binary case, with the simplification that the only possibilities for
the quotient bits are 0 and 1.
T
he
important point to conclude from the above division "algorithm"
is that the process of trial determination of the quotient digits is
more difficult to automate in a logic circuit than the selection and
addition of summands in the multiplication case. The simplest circuit
that implements binary division must methodically position the
divisor with respect to the dividend and perform a subtraction. If
the remainder is zero or positive, a quotient bit of 1 is determined,
the remainder is extended by another bit of the dividend, the divisor
is repositioned, and another subtraction is performed. On the other
hand, if the remainder is negative, a quotient bit of 0 is
determined, the dividend is restored by adding back the divisor, and
the divisor is repositioned for another subtraction. Fig. 2.22 shows
a logic circuit arrangement that implements the above
restoring-division
technique. Note its similarity to the structure for multiplication
that was shown in Fig. 2.8. An n-bit
positive divisor is loaded into register M and an n-bit
positive dividend is loaded into register Q at the start of the
operation. Register A is set to 0. After the division is complete,
the n-bit
quotient will be in register Q with the remainder in A.
The required subtractions are facilitated by using 2's-complement arithmetic. The extra bit position at the left end of both A and M is for the sign bit for the subtractions. The following algorithm performs the division:
S1: DO n times
Shift A and Q left one binary position.
Subtract M from A, placing the answer back in A.
If the sign of A is 1, set q0 to 0 and add M back to A (restore A); otherwise, set q0 to 1.
Fig. 2.23 shows a 4-bit example as it would be processed by the circuit in Fig. 2.22. It is possible to improve on this algorithm by avoiding the need for restoring A after an unsuccessful subtraction. (We define a subtraction to be unsuccessful if the result is negative.) Consider the sequence of operations that takes place after the subtraction operation in the above algorithm. If A is positive, we shift it left and subtract M; that is, we perform 2A - M. If A is negative, we restore it by performing A + M, and then we shift it left and subtract M. This is equivalent to performing 2A + M. The q0 bit is appropriately set to 0 or 1 after the correct operation has been performed. We can summarize this into the following nonrestoring-division algorithm:
S1: DO n times
If the sign of A is 0, shift A and Q left one binary position and subtract M from A; otherwise, shift A and Q left and add M to A.
If the sign of A is 0, set q0 to 1; otherwise, set q0 to 0.
S2: If the sign of A is 1, add M to A.
The S2 step is needed to leave the proper positive remainder in A at the end of n cycles. It is clear that the logic circuitry of Fig. 2.22 can also be used to perform this algorithm as well as the original one. Note that the restore operations are not used and exactly one add or subtract operation is performed per cycle. Fig. 2.24 shows the division example of Fig. 2.23 is executed by the nonrestoring-division algorithm.
There are no simple algorithms for performing signed division that are comparable to the multiplication situation. In division, preprocessing of the operands and/or postprocessing of the results are usually required. These extra operations depend on the sign of the operands in conjunction with the desired signs of the results, and complementation of one or more numbers is usually involved. It is useful to note that we can always transform the operands to positive values, use one of the algorithms discussed above, and transform the results to the correct signed values as necessary.