

5 Saturation Arithmetic
[Figure: Speed / Area design space for 4th order narrow bandpass elliptic IIR filter. Horizontal axis: area / LCs, ticks 700 to 1100. Vertical axis: min Tclk / s, ticks 0.08 to 0.18. Labelled design approaches: (a) comb optimized; (b) uniform / l1; (c) uniform / sim; (d) wl optimized / l1; (e) wl optimized / sim.]
Fig. 5.19. Alternative design approaches, and their speed / area design-space locations
5.7 Summary
This chapter has presented a novel technique for design automation of saturation arithmetic systems. An analytic saturation noise estimation method has been presented, based on the introduction of the saturated Gaussian distribution and a linearization of the saturation nonlinearities. In contrast to the treatment of truncation and rounding, auto- and cross-correlations between linearized saturation nonlinearities have been accounted for using a bound derived through the Cauchy-Schwarz inequality. Techniques have been presented to reduce the slackness associated with such a bound.
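As a numerical illustration of why a Cauchy-Schwarz bound is useful here, the following minimal sketch clips two correlated Gaussian signals and compares the measured power of the summed saturation errors with the bound (sum of the individual RMS errors, squared). It is not the estimation method of this chapter, which works analytically on linearized saturation nonlinearities. The function names, the correlation coefficient of 0.8 and the saturation limit of 2.0 are all assumptions chosen for illustration.

```python
import math
import random


def saturate(x, limit):
    """Clip x to the symmetric range [-limit, limit]."""
    return max(-limit, min(limit, x))


def saturation_error(samples, limit):
    """Error introduced by saturation: saturated value minus ideal value."""
    return [saturate(x, limit) - x for x in samples]


def mean_square(values):
    """Sample estimate of E[v^2]."""
    return sum(v * v for v in values) / len(values)


def cauchy_schwarz_bound(error_signals):
    """Bound on E[(sum_i e_i)^2] that holds for any correlation between the e_i.

    Cauchy-Schwarz gives |E[e_i e_j]| <= sqrt(E[e_i^2] E[e_j^2]), so the
    power of the sum is at most (sum_i sqrt(E[e_i^2]))^2.
    """
    return sum(math.sqrt(mean_square(e)) for e in error_signals) ** 2


# Two unit-variance Gaussian signals with correlation 0.8 (illustrative values).
rng = random.Random(0)
n = 20000
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [0.8 * a + 0.6 * rng.gauss(0.0, 1.0) for a in x1]

e1 = saturation_error(x1, 2.0)
e2 = saturation_error(x2, 2.0)
summed = [a + b for a, b in zip(e1, e2)]

actual = mean_square(summed)
bound = cauchy_schwarz_bound([e1, e2])
print(f"measured power of summed errors: {actual:.6f}")
print(f"Cauchy-Schwarz bound           : {bound:.6f}")
```

The bound needs no knowledge of the correlation between the two error signals, which is what makes it convenient. The gap between the measured value and the bound is the slackness referred to above.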
The heuristic presented in Section 4.4 has been extended to incorporate combined scaling and word-length optimization. The results of such an optimization have been discussed for real examples of DSP systems and contrasted with more traditional approaches to scaling optimization. It has been shown that allowing rare saturation errors can result in fast and small implementations of IIR filters when the poles of the filter are close to the unit circle. Improvements of up to 8% in area and 24% in speed have been achieved, over and above the improvements generated through the techniques of Chapter 4.
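The trade-off behind "allowing rare saturation errors" can be made concrete with a small sketch. Assuming a signal that is zero-mean Gaussian with standard deviation sigma (an assumption made here for illustration, not a result taken from the chapter), the code below evaluates how often a saturator with limit 2^p fires and how much error power it injects as integer bits are removed. The function names and the chosen values of p are hypothetical.

```python
import math


def overflow_probability(sigma, limit):
    """P(|x| > limit) for x ~ N(0, sigma^2)."""
    return math.erfc(limit / (sigma * math.sqrt(2.0)))


def saturation_error_power(sigma, limit):
    """E[(sat(x) - x)^2] for x ~ N(0, sigma^2) clipped to [-limit, limit].

    Closed form: 2 * ((sigma^2 + limit^2) * Q(z) - limit * sigma * phi(z)),
    with z = limit / sigma, Q the Gaussian tail probability and phi the
    standard normal density.
    """
    z = limit / sigma
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return 2.0 * ((sigma ** 2 + limit ** 2) * tail - limit * sigma * pdf)


# Each integer bit removed halves the representable range (illustrative values).
sigma = 1.0
for integer_bits in (2, 1, 0):
    limit = 2.0 ** integer_bits
    p = overflow_probability(sigma, limit)
    power = saturation_error_power(sigma, limit)
    print(f"limit {limit:4.1f}: P(saturate) = {p:.3e}, error power = {power:.3e}")
```

Under this assumption the error power falls off very quickly as the limit grows, so a scaling chosen from a worst-case peak bound can spend integer bits guarding against events that almost never occur.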

[Figure: two copies of the filter structure with saturator locations marked: (a) scaling determined by simulation; (b) scaling determined by combined optimization. Legend: Saturator, degree 1; Saturator, degree 2; Saturator, degree 3.]
Fig. 5.20. Saturator locations and degrees for the 4th order narrow bandpass elliptic IIR filter