Division, Square Root, Reciprocal, and Reciprocal Square Root Intrinsic
Functions........................................................................................................... 118
Sources of NaN, and Potential Changes of NaN Sign and Payload.......................118
Auto-Vectorizing Scalar Floating-Point Programs...................................................118
Compiler Switches Related to Floating-point Programs..........................................118
Domain Expertise, Vector Data-types, C Instrinsics, and Libraries........................ 120
6.4 Accuracy and Robustness Optimizations on ConnX BBE32EP DSP...................... 120
Some Old-timers’ Impression................................................................................. 120
Comparisons and Measurements against Infinitely Precise Results...................... 121
Depending on Rounding Mode vs. Not; Standards Conforming vs. Beyond.......... 121
Computer Algebra, Algorithm Selections, Default Substitutions, and Exception
Flags.................................................................................................................. 122
Error Bounds, Interval Arithmetic, and Rounding Mode......................................... 123
FMA, Split Number................................................................................................. 123
6.5 Implementing the Floating Point FFT/IFFT on ConnX BBE32EP DSP Vector
Floating Point Unit (VFPU)........................................................................................124
Radix-4 FFT implementation.................................................................................. 124
Intrinsic Code for Vectroized Radix-4 Butterfly and Twiddle Multiplication ..... 125
Optimizations of Implementation of IFFT.........................................................125
Performance of FFT on ConnX BBE32EP DSP-VFPU................................... 126
Twiddle Size.................................................................................................... 127
6.6 Floating Point References........................................................................................128
7 Special Operations............................................................................................................129
7.1 Polynomial Evaluation..............................................................................................130
7.2 Matrix Computation..................................................................................................131
7.3 Pairwise Real Multiply Operation............................................................................. 132
7.4 Descramble Operations........................................................................................... 133
7.5 Vector Compression and Expansion........................................................................134
7.6 Predicated Vector Operations.................................................................................. 135
8 Load & Store Operations...................................................................................................139
8.1 ConnX BBE32EP DSP Addressing Modes..............................................................140
8.2 Aligning Loads and Stores....................................................................................... 140
8.3 Circular Addressing..................................................................................................142
8.4 Variable Element Vector Aligning Loads and Stores................................................143
8.5 Update Post-increment in Loads and Stores........................................................... 143
9 Nature DSP Signal Library................................................................................................ 145
10 Implementation Methodology.......................................................................................... 147
10.1 Configuring a ConnX BBE32EP DSP.................................................................... 148
10.2 XPG Estimation for Size, Performance and Power................................................150
10.3 Basic ConnX BBE32EP DSP Characteristics........................................................ 150
10.4 Extending a ConnX BBE32EP DSP with User TIE................................................ 150
Compiling User TIE.................................................................................................152
v