9.6. How to Manage Latency
The Primitive library blocks are untimed circuits, so they are not cycle accurate. A one-to-one mapping does not exist between the blocks in the Simulink model and the blocks you implement in your design in RTL. This decoupling of design intent from design implementation gives productivity benefits. The ChannelOut block is the boundary between the untimed section and the cycle-accurate section. This block models the additional delay that the RTL introduces, so that data going into the ChannelOut block is delayed internally before DSP Builder presents it externally. The latency of the block shows on the ChannelOut mask.
You may want to fix or constrain the latency after you complete part of a DSP Builder design, for example on an IP library block or for a Primitive subsystem. In other cases, you may want to limit the latency in advance, which allows future changes to other subsystems without causing undesirable effects on the overall design.
To accommodate extra latency, insert registers. This feature applies only to Primitive subsystems. To access it, use the SynthesisInfo block.
Latency is the number of delays in the valid signal across the subsystem. The DSP Builder advanced blockset balances delays in the valid and channel path with delays that DSP Builder inserts for autopipelining in the datapath.
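The balancing described above can be pictured with a small Python model. This is only an illustrative sketch, not DSP Builder's scheduler: the path names, the pipeline depth of 3, and the helper functions `delay` and `balance` are all assumptions made for the example.

```python
from collections import deque

def delay(samples, depth, fill=0):
    """Delay a sequence by `depth` cycles (a toy register chain)."""
    regs = deque([fill] * depth)
    out = []
    for s in samples:
        regs.append(s)
        out.append(regs.popleft())
    return out

def balance(pipeline_depths):
    """Pad every path up to the deepest one; return (latency, padding per path)."""
    latency = max(pipeline_depths.values())
    padding = {name: latency - d for name, d in pipeline_depths.items()}
    return latency, padding

# Hypothetical autopipelining result: 3 pipeline stages on the datapath only.
depths = {"data": 3, "valid": 0, "channel": 0}
latency, padding = balance(depths)

# Latency is read off the valid signal: cycles between the first
# asserted valid at the input and the first asserted valid at the output.
valid_in = [0, 1, 1, 1, 0, 0, 0, 0]
valid_out = delay(valid_in, depths["valid"] + padding["valid"])
measured = valid_out.index(1) - valid_in.index(1)

print(latency, padding, measured)
```

In this model the valid and channel paths each receive three balancing delays, so the latency measured on the valid signal matches the pipeline depth of the datapath.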
Note: User-inserted sample delays in the datapath are part of the algorithm, rather than pipelining, and are not balanced. However, any uniform delays that you insert across the entire datapath optimize out. If you want to constrain the latency across the entire datapath, you can specify this latency constraint in the SynthesisInfo block.
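One way to read the note is that only the relative delay between datapath branches carries algorithmic meaning, while a delay common to every branch does not. The Python sketch below illustrates that reading; it is an interpretation for illustration, not the tool's actual optimization, and the branch names and `split_uniform_delay` helper are hypothetical.

```python
def split_uniform_delay(user_delays):
    """Split per-branch user delays into a common part and a relative remainder.

    Toy interpretation: the common part is what would optimize out,
    the remainder is what stays as part of the algorithm.
    """
    common = min(user_delays.values())
    relative = {branch: d - common for branch, d in user_delays.items()}
    return common, relative

# The same delay on every branch: nothing algorithmic remains.
print(split_uniform_delay({"i_path": 2, "q_path": 2}))

# Unequal delays: the difference between branches is preserved.
print(split_uniform_delay({"i_path": 2, "q_path": 5}))
```

If a fixed overall latency is what you need, the note points to the latency constraint in the SynthesisInfo block rather than to uniform sample delays.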
- Reading the Added Latency Value for an IP Block
- Zero Latency Example
  In this example, sufficient delays in the design ensure that DSP Builder requires no extra automatic pipelining to reach the fMAX target (although DSP Builder distributes this user-added delay through the datapath).
- Implicit Delays in DSP Builder Designs
  The DSP Builder scheduler may add extra delays on paths between the ChannelIn and ChannelOut blocks. The extra latency is the same for all such paths and is displayed on the ChannelOut block.
- Distributed Delays in DSP Builder Designs
  Distributed delays are not cycle-accurate inside a primitive subsystem, because DSP Builder distributes and optimizes the user-specified delay. To consistently apply extra latency to a primitive subsystem, use latency constraints.
- Latency and fMAX Constraint Conflicts in DSP Builder Designs
  Some blocks need to have a minimum latency, either because of logical or silicon limitations. In these cases, you can create an abstracted design that cannot be realized in hardware.
- Control Units Delays
  Commonly, you may use an FSM to design control units. An FSM uses DSP Builder SampleDelay blocks to store its internal state.
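The conflict between a latency constraint and blocks that need a minimum latency can be sketched as a simple feasibility check. The Python below is a hedged illustration only: the per-block minimum latencies are made-up numbers, the path is assumed to be a plain chain of blocks, and `check_latency_constraint` is a hypothetical helper, not a DSP Builder function.

```python
def check_latency_constraint(min_latencies, constraint):
    """Return (feasible, required) for a chain of blocks on one path.

    `min_latencies` holds an assumed minimum latency, in cycles, for each
    block in the chain; the chain needs at least their sum.
    """
    required = sum(min_latencies)
    return required <= constraint, required

# Hypothetical chain of three blocks needing at least 2, 3, and 1 cycles.
chain = [2, 3, 1]

print(check_latency_constraint(chain, constraint=4))  # constraint too tight
print(check_latency_constraint(chain, constraint=8))  # constraint can be met
```

A constraint below the required total corresponds to the case the list describes, where the abstracted design cannot be realized in hardware.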