7.14.18. Multichannel QR Decomposition
To optimize overall throughput, the solver can interleave multiple data instances. The inputs to the design are the system matrices A [n × m] and the input vectors.
The reference design uses the Gram-Schmidt method to decompose the system matrix A into the matrices Q and R. It then calculates the solution of the system by backward substitution.
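The two steps above can be sketched in software as follows. This is an illustrative model only, not the reference design itself. It assumes the modified Gram-Schmidt variant (the text does not say which variant the design uses), a system with n ≥ m and full column rank, and Python double-precision floats rather than the single-precision hardware blocks. The function names are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def qr_gram_schmidt(A):
    """Decompose A (n rows, m columns) into Q and R with A = Q R.

    Uses modified Gram-Schmidt (an assumption, see the lead-in).
    Q is returned as a list of m orthonormal columns of length n,
    R as an m x m upper-triangular matrix (list of rows).
    Assumes full column rank; otherwise R[j][j] is zero and the
    division below fails.
    """
    n, m = len(A), len(A[0])
    columns = [[A[i][j] for i in range(n)] for j in range(m)]
    R = [[0.0] * m for _ in range(m)]
    Q = []
    for j in range(m):
        v = columns[j][:]
        # Remove the components along the columns of Q found so far.
        for k in range(j):
            R[k][j] = dot(Q[k], v)
            v = [vi - R[k][j] * qi for vi, qi in zip(v, Q[k])]
        R[j][j] = math.sqrt(dot(v, v))
        Q.append([vi / R[j][j] for vi in v])
    return Q, R


def back_substitute(R, y):
    """Solve R x = y for upper-triangular R, last row first."""
    m = len(y)
    x = [0.0] * m
    for i in range(m - 1, -1, -1):
        known = sum(R[i][j] * x[j] for j in range(i + 1, m))
        x[i] = (y[i] - known) / R[i][i]
    return x


def solve_qr(A, b):
    """Solve A x = b (least squares when n > m) through QR."""
    Q, R = qr_gram_schmidt(A)
    y = [dot(q, b) for q in Q]  # y = Q^T b
    return back_substitute(R, y)
```

For example, `solve_qr([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])` solves a 2 × 2 system whose exact solution is x = (0.8, 1.4). Nearly every operation in the sketch reduces to a dot product, which suggests why a shared dot product engine suits this algorithm.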
The reference design is fully parameterizable. You can set the system dimensions n and m, the processing vector size (which defines the parallelization ratio of the dot product engine), and the number of channels that the design processes in parallel. The design implements the parallel dot product engine with single-precision Multiply and Add blocks, which perform most of the floating-point calculations. A processor executes a fixed set of micro-instructions and generates operation indexes to route the different phases of the calculation through these blocks. The processor is built from for-loop macro blocks, which allow an efficient, flexible, high-level implementation of iterative operations.
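As a rough illustration of how the processing vector size relates to the dot product engine, the sketch below splits a dot product into passes of `vector_size` multiplications each. It is a software analogy under stated assumptions, not a description of the hardware: the function name `chunked_dot` and the idea that one pass corresponds to one trip through the parallel multipliers are hypothetical.

```python
def chunked_dot(u, v, vector_size):
    """Dot product computed in passes of at most vector_size terms.

    Returns (result, passes). Each pass models vector_size
    multiplications done side by side, followed by a sum of the
    products that is added to a running accumulator. This mapping
    to hardware is an assumption made for illustration.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    if vector_size < 1:
        raise ValueError("vector_size must be at least 1")
    accumulator = 0.0
    passes = 0
    for start in range(0, len(u), vector_size):
        stop = start + vector_size
        products = [a * b for a, b in zip(u[start:stop], v[start:stop])]
        accumulator += sum(products)
        passes += 1
    return accumulator, passes
```

In this model a vector of length 5 with `vector_size = 2` needs three passes, and the same vector with `vector_size = 5` needs one. A larger vector size therefore trades more parallel multipliers for fewer passes per dot product.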
The model file is demo_mcqrd.mdl.