A.2. Matrix Multiplication Library

Intel® High Level Synthesis Compiler Pro Edition: Reference Manual

ID 683349

Date 6/02/2023

The matrix multiplication source code library provided with the Intel® HLS Compiler Pro Edition is an FPGA-optimized, templatized library that multiplies two matrices, each stored in a 2-D array.

When you use the matrix multiplication library, you can control how many DSP blocks and RAM blocks your component uses by setting the dot product vector size and the number of matrix elements read at one time. Increasing the dot product vector size can reduce latency, but at the cost of more DSP blocks and other FPGA resources.
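The trade-off above can be illustrated with some simple arithmetic. The sketch below assumes a simplified model in which each output element needs one dot product over the columns of matrix A, and that dot product is processed in groups of "dot product vector size" elements. The helper name dot_product_steps is hypothetical and is not part of the library. The step count is only a rough indicator and not a latency figure reported by the compiler.

```cpp
#include <cstddef>

// Hypothetical helper (not part of HLS/matrix_mult.h).
// Assumed model: a dot product of length cols_a is processed
// dot_vec_size elements at a time, so it takes
// cols_a / dot_vec_size sequential steps.
constexpr std::size_t dot_product_steps(std::size_t cols_a,
                                        std::size_t dot_vec_size) {
  return cols_a / dot_vec_size;
}

// With 64 columns in matrix A, doubling the vector size from 8 to 16
// halves the number of steps while doubling the multipliers used per step.
static_assert(dot_product_steps(64, 8) == 8, "8 steps of 8 multiplies");
static_assert(dot_product_steps(64, 16) == 4, "4 steps of 16 multiplies");
```

Under this model, a larger vector size trades more parallel multipliers for fewer sequential steps, which matches the direction of the latency and area trade-off described above.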

Header File

To include the matrix multiplication library in your component, add the following line to your component:

#include "HLS/matrix_mult.h"

The header file is self-documented. You can review the header file to learn how to use the matrix multiplication library in your component.

Template Arguments

The matrix multiplication library multiplies two 2-D matrices, A and B. The resulting product is returned in a third matrix, C. The matrix multiplication library has the following template arguments:

T: The data type of the matrix elements (for example, int, float, long, or double).
t_rowsA: The number of rows in matrix A.
t_colsA: The number of columns in matrix A. This value is also the number of rows in matrix B.
t_colsB: The number of columns in matrix B.
DOT_VEC_SIZE: The number of DSP blocks to use in a single computation. This value must be a factor of t_colsA. Increasing this value can reduce component latency, but uses more FPGA area. Keeping this value low reduces FPGA resource usage, but increases latency.
BLOCK_SIZE: The number of elements to read at one time from matrix A. The default value of BLOCK_SIZE is the value of DOT_VEC_SIZE. You can reduce this number if the bandwidth needed by matrix A is lower than the value of DOT_VEC_SIZE, but it must remain a factor of DOT_VEC_SIZE.
RUNNING_SUM_MULT_L: You can adjust this parameter to try to improve the f_MAX of a component that uses this library. Review the header file for a detailed description of this argument and its effects.
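To make the relationships between these arguments concrete, the following is a minimal software reference model written in standard C++. It is not the library's implementation or its function signature: the name reference_matmul, the use of std::size_t for the size arguments, and the loop structure are all assumptions made for illustration. It shows the matrix shapes (A is t_rowsA by t_colsA, B is t_colsA by t_colsB, C is t_rowsA by t_colsB) and checks the two documented divisibility constraints at compile time. RUNNING_SUM_MULT_L is omitted because its behavior is described only in the header file.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference model, not the function provided by
// HLS/matrix_mult.h. It mirrors the documented template arguments only.
template <typename T, std::size_t t_rowsA, std::size_t t_colsA,
          std::size_t t_colsB, std::size_t DOT_VEC_SIZE,
          std::size_t BLOCK_SIZE = DOT_VEC_SIZE>
void reference_matmul(const T (&A)[t_rowsA][t_colsA],
                      const T (&B)[t_colsA][t_colsB],
                      T (&C)[t_rowsA][t_colsB]) {
  // Documented constraints on the template arguments.
  static_assert(DOT_VEC_SIZE > 0 && t_colsA % DOT_VEC_SIZE == 0,
                "DOT_VEC_SIZE must be a factor of t_colsA");
  static_assert(BLOCK_SIZE > 0 && DOT_VEC_SIZE % BLOCK_SIZE == 0,
                "BLOCK_SIZE must be a factor of DOT_VEC_SIZE");
  // BLOCK_SIZE is only validated here; this software model does not
  // model how many elements of A are read at one time.

  for (std::size_t i = 0; i < t_rowsA; ++i) {
    for (std::size_t j = 0; j < t_colsB; ++j) {
      T sum = T();
      // Walk the shared dimension in chunks of DOT_VEC_SIZE elements.
      for (std::size_t k0 = 0; k0 < t_colsA; k0 += DOT_VEC_SIZE) {
        T partial = T();
        for (std::size_t k = 0; k < DOT_VEC_SIZE; ++k) {
          partial += A[i][k0 + k] * B[k0 + k][j];
        }
        sum += partial;
      }
      C[i][j] = sum;
    }
  }
}

// Usage sketch: a 2x4 matrix times a 4x3 matrix with DOT_VEC_SIZE = 2.
inline void reference_matmul_demo() {
  const int A[2][4] = {{1, 2, 3, 4}, {5, 6, 7, 8}};
  const int B[4][3] = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}, {1, 1, 1}};
  int C[2][3] = {};
  reference_matmul<int, 2, 4, 3, 2>(A, B, C);
  assert(C[0][0] == 5 && C[1][2] == 15);
}
```

Choosing a DOT_VEC_SIZE that does not divide t_colsA (for example, 3 with t_colsA equal to 4) fails to compile in this model, which is one way to catch an invalid configuration early. For the actual interface and the exact meaning of each argument, the header file remains the authoritative source.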
