Block Compressed Sparse Row (BSR)
In some sparse matrices, the sparsity pattern can efficiently be viewed as a sparse matrix of dense blocks. This may come from many places, for instance when a sparse system is constructed using discrete Galerkin finite elements where elements primarily interact with the nodal points on their own element, so you have a block structure of interactions. There are many other places where structure of the sparsity pattern includes blocks of stored elements, so an extension of the CSR matrix to store blocks of dense data and only storing the CSR structure data for the smaller block pattern, meaning smaller memory footprint and potentially improved performance when this format makes sense. This format is called the Block Compressed Row format, which we abbreviate as BSR for short. Another common abbreviation is BCSR, but it means the same thing and we will stick with BSR.
Like CSR, we will use the 3-array variant (so when relevant, we can also refer to it as BSR3 or 3-array BSR. The matrix in BSR3 format is represented by scalar sizes : blk_nrows, blk_ncols, row_blk_size, col_blk_size and blk_nnz as well as an index_base parameter and three arrays: bsr_row_ptr, bsr_col_ind and bsr_values as described in BSR Matrix Format Elements.
Note that we could equally store the full non-block sizes nrows and ncols instead of the block sizes blk_nrows and blk_ncols. We chose this approach since we want the matrix to be fully blocked without having to worry about partial blocks on the edges. This choice makes that more obvious.
BSR Matrix Format Elements |
Description |
|---|---|
blk_nrows |
Number of block rows in the sparse matrix. |
blk_ncols |
Number of block columns in the sparse matrix. |
row_blk_size |
Number of rows for the dense blocks in the matrix. |
col_blk_size |
Number of columns for the dense blocks in the matrix. |
blk_nnz |
Number of stored block elements (sometimes called number of non-zero blocks) in the sparse matrix. |
blk_layout |
Parameter that is used to specify whether the dense blocks are stored in row-major or column-major layout. |
index |
Parameter that is used to specify whether the matrix has zero or one-based indexing. |
bsr_values |
An array that contains the blk_nnz * row_blk_size * col_blk_size stored element values of the sparse matrix stored block row by block row using blk_layout layout. |
bsr_col_ind |
An integer array of blk_nnz column block indices for the stored (sometimes called non-zero) block elements stored in the bsr_values array, such that bsr_col_ind[i] is the column number (using zero- or one-based indexing) of the block element of the sparse matrix stored in bsr_values[i * row_blk_size * col_blk_size]. |
bsr_row_ptr |
An integer array of size equal to blk_nrows + 1. Element j of this integer array gives the position of the block element in the bsr_col_ind array that is the first non-zero block element in the block row j of A and when scaled by size of blocks, also gives the position in bsr_values for that block. In other words, bsr_col_ind[ bsr_row_ptr[j]-index ] * col_blk_size is the column index of the first stored element in j ‘th block row of A and bsr_values[ (bsr_row_ptr[j]-index) * row_blk_size * col_blk_size ] is the first stored value in that same block row. The last element of the bsr_row_ptr array (bsr_row_ptr[blk_nrows]) stores the sum of blk_nnz and index. That is, blk_nnz = bsr_row_ptr[blk_nrows] - index. |
One may note that there is no leading dimension for the blocks in the format definition, meaning that the leading dimension should be the same as the row_blk_size or col_blk_size depending on the blk_layout.
The 3-array BSR format that oneMKL supports has sorted block rows by definition. However, the block column indices within each block row may or may not be sorted (in ascending order) leading to a possibility of unsorted data. The BSR format (with no repeated column block indices that oneMKL currently requires) has a unique sorted form. However, in order to provide complete functional support for BSR matrices in either situation, oneMKL assumes BSR matrix handles to be unsorted by default. If the BSR format arrays are known to be in sorted form where all column block indices within each block row are sorted in ascending order, then users are strongly encouraged to provide that information to oneMKL through the sparse::set_matrix_property() API to possibly allow better algorithm choices and optimizations to deliver better performance. This is particularly important for APIs like triangular solve or sparse matrix addition.
A summary of sorted states for BSR format is given in the table below.
BSR Sortedness State |
Description |
Appropriate oneMKL sparse::property |
|---|---|---|
Unsorted |
Column block indices within each block row may not be ordered |
None (default) |
Sorted |
Sorted by block rows and sorted by block columns within each block row |
sparse::property::sorted |
Examples of BSR format
The following 3 examples show how the sparse BSR format can be used:
BSR Case 2: Sorted rectangular matrix with one-based indexing and an empty row
BSR Case 3: Unsorted rectangular matrix with zero-based indexing
BSR Case 1: Sorted square matrix with zero-based indexing
Assuming zero-based indexing and a real square matrix with square blocks.
Sorted BSR representation using 2x2 blocks in row-major layout and zero-based indexing:
blk_nrows |
4 |
|||||||||||
blk_ncols |
4 |
|||||||||||
blk_nnz |
7 |
|||||||||||
row_blk_size |
2 |
|||||||||||
col_blk_size |
2 |
|||||||||||
blk_layout |
RM |
|||||||||||
index |
0 |
|||||||||||
bsr_row_ptr |
0 |
2 |
4 |
6 |
7 |
|||||||
bsr_col_ind |
0 |
2 |
0 |
3 |
1 |
2 |
1 |
|||||
bsr_values |
1.2 |
-3.4 |
0.7 |
4.0 |
1.5 |
-3.8 |
2.6 |
-1.1 |
-0.9 |
2.2 |
3.7 |
-1.3 |
4.0 |
-2.7 |
1.8 |
-3.2 |
-1.4 |
2.9 |
3.1 |
-0.5 |
-3.6 |
0.8 |
2.3 |
-2.0 |
|
1.9 |
-2.4 |
-3.0 |
0.6 |
BSR Case 2: Sorted rectangular matrix with one-based indexing and an empty row
Assuming one-based indexing and real rectangular block matrix and an empty block row.
Notice that the blocks aren’t perfectly filled, so there are explicit zeros stored in the dense blocks in BSR format which might not be stored if it was stored in some other format like CSR or COO or CSC. We are free to choose the block sizes that make the most sense, in this case we could do 2x2 or 2x3 or 3x2 or 3x3, but it looks like 2x3 will have fewest explicit zeros, saving on memory footprint. It is important to use square blocks if we are performing a triangular-solve-like operation as that is all that will be supported, but for matrix multiplications, rectangular sizes are supported. We will choose 2x3 for our dense blocks for this example.
Sorted BSR representation with 2x3 blocks in row-major layout and one-based indexing:
blk_nrows |
3 |
|||||
blk_ncols |
2 |
|||||
blk_nnz |
2 |
|||||
row_blk_size |
2 |
|||||
col_blk_size |
3 |
|||||
blk_layout |
RM |
|||||
index |
1 |
|||||
bsr_row_ptr |
1 |
2 |
2 |
3 |
||
bsr_col_ind |
1 |
2 |
||||
bsr_values |
1.0 |
0.0 |
2.0 |
0.0 |
-1.0 |
4.0 |
0.0 |
2.0 |
0.0 |
-1.0 |
1.0 |
3.0 |
BSR Case 3: Unsorted rectangular matrix with zero-based indexing
Unsorted BSR example: Assuming zero-based indexing and a real rectangular matrix, we note that the BSR format does not require column block indices to be sorted within a given block row, but bsr_values and bsr_col_ind arrays must be consistent with each other. Having the sorted property is not necessary, but can lead to better performance in actual runs due to better algorithms and data locality being enabled. See sparse::set_matrix_property() and sparse::sort_matrix for more details on how one could set the sorted property when it is applicable or reorder the matrix to achieve it when desired.
We will use zero-based indexing and a 2x2 block size stored in column major layout for this unsorted BSR representation:
blk_nrows |
2 |
|||||||
blk_ncols |
3 |
|||||||
blk_nnz |
4 |
|||||||
row_blk_size |
2 |
|||||||
col_blk_size |
2 |
|||||||
blk_layout |
CM |
|||||||
index |
0 |
|||||||
bsr_row_ptr |
0 |
2 |
4 |
|||||
bsr_col_ind |
2 |
0 |
0 |
1 |
||||
bsr_values |
0.0 |
1.0 |
-1.0 |
0.5 |
1.0 |
0.0 |
0.0 |
-1.0 |
1.0 |
3.0 |
2.0 |
0.0 |
3.0 |
0.0 |
4.0 |
0.0 |