P be the number of rows and Q the number of columns in your processor grid (P x Q). The work must be homogeneous within each processor column because vertical operations, such as pivoting or panel factorization, are synchronizing operations.
When there are two different types of nodes, use MPI to process all the faster nodes first, and make sure that "PMAP process mapping" (line 9) of HPL.dat is set to 1 for column-major mapping. Because all the nodes must be the same within a process column, the number of faster nodes must always be a multiple of P (the number of rows in the processor grid). You specify the faster nodes by setting the number of process columns, C, for the faster nodes with the -c command-line parameter. The -f 1.0 -c 0 setting corresponds to the default homogeneous behavior.
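
The relationship between the faster-node count and the number of faster process columns can be sketched in a few lines of Python. This is only an illustration, not part of the benchmark: the function names are hypothetical, p and q stand for the grid rows and columns, c stands for the number of faster process columns, and the sketch assumes one MPI process per node and that column-major mapping assigns ranks down each process column before moving to the next column.

```python
def grid_position(rank, p):
    """Return (row, column) of an MPI rank under column-major mapping.

    Assumption: ranks fill one process column of p rows from top to
    bottom before moving on to the next column.
    """
    return rank % p, rank // p


def faster_columns(num_fast_nodes, p, q):
    """Return the number of process columns occupied by faster nodes.

    Assumption: one MPI process per node, with the faster nodes
    holding the lowest ranks.
    """
    if num_fast_nodes % p != 0:
        raise ValueError(
            "number of faster nodes must be a multiple of the grid rows"
        )
    c = num_fast_nodes // p
    if c > q:
        raise ValueError("more faster columns than columns in the grid")
    return c


if __name__ == "__main__":
    p, q = 2, 4            # hypothetical 2 x 4 process grid
    c = faster_columns(4, p, q)
    print("faster process columns:", c)
    for rank in range(p * q):
        row, col = grid_position(rank, p)
        kind = "fast" if col < c else "slow"
        print(f"rank {rank}: row {row}, column {col}, {kind}")
```

In this hypothetical 2 x 4 grid with four faster nodes, the value returned by faster_columns is what would be passed to the -c parameter. Each process column then contains only one type of node, which is the condition the text above requires.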