ID 772991
Date 3/31/2023
Public

## Estimating Pooled/Group Variance-Covariance Matrices/Means

Use the VSL_SS_METHOD_1PASS method to compute pooled/group variance-covariance matrices, or pooled/group means.

For the definition of pooled/group variance-covariance matrices, see the Mathematical Notation and Definitions chapter in the Summary Statistics section of [MKLMan].
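As a quick orientation, the usual textbook form of these estimates for unweighted observations is sketched below. This is an assumption about notation, not a quotation from [MKLMan], which remains the authoritative definition (and also covers weighted observations). Here $G_i$ is the set of observations in group $i$, $n_i$ is its size, and $m^{(i)}$ is its mean vector.

```latex
m^{(i)} = \frac{1}{n_i} \sum_{j \in G_i} x_j,
\qquad
C^{(i)} = \frac{1}{n_i - 1} \sum_{j \in G_i}
          \left(x_j - m^{(i)}\right)\left(x_j - m^{(i)}\right)^{T},
\qquad
C^{\mathrm{pooled}} = \frac{1}{n - g} \sum_{i=0}^{g-1} (n_i - 1)\, C^{(i)}
```

Under this definition the pooled matrix is a weighted average of the group matrices, with each group weighted by its degrees of freedom $n_i - 1$.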

To compute a pooled variance-covariance matrix and/or a pooled mean, you need to split the observations into g groups by allocating and filling the array grp_indices of size n, where n is the number of observations. Group indices take values from the range [0, 1, ..., g-1]. Thus, grp_indices[j] = k if observation j belongs to the group indexed k.
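The assignment of observations to groups can be sketched in plain C. The helper below is hypothetical (it is not a library function) and assumes the groups occupy consecutive blocks of observations; `idx_t` is a stand-in for the library's integer type.

```c
#include <stddef.h>

typedef int idx_t;  /* stand-in for the library's integer type (assumption) */

/* Hypothetical helper: labels n observations with group indices when the
   groups occupy consecutive blocks. sizes[k] is the number of observations
   in group k. Returns 0 on success, -1 if the sizes do not sum to n. */
static int fill_group_indices(idx_t *grp_indices, size_t n,
                              const size_t *sizes, size_t g)
{
    size_t j = 0;
    for (size_t k = 0; k < g; k++) {
        for (size_t c = 0; c < sizes[k]; c++) {
            if (j >= n) return -1;       /* more labels than observations */
            grp_indices[j++] = (idx_t)k; /* observation j belongs to group k */
        }
    }
    return (j == n) ? 0 : -1;            /* every observation must be labeled */
}
```

For example, with n = 5 and group sizes {2, 3}, the array becomes {0, 0, 1, 1, 1}.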

The pooled variance-covariance matrix is packed as a one-dimensional array. For information on available storage formats and memory requirements, see the table "Storage formats of a variance-covariance/correlation matrix" in the Summary Statistics section of [MKLMan]. The pooled mean estimate is returned in an array that must hold at least p elements, where p is the dimension of the task.

You can get estimates for group variance-covariance matrices and/or group means by passing into the library the array grp_cov_indices of size g. This array determines the group variance-covariance matrices and/or means to be returned:

1. If the variance-covariance matrix and/or the vector of means of the group indexed idx are to be returned, grp_cov_indices[idx] = 1.

2. Otherwise, grp_cov_indices[idx] = 0.
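The flag array can be inspected with two small helpers, sketched here in plain C. Both functions are hypothetical. The slot computation assumes that results for the requested groups are packed in ascending order of group index, which is one reading of "according to the contents of grp_cov_indices".

```c
#include <stddef.h>

typedef int idx_t;  /* stand-in for the library's integer type (assumption) */

/* Number of groups flagged with 1, i.e. the value k used for sizing. */
static size_t count_requested_groups(const idx_t *grp_cov_indices, size_t g)
{
    size_t k = 0;
    for (size_t i = 0; i < g; i++)
        if (grp_cov_indices[i] == 1) k++;
    return k;
}

/* Position (0..k-1) of group idx among the requested groups,
   or -1 if idx is out of range or the group was not requested.
   Assumes requested groups are stored in ascending index order. */
static long requested_slot(const idx_t *grp_cov_indices, size_t g, size_t idx)
{
    if (idx >= g || grp_cov_indices[idx] != 1) return -1;
    long slot = 0;
    for (size_t i = 0; i < idx; i++)
        if (grp_cov_indices[i] == 1) slot++;
    return slot;
}
```

With flags {1, 0, 1, 1}, three groups are requested, and group 2 occupies slot 1.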

The estimates for group variance-covariance matrices and group means are stored in one-dimensional arrays grp_cov and grp_means, respectively.

The group means are packed in the grp_means array in series. The size of the array should be sufficient for at least p*k elements,

where

1. p is the dimension of the task.

2. k is the number of group matrices to be returned.
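The layout of grp_means can be illustrated with a short sketch. The helper names are hypothetical, and the code assumes that "in series" means the k mean vectors of length p are stored back to back.

```c
#include <stddef.h>

/* Minimum number of elements the grp_means array must hold. */
static size_t grp_means_size(size_t p, size_t k)
{
    return p * k;
}

/* Pointer to the mean vector stored at position slot (0-based),
   assuming the vectors are laid out contiguously one after another. */
static const double *group_mean_at(const double *grp_means,
                                   size_t p, size_t slot)
{
    return grp_means + slot * p;
}
```

For p = 3 and k = 2, the array needs at least 6 elements, and the second returned mean vector starts at element 3.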

Group matrices are packed in the grp_cov array in series according to the contents of the array grp_cov_indices. The size of the grp_cov array should be sufficient for at least cov_dim*k elements,

where

1. cov_dim is the size of a single group matrix defined by the chosen storage format.

2. k is the number of group matrices to be returned.
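The sizing rule can be sketched as follows. The enum is a hypothetical stand-in for the library's storage-format constants; the sketch assumes that full storage takes p*p elements and that packed (upper or lower triangular) storage takes p*(p+1)/2 elements.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the storage formats (not library constants). */
enum cov_storage { COV_FULL, COV_PACKED };

/* Number of elements in one matrix for the chosen storage format. */
static size_t cov_dim(size_t p, enum cov_storage s)
{
    return (s == COV_FULL) ? p * p : p * (p + 1) / 2;
}

/* Minimum number of elements the grp_cov array must hold for k matrices. */
static size_t grp_cov_size(size_t p, size_t k, enum cov_storage s)
{
    return cov_dim(p, s) * k;
}
```

For p = 3 and k = 2 in full storage, this gives 18 elements; in packed storage it gives 12.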

The library checks that the grp_indices pointer is initialized correctly and that the values stored in the array are non-negative. If either check fails, computation of the pooled/group variance-covariance matrix terminates with an error code. In this case, make sure that the grp_indices array contains all values from 0 to g-1 inclusive and that the memory allocated for the grp_cov_indices array is sufficient to hold at least g values.
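A similar check can be run on the caller's side before the arrays are passed to the library. The function below is a hypothetical pre-check, not the library's own validation, and its return codes are made up for this sketch.

```c
#include <stddef.h>
#include <stdlib.h>

typedef int idx_t;  /* stand-in for the library's integer type (assumption) */

/* Hypothetical pre-check of the group index array.
   Returns  0 if every value is in [0, g-1] and every group occurs,
           -1 if some value is out of range,
           -2 if some group has no observations,
           -3 on bad arguments or allocation failure. */
static int check_group_indices(const idx_t *grp_indices, size_t n, size_t g)
{
    if (grp_indices == NULL || g == 0) return -3;

    unsigned char *seen = calloc(g, 1);  /* seen[k] = 1 once group k occurs */
    if (seen == NULL) return -3;

    int rc = 0;
    for (size_t j = 0; j < n; j++) {
        if (grp_indices[j] < 0 || (size_t)grp_indices[j] >= g) {
            rc = -1;
            break;
        }
        seen[grp_indices[j]] = 1;
    }
    for (size_t k = 0; rc == 0 && k < g; k++)
        if (!seen[k]) rc = -2;

    free(seen);
    return rc;
}
```

Running the check once after filling grp_indices makes it easier to tell an out-of-range label apart from an empty group.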

The following example shows how to compute pooled and group variance-covariance matrices:

    #include "mkl_vsl.h"

    #define DIM 3      /* dimension of the task */
    #define N   1000   /* number of observations */
    #define G   2      /* number of groups */
    #define GN  2      /* number of group variance-covariance matrices */

    int main()
    {
        VSLSSTaskPtr task;               /* Summary Statistics task descriptor */
        int i;
        MKL_INT g_indices[N];            /* indices of the groups */
        double x[N][DIM];                /* matrix of observations */
        MKL_INT g_cov_indices[G]={1,1};  /* two group matrices to be returned */

        double pcov[DIM*DIM];            /* pooled variance-covariance matrix */
        double pmean[DIM];               /* array of pooled means */

        double gcov[DIM*DIM*GN];         /* array for group variance-covariance matrices */
        double gmean[DIM*GN];            /* array for group means */
        int status;

        MKL_INT p, n, xstorage, pcovstorage, gcovstorage;
        unsigned long long estimates;

        /* Parameters of the task and initialization */
        p = DIM;
        n = N;
        xstorage = VSL_SS_MATRIX_STORAGE_COLS;
        pcovstorage = VSL_SS_MATRIX_STORAGE_FULL;
        gcovstorage = VSL_SS_MATRIX_STORAGE_FULL;

        /* Fill x with the observations here; the listing does not show this step */

        /* The first N/2 elements belong to the first group, the rest belong to the second group */
        for ( i = 0; i < N/2; i++ )
        {
            g_indices[i] = 0;
            g_indices[i+N/2] = 1;
        }

        /* Create a task (step absent from the extracted listing; the call
           follows the documented vsldSSNewTask signature) */
        status = vsldSSNewTask( &task, &p, &n, &xstorage, (double*)x, 0, 0 );

        /* Initialize the task parameters */
        status = vsliSSEditTask( task, VSL_SS_ED_POOLED_COV_STORAGE,
                                 &pcovstorage );
        status = vsliSSEditTask( task, VSL_SS_ED_GROUP_COV_STORAGE,
                                 &gcovstorage );
        status = vsldSSEditPooledCovariance( task, g_indices, pmean,
                                             pcov, g_cov_indices, gmean, gcov );

        /* Compute the pooled and group variance-covariance matrices */
        estimates = VSL_SS_POOLED_COV|VSL_SS_GROUP_COV;
        status = vsldSSCompute( task, estimates, VSL_SS_METHOD_1PASS );

        /* Deallocate the task resources */
        status = vslSSDeleteTask( &task );

        return 0;
    }