gemmr.generative_model._mk_Sigmaxy
gemmr.generative_model._mk_Sigmaxy(assemble_Sigmaxy, Sigmaxx, Sigmayy, U, V, m, max_n_sigma_trials, qx, qy, rng, true_corrs, expl_var_ratio_thr=0.5, verbose=True)
Generate the between-set covariance matrix \(\Sigma_{XY}\) (i.e. the upper right block of the joint covariance matrix).
Random directions are chosen for the X and Y latent mode vectors with the constraints that
- the within-modality variance along these directions is at least expl_var_ratio_thr times the average variance along any dimension in this modality
- the resulting joint cross-modality covariance matrix must be positive definite
To increase chances of large within-modality variance for randomly chosen latent mode vectors they are calculated as a random linear combination of a random vector from the first q_x (for modality X, q_y for modality Y) modes and a random vector from the remaining modes.
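The construction and the variance constraint described above can be sketched as follows. This is an illustrative sketch only, not gemmr's implementation: the helper names `random_latent_direction` and `expl_var_ratio`, the normal-distributed mixing weights, and the toy diagonal covariance are all assumptions made for the example.

```python
import numpy as np

def random_latent_direction(U, q, rng):
    """Hypothetical helper: unit vector mixing a random vector from the
    first q columns of U with a random vector from the remaining columns.
    Requires 1 <= q < U.shape[1]."""
    p = U.shape[1]
    # random vector in the span of the first q basis vectors
    head = U[:, :q] @ rng.normal(size=q)
    # random vector in the span of the remaining basis vectors
    tail = U[:, q:] @ rng.normal(size=p - q)
    head /= np.linalg.norm(head)
    tail /= np.linalg.norm(tail)
    # random linear combination of the two parts (assumed weighting scheme)
    a, b = rng.normal(size=2)
    w = a * head + b * tail
    return w / np.linalg.norm(w)

def expl_var_ratio(Sigma, w):
    """Hypothetical helper: variance along unit vector w divided by the
    average variance per dimension."""
    return (w @ Sigma @ w) / (np.trace(Sigma) / Sigma.shape[0])

rng = np.random.default_rng(0)
Sigmaxx = np.diag([5.0, 3.0, 1.0, 0.5, 0.5])  # toy within-set covariance
U = np.eye(5)                                 # its principal axes
w = random_latent_direction(U, q=2, rng=rng)
ratio = expl_var_ratio(Sigmaxx, w)
accepted = ratio >= 0.5  # compare against expl_var_ratio_thr
```

Mixing in a component from the leading modes raises the expected variance along `w`, which makes it more likely that a random draw passes the threshold.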
If this is not successful, i.e. if no between-set weight vectors could be found that explain enough variance and result in a positive definite \(\Sigma_{XY}\), an optimization procedure (using a differential evolution algorithm) is used to maximize the minimum eigenvalue of \(\Sigma_{XY}\). If that doesn’t succeed either, a ValueError is raised.
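One way to test the positive-definiteness constraint is to assemble the joint covariance matrix from its blocks and inspect its smallest eigenvalue. The sketch below is not taken from gemmr: the helper name `min_joint_eigenvalue` is hypothetical, and identity within-set covariances with orthonormal latent vectors are assumed so that the result is easy to verify by hand.

```python
import numpy as np

def min_joint_eigenvalue(Sigmaxx, Sigmaxy, Sigmayy):
    """Hypothetical helper: smallest eigenvalue of the joint covariance
    matrix [[Sigmaxx, Sigmaxy], [Sigmaxy.T, Sigmayy]]."""
    joint = np.block([[Sigmaxx, Sigmaxy],
                      [Sigmaxy.T, Sigmayy]])
    # eigvalsh is appropriate because the joint matrix is symmetric
    return np.linalg.eigvalsh(joint).min()

# Toy setup (assumed): whitened within-set covariances
Sigmaxx = np.eye(3)
Sigmayy = np.eye(2)
U_ = np.eye(3)[:, :2]   # latent mode vectors for X (orthonormal)
V_ = np.eye(2)          # latent mode vectors for Y (orthonormal)
true_corrs = np.array([0.5, 0.3])

# Between-set block built from the latent directions
Sigmaxy = U_ @ np.diag(true_corrs) @ V_.T

min_eig = min_joint_eigenvalue(Sigmaxx, Sigmaxy, Sigmayy)
is_positive_definite = min_eig > 0

# In this whitened toy case the singular values equal the correlations
svals = np.linalg.svd(Sigmaxy, compute_uv=False)
```

In this toy case the smallest eigenvalue is one minus the largest correlation, so the joint matrix stays positive definite as long as all correlations are below 1.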
Parameters: - Sigmaxx (np.ndarray (n_X_features, n_X_features)) – covariance matrix for modality X, i.e. the upper left block of the joint covariance matrix
- Sigmayy (np.ndarray (n_Y_features, n_Y_features)) – covariance matrix for modality Y, i.e. the lower right block of the joint covariance matrix
- U (np.ndarray (n_X_features, n_X_features)) – columns of U contain basis vectors for X data
- V (np.ndarray (n_Y_features, n_Y_features)) – columns of V contain basis vectors for Y data
- m (int >= 1) – number of cross-modality modes to be encoded
- max_n_sigma_trials (int) – number of times an attempt is made to find latent mode vectors satisfying constraints
- qx (int) – latent mode vectors for modality X are calculated as a random linear combination of
  - a random linear combination of the first q_x columns of U
  - a random linear combination of the remaining columns of U
- qy (int) – latent mode vectors for modality Y are calculated as a random linear combination of
  - a random linear combination of the first q_y columns of V
  - a random linear combination of the remaining columns of V
- rng (random number generator instance) –
- true_corrs (np.ndarray (m,)) – cross-modality correlations that each latent mode should have
- expl_var_ratio_thr (float) – threshold for required within-modality variance along latent mode vectors
- verbose (bool) – whether to print status messages
Returns: - Sigmaxy (np.ndarray (n_X_features, n_Y_features)) – cross-modality covariance matrix
- Sigmaxy_svals (np.ndarray (m,)) – singular values of Sigmaxy; these are the true canonical correlations or covariances (for CCA or PLS, respectively)
- true_corrs (np.ndarray (m,)) – the cross-modality covariances are calculated as the true correlations (given by input argument true_corrs) times the variances along these directions. Should the resulting cross-modality covariances not be in descending order, they will be reordered, as will input argument true_corrs to reflect the change in order
- latent_expl_var_ratios_x (np.ndarray (m,)) – explained variance ratio in X modality along the latent directions
- latent_expl_var_ratios_y (np.ndarray (m,)) – explained variance ratio in Y modality along the latent directions
- U_ (np.ndarray (n_X_features, m)) – latent mode vectors for X
- V_ (np.ndarray (n_Y_features, m)) – latent mode vectors for Y
- cosine_sim_pc1_latentMode_x ((m,)) – cosine similarities between latent mode vectors and PC1 for X
- cosine_sim_pc1_latentMode_y ((m,)) – cosine similarities between latent mode vectors and PC1 for Y
- latent_mode_vector_algo (str) – ‘qr__’ or ‘opti’, algorithm with which the latent mode vectors were found
Raises: ValueError – if no between-set weight vectors could be found that explain enough variance and result in a positive definite \(\Sigma_{XY}\)
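The reordering described for the true_corrs return value can be illustrated with a small example. This is a sketch under stated assumptions, not gemmr's code: the variable names `var_x` and `var_y` are hypothetical, and each covariance is assumed to be the correlation scaled by the square root of the product of the X and Y variances along the latent directions.

```python
import numpy as np

# Requested correlations and (assumed) variances along each latent direction
true_corrs = np.array([0.3, 0.8, 0.5])
var_x = np.array([4.0, 1.0, 1.0])
var_y = np.array([4.0, 1.0, 4.0])

# Assumed scaling: covariance = correlation * std_x * std_y
covs = true_corrs * np.sqrt(var_x * var_y)

# Sort covariances in descending order and apply the same
# permutation to the correlations so the two stay aligned
order = np.argsort(covs)[::-1]
covs_sorted = covs[order]
true_corrs_sorted = true_corrs[order]
```

After sorting, the correlations themselves need not be in descending order, which is why the function returns the permuted true_corrs alongside the covariances.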