gemmr.cca_sample_size

gemmr.cca_sample_size(X, Y, rs=(0.1, 0.3, 0.5), criterion='combined', algorithm='linear_model', target_power=0.9, target_error=0.1, data_home=None)

Suggest sample size for CCA.

Suggested sample sizes are estimated with a linear model that interpolates and extrapolates from parameter settings for which sample sizes were calculated beforehand using the generative model.

Parameters:
  • X (np.ndarray (n_samples, n_X_features) or int >= 2) – either a data matrix or directly the number of features for data matrix \(X\)
  • Y (np.ndarray (n_samples, n_Y_features) or int >= 2) – either a data matrix or directly the number of features for data matrix \(Y\)
  • rs (list-like) – true correlations for which sample sizes are estimated
  • criterion (str) –

    criterion according to which sample sizes are estimated. Can be:

    • 'combined'
    • 'power'
    • 'association_strength'
    • 'weight'
    • 'score'
    • 'loading'
    • 'crossloading'
  • algorithm (str) –

    algorithm used to calculate sample sizes. Can be:

    • 'linear_model'
  • target_power (float between 0 and 1) – if criterion is 'combined' or 'power', the sample size is chosen to obtain at least target_power power
  • target_error (float between 0 and 1) – if criterion is not 'power', the sample size is chosen to obtain at most target_error error in the error metric(s)
  • data_home (None or str) – path where outcome data are stored, None indicates default path
Returns:

suggested_sample_sizes – suggested sample sizes for correlations rs

Return type:

dict
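
Example:

The following is a minimal sketch of a call that passes feature counts instead of data matrices, using only the signature documented above. It assumes the gemmr package is installed and that the outcome data are available under the default data_home. The wrapper name suggest_sizes and its argument names are hypothetical and not part of gemmr.

```python
def suggest_sizes(n_x_features, n_y_features, rs=(0.1, 0.3, 0.5)):
    """Hypothetical wrapper: call gemmr.cca_sample_size with feature counts."""
    # The parameter docs require integer feature counts to be >= 2,
    # so reject smaller values before calling into the library.
    if n_x_features < 2 or n_y_features < 2:
        raise ValueError("feature counts must be >= 2")

    # Imported lazily; assumes gemmr is installed in the environment.
    import gemmr

    # Keyword values mirror the documented defaults.
    return gemmr.cca_sample_size(
        n_x_features,
        n_y_features,
        rs=rs,
        criterion='combined',
        algorithm='linear_model',
        target_power=0.9,
        target_error=0.1,
        data_home=None,
    )
```

Calling suggest_sizes(100, 10) would then return the dict described under Returns, with suggested sample sizes for the correlations in rs. Passing data matrices for X and Y instead of integers should work the same way, since per the parameter descriptions they serve to supply the number of features.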